Bug#1068100: armci-mpi: autopkgtest spuriously fails

Samuel Thibault sthibault at debian.org
Sat Mar 30 15:55:11 GMT 2024


Source: armci-mpi
Version: 0.3.1~beta-7
Severity: serious
Tags: patch upstream fixed-upstream
Forwarded: https://github.com/pmodels/armci-mpi/pull/47
Justification: Prevents mpich migration

Hello,

The test_mpi_indexed_gets test is currently failing spuriously
in debian unstable due to an uninitialized value, e.g.
https://ci.debian.net/packages/a/armci-mpi/testing/amd64/44486187/ ,
preventing the newer mpich version from migrating to unstable.

I have forwarded a fix to upstream
https://github.com/pmodels/armci-mpi/pull/47
which is already merged.

Unless somebody objects, I'll NMU it.

Samuel

-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable-debug'), (500, 'unreleased'), (500, 'testing-debug'), (500, 'stable-security'), (500, 'stable-debug'), (500, 'oldstable-proposed-updates-debug'), (500, 'oldstable-proposed-updates'), (500, 'oldoldstable'), (500, 'buildd-unstable'), (500, 'unstable'), (500, 'stable'), (500, 'oldstable'), (1, 'experimental-debug'), (1, 'buildd-experimental'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386, arm64

Kernel: Linux 6.8.0 (SMP w/8 CPU threads; PREEMPT)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
-------------- next part --------------
commit 80cc91748392aec9eced295eb531548a565dac0e
Author: Samuel Thibault <samuel.thibault at ens-lyon.org>
Date:   Sat Mar 30 16:35:11 2024 +0100

    tests/mpi/test_mpi_indexed_gets.c: Fix reception buffer initialization
    
    loc_buf was completely uninitialized, while the second and third loop nests
    are checking that the values are 1.0 + rank. With luck it would be zero, and
    thus the actual - expected > 1e-10 check would pass (since the difference is
    then negative). With less luck the check would spuriously fail for the part
    that is not overwritten by the MPI_Get.
    
    Signed-off-by: Samuel Thibault <samuel.thibault at ens-lyon.org>

commit 9c0f6b08ba706a7c2f3e74d325cfd2a010300108
Author: Samuel Thibault <samuel.thibault at ens-lyon.org>
Date:   Sat Mar 30 16:38:58 2024 +0100

    tests/mpi: Fix comparison of floating-double types
    
    In case the actual value is zero and the expected value is positive, the
    check would spuriously succeed.
    
    Signed-off-by: Samuel Thibault <samuel.thibault at ens-lyon.org>

commit cd001a46801fed9f406ea57238a131b0a0e063fe
Author: Samuel Thibault <samuel.thibault at ens-lyon.org>
Date:   Sat Mar 30 16:41:58 2024 +0100

    tests/mpi/test_mpi_indexed_gets.c: Strengthen test
    
    Now that it is fixed, we can make it actually perform strided accesses.
    
    Signed-off-by: Samuel Thibault <samuel.thibault at ens-lyon.org>

diff --git a/tests/mpi/test_mpi_indexed_gets.c b/tests/mpi/test_mpi_indexed_gets.c
index dc1bd9d..3570492 100644
--- a/tests/mpi/test_mpi_indexed_gets.c
+++ b/tests/mpi/test_mpi_indexed_gets.c
@@ -44,8 +44,10 @@ int main(int argc, char **argv) {
     if (rank == 0)
         printf("MPI RMA Strided Get Test:\n");
 
-    for (i = 0; i < XDIM*YDIM; i++)
+    for (i = 0; i < XDIM*YDIM; i++) {
         *(win_buf + i) = 1.0 + rank;
+        *(loc_buf + i) = 1.0 + rank;
+    }
 
     MPI_Win_create(win_buf, bufsize, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &buf_win);
 
diff --git a/tests/mpi/test_mpi_accs.c b/tests/mpi/test_mpi_accs.c
index ee9b0a9..730a632 100644
--- a/tests/mpi/test_mpi_accs.c
+++ b/tests/mpi/test_mpi_accs.c
@@ -55,7 +55,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < YDIM; j++) {
         const double actual   = *(buffer + i + j*XDIM);
         const double expected = (1.0 + rank) + (1.0 + ((rank+nranks-1)%nranks)) * (ITERATIONS);
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
diff --git a/tests/mpi/test_mpi_indexed_accs.c b/tests/mpi/test_mpi_indexed_accs.c
index 78c06bd..0fc3baa 100644
--- a/tests/mpi/test_mpi_indexed_accs.c
+++ b/tests/mpi/test_mpi_indexed_accs.c
@@ -97,7 +97,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = (1.0 + rank) + (1.0 + ((rank+nranks-1)%nranks)) * (ITERATIONS);
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -109,7 +109,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -121,7 +121,7 @@ int main(int argc, char **argv) {
       for (j = SUB_YDIM; j < YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
diff --git a/tests/mpi/test_mpi_indexed_gets.c b/tests/mpi/test_mpi_indexed_gets.c
index 3570492..99ac294 100644
--- a/tests/mpi/test_mpi_indexed_gets.c
+++ b/tests/mpi/test_mpi_indexed_gets.c
@@ -90,7 +90,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(loc_buf + i + j*XDIM);
         const double expected = (1.0 + peer);
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -102,7 +102,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(loc_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -114,7 +114,7 @@ int main(int argc, char **argv) {
       for (j = SUB_YDIM; j < YDIM; j++) {
         const double actual   = *(loc_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
diff --git a/tests/mpi/test_mpi_indexed_puts_gets.c b/tests/mpi/test_mpi_indexed_puts_gets.c
index d14988c..ea791dd 100644
--- a/tests/mpi/test_mpi_indexed_puts_gets.c
+++ b/tests/mpi/test_mpi_indexed_puts_gets.c
@@ -92,7 +92,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = (1.0 + ((rank+nranks-1)%nranks));
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -104,7 +104,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -116,7 +116,7 @@ int main(int argc, char **argv) {
       for (j = SUB_YDIM; j < YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
diff --git a/tests/mpi/test_mpi_subarray_accs.c b/tests/mpi/test_mpi_subarray_accs.c
index 0eb5397..ddc4b4f 100644
--- a/tests/mpi/test_mpi_subarray_accs.c
+++ b/tests/mpi/test_mpi_subarray_accs.c
@@ -90,7 +90,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = (1.0 + rank) + (1.0 + ((rank+nranks-1)%nranks)) * (ITERATIONS);
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -102,7 +102,7 @@ int main(int argc, char **argv) {
       for (j = 0; j < SUB_YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
@@ -114,7 +114,7 @@ int main(int argc, char **argv) {
       for (j = SUB_YDIM; j < YDIM; j++) {
         const double actual   = *(win_buf + i + j*XDIM);
         const double expected = 1.0 + rank;
-        if (actual - expected > 1e-10) {
+        if (fabs(actual - expected) > 1e-10) {
           printf("%d: Data validation failed at [%d, %d] expected=%f actual=%f\n",
               rank, j, i, expected, actual);
           errors++;
diff --git a/tests/mpi/test_mpi_indexed_gets.c b/tests/mpi/test_mpi_indexed_gets.c
index 99ac294..19c6158 100644
--- a/tests/mpi/test_mpi_indexed_gets.c
+++ b/tests/mpi/test_mpi_indexed_gets.c
@@ -19,7 +19,7 @@
 
 #define XDIM 8
 #define YDIM 1024
-#define SUB_XDIM 8
+#define SUB_XDIM 4
 #define SUB_YDIM 256
 
 int main(int argc, char **argv) {


More information about the debian-science-maintainers mailing list