Bug#1006755: libarmci-mpi-dev: mpich test failures on s390x, mipsel

Drew Parsons dparsons at debian.org
Fri Mar 4 11:09:24 GMT 2022


Package: libarmci-mpi-dev
Version: 0.3.1~beta-5
Severity: important
Control: forwarded -1 https://github.com/pmodels/armci-mpi/issues/35

A handful of armci-mpi tests are failing with mpich on s390x, mipsel
and mips64el.  The same tests pass with openmpi. The issue has been
reported upstream at https://github.com/pmodels/armci-mpi/issues/35

The failing tests are

FAIL: tests/mpi/test_mpi_dim
FAIL: tests/mpi/test_mpi_indexed_accs
FAIL: tests/mpi/test_mpi_indexed_gets
FAIL: tests/mpi/test_mpi_indexed_puts_gets
FAIL: tests/mpi/test_mpi_subarray_accs

They have much the same error message, e.g.

FAIL: tests/mpi/test_mpi_dim
============================

MPI test program (2 processes)

Testing strided gets and puts
(Only std output for process 0 is printed)

--------array[5]--------
local[1:3] -> remote[0:2] -> local[1:3] 
Assertion failed in file src/mpi/datatype/typerep/dataloop/looputil.c at line 815: *lengthp > 0
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x2b3d76) [0x3ff7e2b3d76]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x1fc89e) [0x3ff7e1fc89e]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x1c6774) [0x3ff7e1c6774]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x1cce1c) [0x3ff7e1cce1c]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x256b2e) [0x3ff7e256b2e]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x2598e6) [0x3ff7e2598e6]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x25be40) [0x3ff7e25be40]
/usr/lib/s390x-linux-gnu/libmpich.so.12(PMPI_Accumulate+0xa94) [0x3ff7e0f9044]
./tests/mpi/test_mpi_dim(+0x2980) [0x2aa1bf02980]
./tests/mpi/test_mpi_dim(main+0x6a) [0x2aa1bf0123a]
/lib/s390x-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x3ff7de24c5e]
./tests/mpi/test_mpi_dim(+0x1314) [0x2aa1bf01314]
internal ABORT - process 0
FAIL tests/mpi/test_mpi_dim (exit status: 1)



In discussion, upstream recommended running the mpich test/mpi
test suite when building or testing mpich. That might help catch
issues like this on less common architectures.

As a workaround I'll make debian/tests "information only"
(dropping set -e) on s390x with mpich, so that these failures do not
hold up the other architectures.
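
For illustration, the workaround could look roughly like the sketch below.
This is not the actual debian/tests script, whose layout is not shown in
this report. The helper failures_are_fatal and the variables ARCH and
MPI_IMPL are hypothetical names chosen for the example; only
"dpkg --print-architecture" and "set -e" are standard.

```shell
#!/bin/sh
# Hypothetical sketch of a debian/tests wrapper: keep "set -e" everywhere
# except on s390x with mpich, where failures are reported but not fatal.

# Return 0 if test failures should abort the run, 1 if they are
# information only. Arguments: architecture, MPI implementation.
failures_are_fatal() {
    arch="$1"
    mpi="$2"
    if [ "$arch" = "s390x" ] && [ "$mpi" = "mpich" ]; then
        return 1
    fi
    return 0
}

# ARCH and MPI_IMPL are assumed to be set by the caller; otherwise fall
# back to the dpkg architecture and to mpich.
ARCH="${ARCH:-$(dpkg --print-architecture 2>/dev/null || echo unknown)}"
MPI_IMPL="${MPI_IMPL:-mpich}"

if failures_are_fatal "$ARCH" "$MPI_IMPL"; then
    set -e
else
    echo "Test failures are information only on $ARCH with $MPI_IMPL"
fi

# ... the test commands themselves would follow here ...
```

Keeping the decision in one small function makes it easy to extend the
exception to mipsel or mips64el later, should that be wanted.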



-- System Information:
Debian Release: bookworm/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 5.16.0-3-amd64 (SMP w/8 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8), LANGUAGE=en_AU:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

-- no debconf information


