Bug#1098576: mpi4py: FTBFS: testIMProbe (test_util_pkl5.TestMPISelf.testIMProbe) ... [c7a-large-1740141036:00000] *** An error occurred in PMIx Event Notification

Drew Parsons dparsons at debian.org
Sat Mar 22 19:50:15 GMT 2025


Source: pmix
Version: 5.0.7-1
Followup-For: Bug #1098576
Control: tags -1 ftbfs

There is some caprice in this bug: it does not reproduce in every build environment.

I tested mpi4py again with the new versions pmix 5.0.7 and linux 6.12.19.

Locally (dpkg-buildpackage) the tests passed without error,
so I figured the problem might have been fixed in pmix 5.0.7 (or
perhaps in linux 6.12.19).

But when I prepared a new upload of mpi4py re-enabling the affected tests,
the error was triggered again (in a pbuilder chroot, using pdebuild):

testGetStatusAll (test_util_pkl5.TestMPISelf.testGetStatusAll) ... ok
testIBSendAndRecv (test_util_pkl5.TestMPISelf.testIBSendAndRecv) ... ok
testIMProbe (test_util_pkl5.TestMPISelf.testIMProbe) ... [sandy:00000] *** An error occurred in PMIx Event Notification
[sandy:00000] *** reported by process [2622291968,0]
[sandy:00000] *** on a NULL communicator
[sandy:00000] *** Unknown error (this should not happen!)
[sandy:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[sandy:00000] ***    and MPI will try to terminate your MPI job as well)
make[1]: *** [debian/rules:122: override_dh_auto_test] Error 1

and /var/log/syslog still shows a kernel error at the same time: 

2025-03-22T20:30:34.254999+01:00 sandy systemd[1]: Started run-p30170-i30470.scope - pbuilder_build_mpi4py_4.0.3-4.dsc.
2025-03-22T20:34:14.826512+01:00 sandy kernel: traps: prte[30569] general protection fault ip:7f7009b71119 sp:7f70055fa5f8 error:0 in libc.so.6[167119,7f7009a32000+165000]
2025-03-22T20:34:26.865613+01:00 sandy systemd[1]: run-p30170-i30470.scope: Deactivated successfully.
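
For anyone wanting to locate the faulting instruction, the fields of the kernel "traps:" line can be pulled apart with a short script. This is only a rough sketch: the regular expression, the function name parse_trap and the reading of the bracketed fields as [offset,base+size] are my assumptions about the log format, not something confirmed against the kernel source. It only computes where the instruction pointer sits relative to the start of the libc mapping.

```python
import re

# Kernel "traps:" message copied from the syslog excerpt above.
LINE = ("traps: prte[30569] general protection fault ip:7f7009b71119 "
        "sp:7f70055fa5f8 error:0 in libc.so.6[167119,7f7009a32000+165000]")

# Assumed layout: comm[pid] reason ip:.. sp:.. error:.. in lib[off,base+size]
PATTERN = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] (?P<reason>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+) "
    r"in (?P<lib>[^\[]+)\[(?P<off>[0-9a-f]+),"
    r"(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_trap(line):
    """Split a traps: line into fields (hypothetical helper, not a kernel API)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognised traps: line")
    info = match.groupdict()
    # All address-like fields are printed in hexadecimal without a 0x prefix.
    for key in ("ip", "sp", "error", "off", "base", "size"):
        info[key] = int(info[key], 16)
    info["pid"] = int(info["pid"])
    # Position of the faulting instruction relative to the start of the mapping.
    info["ip_minus_base"] = info["ip"] - info["base"]
    return info


info = parse_trap(LINE)
print(info["comm"], info["lib"], hex(info["ip_minus_base"]))
# prints: prte libc.so.6 0x13f119
```

The printed value lies inside the mapping (it is below the size 0x165000). It differs from the first bracketed number (0x167119) by 0x28000, which would be consistent with that number being an offset into the libc file rather than into the mapping, but I have not verified this. With the matching libc6-dbg symbols installed, one of those offsets could be handed to addr2line or gdb to find the function where prte crashed.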



-- System Information:
Debian Release: trixie/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.12.19-amd64 (SMP w/8 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8), LANGUAGE=en_AU:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled


