Bug#1100120: libopenmpi-dev: mpi4py spawn tests get OPAL ERROR: Unreachable in file ../../../ompi/runtime/ompi_mpi_finalize.c at line 286

Drew Parsons dparsons at debian.org
Tue Mar 11 13:52:42 GMT 2025


Package: libopenmpi-dev
Version: 5.0.7-1
Severity: serious
Justification: FTBFS (dependencies)

mpi4py's build-time tests are failing, and the failures appear to come
from openmpi. This is with mpi4py 4.0.3-1. The debci tests from its
last build are still passing for now.

I'm assuming the bug is in openmpi, not mpi4py itself, since mpi4py
passes its tests with mpich (on the 32-bit arches).

The first problem comes from PMIx:
  An error occurred in PMIx Event Notification
The error is reproducible,
cf. https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/mpi4py.html
    https://tests.reproducible-builds.org/debian/rbuild/unstable/amd64/mpi4py_4.0.3-1.rbuild.log.gz
It is triggered by test_util_pkl5, and also test_util_pool,
test_util_sync and test_win.
The error is associated with a kernel general protection fault from prte.
That bug is reported as Bug#1098576, currently assigned to pmix, though
I suspect it might be an openmpi issue.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1098576


Here I'm reporting a second problem: MPI_Comm_spawn is failing in the
spawn tests, for instance:

ERROR: testNoArgs (test_spawn.TestSpawnSingleWorldMany.testNoArgs)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/drew/projects/python/build/mpi4py/test/test_spawn.py", line 175, in testNoArgs
    child = self.COMM.Spawn(
        script, None, self.MAXPROCS,
        info=self.INFO, root=self.ROOT,
    )
  File "src/mpi4py/MPI.src/Comm.pyx", line 2544, in mpi4py.MPI.Intracomm.Spawn
    with nogil: CHKERR( MPI_Comm_spawn(
mpi4py.MPI.Exception: MPI_ERR_UNKNOWN: unknown error

----------------------------------------------------------------------
Ran 1857 tests in 84.632s

FAILED (errors=40, skipped=162)
[sandy:272668] OPAL ERROR: Unreachable in file ../../../ompi/runtime/ompi_mpi_finalize.c at line 286


I've marked this bug severity serious because of the message at the
end concerning the OPAL error in ompi_mpi_finalize.c (as well as the
MPI_ERR_UNKNOWN errors in the spawn tests).  If the OPAL message turns
out to be a red herring, please downgrade the severity.
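
For triage, it can help to tally which test modules account for the 40
errors. The following is only a minimal sketch, not part of mpi4py or
its test suite: it assumes the standard unittest result headers of the
form seen in the log above ("ERROR: name (dotted.id)"), and the helper
names failing_tests and count_by_module are made up for illustration.

```python
import re
from collections import Counter

# Matches unittest result headers such as
# "ERROR: testNoArgs (test_spawn.TestSpawnSingleWorldMany.testNoArgs)"
HEADER = re.compile(r"^(ERROR|FAIL): (\w+) \(([\w.]+)\)")


def failing_tests(log_text):
    """Return (status, dotted test id) pairs found in a unittest log."""
    found = []
    for line in log_text.splitlines():
        match = HEADER.match(line)
        if match:
            found.append((match.group(1), match.group(3)))
    return found


def count_by_module(pairs):
    """Count failures per test module (first component of the dotted id)."""
    return Counter(test_id.split(".")[0] for _, test_id in pairs)


# Excerpt copied from the log in this report.
sample = """\
ERROR: testNoArgs (test_spawn.TestSpawnSingleWorldMany.testNoArgs)
----------------------------------------------------------------------
mpi4py.MPI.Exception: MPI_ERR_UNKNOWN: unknown error
"""

pairs = failing_tests(sample)
print(pairs)
print(count_by_module(pairs))
```

Run against a full build log, the per-module counts should show whether
the errors are confined to test_spawn or spread across other modules.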



We could just skip the failing tests in mpi4py (in fact I will for now),
but the underlying problem should be fixed in any case.

For mpi4py, I will first upload 4.0.3-2, which skips the tests hitting
the pmix failures, in order to get a reproducible record of the spawn
failure. After that I will upload a further release of mpi4py that also
skips the spawn tests, until the issue is fixed in openmpi (or pmix).
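
As an illustration of what skipping a test class looks like with plain
unittest, here is a minimal sketch. It is not the mechanism the mpi4py
packaging actually uses: the environment variable EXAMPLE_SKIP_SPAWN,
the helper names and the placeholder test class are all hypothetical,
and the test body stands in for the real COMM.Spawn call.

```python
import io
import os
import unittest

REASON = "MPI_Comm_spawn fails with openmpi 5.0.7-1, see Bug#1100120"


def skip_spawn(env=None):
    """Return a decorator that skips when EXAMPLE_SKIP_SPAWN=1.

    EXAMPLE_SKIP_SPAWN is a hypothetical variable used only in this sketch.
    """
    env = os.environ if env is None else env
    return unittest.skipIf(env.get("EXAMPLE_SKIP_SPAWN") == "1", REASON)


def make_case(env):
    """Build a placeholder test class decorated according to env."""

    @skip_spawn(env)
    class TestSpawnPlaceholder(unittest.TestCase):
        def testNoArgs(self):
            # Stand-in for the real test body, which calls COMM.Spawn(...)
            self.assertTrue(True)

    return TestSpawnPlaceholder


def run(case):
    """Run one test class quietly and return its TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


skipped = run(make_case({"EXAMPLE_SKIP_SPAWN": "1"}))
executed = run(make_case({}))
print(len(skipped.skipped), len(executed.skipped))
```

The skip reason carries the bug number so that the skip is easy to find
and revert once the issue is fixed.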



-- System Information:
Debian Release: trixie/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.12.17-amd64 (SMP w/8 CPU threads; PREEMPT)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8), LANGUAGE=en_AU:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages libopenmpi-dev depends on:
ii  gfortran [gfortran-mod-15]     4:14.2.0-1
ii  gfortran-11 [gfortran-mod-15]  11.5.0-2
ii  gfortran-12 [gfortran-mod-15]  12.4.0-4
ii  gfortran-13 [gfortran-mod-15]  13.3.0-12
ii  gfortran-14 [gfortran-mod-15]  14.2.0-17
ii  libevent-dev                   2.1.12-stable-10+b1
ii  libhwloc-dev                   2.12.0-1
ii  libibverbs-dev                 56.0-2
ii  libjs-jquery                   3.6.1+dfsg+~3.5.14-1
ii  libjs-jquery-ui                1.13.2+dfsg-1
ii  libopenmpi40                   5.0.7-1
ii  libpmix-dev                    5.0.6-5
ii  openmpi-bin                    5.0.7-1
ii  openmpi-common                 5.0.7-1
ii  zlib1g-dev                     1:1.3.dfsg+really1.3.1-1+b1

Versions of packages libopenmpi-dev recommends:
ii  libcoarrays-openmpi-dev  2.10.2+ds-4

Versions of packages libopenmpi-dev suggests:
pn  openmpi-doc  <none>

-- no debconf information


