Bug#1100120: libopenmpi-dev: mpi4py spawn tests fail with MPI_ERR_UNKNOWN

Drew Parsons dparsons at debian.org
Wed Mar 12 02:27:08 GMT 2025


Package: libopenmpi-dev
Followup-For: Bug #1100120
Control: retitle -1 libopenmpi-dev: mpi4py spawn tests fail with MPI_ERR_UNKNOWN
Control: severity -1 normal

The OPAL ERROR or libucs segfault at the end of the tests can be
avoided by setting OMPI_MCA_btl_tcp_if_include=lo
(that's with OMPI_ not PRTE_ prefix, see Bug#1087784 for hypre)

Is it expected that we need to set
OMPI_MCA_btl_tcp_if_include=lo in debian/tests (and debian/rules)?
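
As a minimal sketch of how a Python test driver could apply that
setting, assuming the variable only needs to be present in the
environment before MPI is initialised (the helper name
with_loopback_tcp is hypothetical, not part of mpi4py or openmpi):

```python
import os


def with_loopback_tcp(env=None):
    """Return a copy of env with the TCP BTL restricted to loopback.

    Hypothetical helper for illustration only. Open MPI reads MCA
    parameters from environment variables carrying the OMPI_MCA_ prefix.
    """
    new_env = dict(os.environ if env is None else env)
    # setdefault keeps any value that the caller has already chosen.
    new_env.setdefault("OMPI_MCA_btl_tcp_if_include", "lo")
    return new_env
```

The returned mapping could then be passed as env= to subprocess.run()
when launching the test suite.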

But that still leaves the spawn test failures with MPI_ERR_UNKNOWN
(and the PMIx errors), so I'm updating the bug title accordingly.

Looking into the history of spawn tests, spawn is known to give
problems.  Some problems with mpich were dealt with previously,
https://github.com/pmodels/mpich/issues/7073
https://github.com/mpi4py/mpi4py/issues/541

Indeed mpi4py has assumed that openmpi spawn tests fail, and has an
explicit skip that checks the openmpi version in test_spawn.py.

This skip condition was recently changed. Previously spawn tests were
skipped for all openmpi 5.0.x versions with
  @unittest.skipMPI('openmpi(>=5.0.0,<5.1.0)', skip_spawn())
Recently that was relaxed (https://github.com/mpi4py/mpi4py/pull/601)
to
  @unittest.skipMPI('openmpi(>=5.0.0,<5.0.7)', skip_spawn())
  
And Debian's openmpi has just been updated to 5.0.7. So mpi4py is now
running the spawn tests where previously it was skipping them,
resulting in the error reported here.

I'm not sure whether mpi4py upstream meant the condition to say "<=5.0.7"
rather than "<5.0.7", or was mistaken in thinking that spawn now works
properly in openmpi 5.0.7. Either way, we can see that it is not working,
so mpi4py will want to continue skipping spawn tests for now.
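
The difference between the two bounds can be sketched as follows.
This is only an illustration of the range semantics, not mpi4py's
actual skipMPI implementation; parse_version and in_range are
hypothetical helpers.

```python
def parse_version(text):
    """Turn a dotted version string such as '5.0.7' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower, upper, upper_inclusive=False):
    """Check lower <= version < upper (or <= upper if upper_inclusive)."""
    v, lo, hi = (parse_version(s) for s in (version, lower, upper))
    if upper_inclusive:
        return lo <= v <= hi
    return lo <= v < hi


# Old condition, openmpi(>=5.0.0,<5.1.0): 5.0.7 falls inside, tests skipped.
old_skip = in_range("5.0.7", "5.0.0", "5.1.0")
# New condition, openmpi(>=5.0.0,<5.0.7): 5.0.7 falls outside, tests run.
new_skip = in_range("5.0.7", "5.0.0", "5.0.7")
# A "<=5.0.7" bound would have kept the skip in place for 5.0.7.
inclusive_skip = in_range("5.0.7", "5.0.0", "5.0.7", upper_inclusive=True)
```

Here old_skip and inclusive_skip are True while new_skip is False,
which matches the change in behaviour seen with openmpi 5.0.7.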

In summary, this is a known issue in openmpi, so I'm downgrading the
severity to normal.


