Bug#1101686: mpich: triggers test errors: MPII_init_gpu(51)....: gpu_init failed

Drew Parsons dparsons at debian.org
Tue Apr 1 11:37:41 BST 2025


Package: mpich
Version: 4.3.0-4
Followup-For: Bug #1101686
Control: affects -1 src:armci-mpi src:bagel src:eztrace

It looks like there might have been a problem with the 4.3.0-4 upload.

The changelog says it disabled GPU (HIP) support, but it looks like the
wrong bug numbers might have been given: this bug is #1101686, not
#1101868 (the 6 and 8 got swapped), and the other mpich/gpu bug is
#1101628, not #11101728.

In any case the error is still there with mpich 4.3.0-4.
Does something else need to be done to disable GPU support?
(Or alternatively, to manage GPU support: it's nice to provide it in
principle, but it would need to be switched off by default and only
activated when explicitly requested. Unless there's some patch that the
client packages should be applying.)
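
One way to check whether the installed build still has HIP compiled in
is to look at the configure line that mpichversion prints. A rough
sketch follows. The has_hip helper is made up for illustration, and it
assumes HIP is enabled through a --with-hip configure option, which may
not match how the package is actually configured.

```shell
#!/bin/sh
# Sketch only. has_hip is a hypothetical helper, not part of mpich.
# Assumption: HIP support shows up as a --with-hip[=PATH] configure option.

has_hip() {
    # Read configure text on stdin, split it into one option per line,
    # and succeed only if --with-hip is present and not set to "no".
    tr -d "'" | tr -s ' \t' '\n\n' \
        | grep -E -- '^--with-hip(=.*)?$' \
        | grep -qv -- '=no$'
}

if command -v mpichversion >/dev/null 2>&1; then
    if mpichversion | has_hip; then
        echo "mpich configure line mentions HIP support"
    else
        echo "no active --with-hip option in mpich configure line"
    fi
else
    echo "mpichversion not found, nothing checked"
fi
```

If HIP turns out to be still compiled in, setting MPIR_CVAR_ENABLE_GPU=0
in the test environment may be worth trying as a runtime switch. I have
not verified that it avoids this particular error.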

The affected packages are
armci-mpi
bagel
eztrace (autopkgtest command3)

They all have short tests (<10 minutes).

bagel might be simplest (shortest) to test.

Only command3 is failing in eztrace. It will be short too (without the
distraction of the other tests) if you run it alone:
  autopkgtest --test-name=command3

There is more to understand here. Why is it only these 3 packages
failing?  In eztrace, why is it only command3 that fails?

nwchem uses armci-mpi, so its failure might only be secondary. I expect
it will pass once armci-mpi is passing.


