Bug#1102612: mpich 4.3 not initialising multiple processes
Drew Parsons
dparsons at debian.org
Fri Apr 11 14:48:31 BST 2025
Package: mpich
Version: 4.3.0-5
Followup-For: Bug #1102612
One data point that might be relevant: the bug reported here manifests
in the amd64 and arm64 tests of armci-mpi, but the armci-mpi tests pass
cleanly on the 32-bit arches (armhf and the others).
I note that the build-time mpich errors for armci-mpi, seen in
https://buildd.debian.org/status/fetch.php?pkg=armci-mpi&arch=amd64&ver=0.4-5&stamp=1744327219&raw=0
refer to UCX:
FAIL: benchmarks/ping-pong
==========================
[1744327153.607644] [sbuild:19884:0] sock.c:513 UCX WARN unable to read somaxconn value from /proc/sys/net/core/somaxconn file
[0] ARMCI Error: This benchmark should be run on at least two processes
That suggests to me the issue might be in one of the auxiliary
libraries, e.g. libucx0. It may be a red herring, though: my trivial
test fails but does not trigger a UCX warning.
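One way to check whether the UCX warning really accompanies every failure
would be to scan the test log and list the UCX WARN lines under each FAIL
header. The following is only a minimal sketch: the function name and the
regular expressions are illustrative assumptions, and it assumes the log
lists each failed test under a "FAIL: <name>" header as in the excerpt above.

```python
import re

# Assumed log layout: a "FAIL: <test>" header followed by that test's output.
FAIL_RE = re.compile(r"^FAIL: (\S+)")
UCX_RE = re.compile(r"\bUCX\s+WARN\b")


def ucx_warnings_by_failure(log_text):
    """Map each failed test name to the UCX WARN lines that follow its header."""
    result = {}
    current = None
    for line in log_text.splitlines():
        m = FAIL_RE.match(line)
        if m:
            # Start collecting for a new failed test.
            current = m.group(1)
            result.setdefault(current, [])
        elif current is not None and UCX_RE.search(line):
            result[current].append(line.strip())
    return result


# Sample taken from the log excerpt quoted above.
SAMPLE = """\
FAIL: benchmarks/ping-pong
==========================
[1744327153.607644] [sbuild:19884:0] sock.c:513 UCX WARN unable to read somaxconn value from /proc/sys/net/core/somaxconn file
[0] ARMCI Error: This benchmark should be run on at least two processes
"""

if __name__ == "__main__":
    for test, warnings in ucx_warnings_by_failure(SAMPLE).items():
        print(test, len(warnings))
```

A failed test that maps to an empty list would be a case like the trivial
test: it fails without any UCX warning, which would point away from libucx0.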
One other point: I mentioned that my first run of the trivial test
"passed", emitting pmix warnings. I can reproduce that by running
the mpich test using mpiexec.openmpi instead of mpiexec.mpich. Of
course one should not launch it with the wrong MPI that way. I think I
must have used mpiexec instead of mpiexec.mpich in that first test.
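To avoid picking up the wrong launcher by accident, one could check what
a plain mpiexec actually resolves to before running a test. The sketch
below follows the symlink chain of whatever is first on PATH and guesses
the MPI flavour from the final file name. The function names are
hypothetical, and the name matching (openmpi, orterun, prterun, mpich,
hydra) is a heuristic assumption, not something confirmed in this report.

```python
import os
import shutil


def resolve_launcher(name="mpiexec", path=None):
    """Return the real file that `name` resolves to on PATH, or None if absent."""
    found = shutil.which(name, path=path)
    if found is None:
        return None
    # Follow every symlink hop (for example an alternatives-managed link).
    return os.path.realpath(found)


def launcher_flavour(real_path):
    """Guess the MPI flavour from the resolved file name (naming heuristic only)."""
    base = os.path.basename(real_path or "")
    if "openmpi" in base or base in ("orterun", "prterun"):
        return "openmpi"
    if "mpich" in base or "hydra" in base:
        return "mpich"
    return "unknown"


if __name__ == "__main__":
    real = resolve_launcher("mpiexec")
    print(real, launcher_flavour(real))
```

If the reported flavour does not match the MPI the test binary was built
against, calling the suffixed launcher explicitly (mpiexec.mpich here)
sidesteps the ambiguity.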