Issue finding topology library on debian 11

stephane.gouache at orange.com
Mon Dec 18 14:09:00 GMT 2023


Dear maintainers,


I recently ran into an issue running our software (https://github.com/KhiopsML/khiops) on Debian 11.
It is an AutoML suite that uses MPICH to parallelize its computations. It works fine on a variety of platforms and OSes, but on Debian 11 the MPICH installation seems to prevent our software from working correctly.


In fact, after some work trying to clean up the installation, we found that the issue occurs even with MPICH installed on its own.

On a fresh Debian 11 container (started with docker run debian:11):


docker run -it debian:11 bash

root@67506d5ed8a1:/# apt-get update && apt install mpich

.....

Setting up libmpich-dev:amd64 (3.4.1-5~deb11u1) ...
update-alternatives: using /usr/include/x86_64-linux-gnu/mpich to provide /usr/include/x86_64-linux-gnu/mpi (mpi-x86_64-linux-gnu) in auto mode
Processing triggers for libc-bin (2.31-13+deb11u7) ...
root@67506d5ed8a1:/#
root@67506d5ed8a1:/# mpiexec -bind-to hwthread -map-by core -n 5 ls   <-- of course the problem also occurs with a real MPI program!
[mpiexec@67506d5ed8a1] control_cb (pm/pmiserv/pmiserv_cb.c:206): assert (!closed) failed
[mpiexec@67506d5ed8a1] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
[mpiexec@67506d5ed8a1] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:160): error waiting for event
[mpiexec@67506d5ed8a1] main (ui/mpich/mpiexec.c:326): process manager error waiting for completion
root@67506d5ed8a1:/#

Sometimes the mpiexec command fails instead with:
[proxy:0:0@67506d5ed8a1] HYDT_topo_bind (tools/topo/topo.c:89): no topology library available

Indeed, running "mpiexec -h" reports that no topology library was detected.
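
For anyone wanting to script this check, here is a minimal sketch. It assumes that Hydra's "mpiexec -info" prints its build details with a line of the form "Topology libraries available: <names>"; that label is an assumption and may differ between MPICH versions. The helper names topo_libs and has_topo_lib are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read Hydra build details on stdin and print whatever
# follows "Topology libraries available:" (assumed label, may vary by version).
topo_libs() {
    sed -n 's/^[[:space:]]*Topology libraries available:[[:space:]]*//p'
}

# Hypothetical helper: succeed only if at least one topology library is listed.
has_topo_lib() {
    [ -n "$(topo_libs)" ]
}
```

With MPICH installed, this could be used as "mpiexec -info | has_topo_lib || echo 'no topology library detected'". If the line is missing or empty, binding options such as -bind-to would be expected to fail as shown above.
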
Any help moving this issue forward would be greatly appreciated.
Thanks a lot in advance, best regards,

Stéphane Gouache





More information about the debian-science-maintainers mailing list