Bug#1102612: mpich 4.3 not initialising multiple processes

Drew Parsons dparsons at debian.org
Fri Apr 11 16:14:39 BST 2025


Package: mpich
Followup-For: Bug #1102612

The ga (libglobalarrays-dev) build logs suggest that libucx0 might be
the problem, or at least part of it:

https://buildd.debian.org/status/fetch.php?pkg=ga&arch=amd64&ver=5.9.1-1&stamp=1744381093&raw=0

e.g. FAIL: global/testing/pgtest
  copy is OK 

 > Checking scatter/gather (might be slow)... 
[sbuild:90178:0:90178] ucp_request.c:212  Assertion `ucs_async_check_owner_thread(&(worker)->async)' failed
==== backtrace (tid:  90178) ====
 0  /lib/x86_64-linux-gnu/libucs.so.0(ucs_handle_error+0x2bc) [0x7fc32aa2564c]
 1  /lib/x86_64-linux-gnu/libucs.so.0(ucs_fatal_error_message+0xb6) [0x7fc32aa231f6]
 2  /lib/x86_64-linux-gnu/libucs.so.0(ucs_fatal_error_format+0x11a) [0x7fc32aa2331a]
 3  /lib/x86_64-linux-gnu/libucp.so.0(ucp_request_release+0x1a7) [0x7fc32aab7487]
 4  /lib/x86_64-linux-gnu/libmpich.so.12(+0x349457) [0x7fc32b3ff457]
 5  /lib/x86_64-linux-gnu/libmpich.so.12(+0x349685) [0x7fc32b3ff685]
 6  /lib/x86_64-linux-gnu/libucp.so.0(ucp_am_rndv_process_rts+0x17f) [0x7fc32aa9fb9f]
 7  /lib/x86_64-linux-gnu/libucp.so.0(ucp_rndv_rts_handler+0xbd) [0x7fc32ab20b2d]
 8  /lib/x86_64-linux-gnu/libuct.so.0(+0x21fa9) [0x7fc32a93cfa9]
 9  /lib/x86_64-linux-gnu/libuct.so.0(uct_self_ep_am_bcopy+0x7e) [0x7fc32a93d6ce]
10  /lib/x86_64-linux-gnu/libucp.so.0(ucp_am_rndv_proto_progress+0x468) [0x7fc32aa8b1a8]
11  /lib/x86_64-linux-gnu/libucp.so.0(ucp_am_send_nbx+0x9aa) [0x7fc32aa9b01a]
12  /lib/x86_64-linux-gnu/libmpich.so.12(+0x17d593) [0x7fc32b233593]
13  /lib/x86_64-linux-gnu/libmpich.so.12(+0x17f378) [0x7fc32b235378]
14  /lib/x86_64-linux-gnu/libmpich.so.12(MPI_Get_accumulate+0x2fc) [0x7fc32b235e3c]
15  ./global/testing/pgtest.x(+0xcd40f) [0x55c644f5140f]
16  ./global/testing/pgtest.x(+0xd3693) [0x55c644f57693]
17  ./global/testing/pgtest.x(+0xd38ba) [0x55c644f578ba]
18  ./global/testing/pgtest.x(+0xd3aaf) [0x55c644f57aaf]
19  ./global/testing/pgtest.x(+0x929db) [0x55c644f169db]
20  ./global/testing/pgtest.x(+0xfa8e) [0x55c644e93a8e]
21  ./global/testing/pgtest.x(+0x111e5) [0x55c644e951e5]
22  ./global/testing/pgtest.x(+0x117c3) [0x55c644e957c3]
23  ./global/testing/pgtest.x(+0x4d05) [0x55c644e88d05]
24  /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7fc32aebcca8]
25  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7fc32aebcd65]
26  ./global/testing/pgtest.x(+0x4d31) [0x55c644e88d31]
=================================

Program received signal SIGABRT: Process abort signal.
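
To make the library crossings in a trace like this easier to see, here is a rough sketch that groups consecutive backtrace frames by shared object. It is only an illustration: parse_frames and library_runs are hypothetical helper names, not part of any existing tool, and the regular expression assumes the frame layout shown in the log above.

```python
import re

# Matches backtrace lines in the layout seen in the log above, e.g.
#  3  /lib/x86_64-linux-gnu/libucp.so.0(ucp_request_release+0x1a7) [0x7fc32aab7487]
FRAME_RE = re.compile(r"^\s*(\d+)\s+(\S+?)\((.*?)\)\s+\[0x[0-9a-f]+\]\s*$")


def parse_frames(text):
    """Return (index, library, symbol) tuples for every frame line in text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if not m:
            continue  # skip anything that is not a frame line
        index, path, location = m.groups()
        library = path.rsplit("/", 1)[-1]
        # The symbol is empty when the frame only carries an offset, e.g. "(+0x349457)".
        symbol = location.split("+", 1)[0]
        frames.append((int(index), library, symbol))
    return frames


def library_runs(frames):
    """Collapse consecutive frames from the same library into (library, count) runs."""
    runs = []
    for _, library, _ in frames:
        if runs and runs[-1][0] == library:
            runs[-1][1] += 1
        else:
            runs.append([library, 1])
    return [(library, count) for library, count in runs]


# Frames 3 to 6 copied from the backtrace above.
SAMPLE = """\
 3  /lib/x86_64-linux-gnu/libucp.so.0(ucp_request_release+0x1a7) [0x7fc32aab7487]
 4  /lib/x86_64-linux-gnu/libmpich.so.12(+0x349457) [0x7fc32b3ff457]
 5  /lib/x86_64-linux-gnu/libmpich.so.12(+0x349685) [0x7fc32b3ff685]
 6  /lib/x86_64-linux-gnu/libucp.so.0(ucp_am_rndv_process_rts+0x17f) [0x7fc32aa9fb9f]
"""

if __name__ == "__main__":
    for library, count in library_runs(parse_frames(SAMPLE)):
        print(f"{library}: {count} frame(s)")
```

Read this way, the trace appears to show MPI_Get_accumulate in libmpich entering libucp through ucp_am_send_nbx, a UCX handler (ucp_am_rndv_process_rts) calling back into libmpich, and libmpich then calling ucp_request_release, which is where the assertion fires.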


