Bug#982173: mpich breaks bagel autopkgtest: Internal error

Paul Gevers elbrus at debian.org
Sun Feb 7 06:44:12 GMT 2021


Source: mpich, bagel
Control: found -1 mpich/3.4.1-2
Control: found -1 bagel/1.2.2-1
Severity: serious
Tags: sid bullseye
X-Debbugs-CC: debian-ci at lists.debian.org
User: debian-ci at lists.debian.org
Usertags: breaks needs-update

Dear maintainer(s),

With a recent upload of mpich, the autopkgtest of bagel fails in testing
when it is run with the binary packages of mpich from unstable. It passes
when run with only packages from testing. In tabular form:

                       pass            fail
mpich                  from testing    3.4.1-2
bagel                  from testing    1.2.2-1
all others             from testing    from testing

I copied some of the output at the bottom of this report.

Currently this regression is blocking the migration of mpich to testing
[1]. Due to the nature of this issue, I filed this bug report against
both packages. Can you please investigate the situation and reassign the
bug to the right package?

More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=mpich

https://ci.debian.net/data/autopkgtest/testing/amd64/b/bagel/10269200/log.gz

running test case 'hf_sto3g_fci_dist'... Assertion failed in file ./src/mpid/ch4/netmod/include/../ofi/ofi_impl.h at line 316: MPIDI_OFI_global.max_order_war != 0
/lib/x86_64-linux-gnu/libmpich.so.12(MPL_backtrace_show+0x35) [0x7ff101b7a5c5]
/lib/x86_64-linux-gnu/libmpich.so.12(+0x3d41f4) [0x7ff101af11f4]
/lib/x86_64-linux-gnu/libmpich.so.12(+0x2df929) [0x7ff1019fc929]
/lib/x86_64-linux-gnu/libmpich.so.12(MPI_Raccumulate+0xaf3) [0x7ff1019fdb43]
BAGEL(+0x1175449) [0x55bce38aa449]
BAGEL(+0x117556a) [0x55bce38aa56a]
BAGEL(+0x2dad12e) [0x55bce54e212e]
BAGEL(+0x2da77e7) [0x55bce54dc7e7]
BAGEL(+0x1287971) [0x55bce39bc971]
BAGEL(+0x128cb48) [0x55bce39c1b48]
BAGEL(+0x7984da) [0x55bce2ecd4da]
BAGEL(+0x6d00cb) [0x55bce2e050cb]
BAGEL(+0x6d0600) [0x55bce2e05600]
BAGEL(+0x630a79) [0x55bce2d65a79]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7ff101251d0a]
BAGEL(+0x6cd3ca) [0x55bce2e023ca]
Abort(1) on node 0: Internal error
Abort(806445583) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Finalize: Other MPI error, error stack:
PMPI_Finalize(189)..............: MPI_Finalize failed
PMPI_Finalize(149)..............:
MPID_Finalize(702)..............:
MPIDI_OFI_mpi_finalize_hook(827):
destroy_vni_context(1079).......: OFI domain close failed (ofi_init.c:1079:destroy_vni_context:Device or resource busy)
FAILED.
