[Debichem-devel] Bug#1006788: Bug#1006788: bagel: autopkgtest failure with new mpich.
Michael Banck
mbanck at debian.org
Sun Nov 27 09:46:23 GMT 2022
Hi,
On Wed, Aug 17, 2022 at 10:25:38PM +0200, Paul Gevers wrote:
> Control: severity -1 serious
> Control: retitle -1 autopkgtest fails on hosts with lots of RAM/cores
>
> Hi,
>
> On Sun, 3 Apr 2022 19:42:42 +0200 Michael Banck <mbanck at debian.org> wrote:
> > Hrm, it seems that test case passed now on the latest upload:
> > https://ci.debian.net/data/autopkgtest/unstable/amd64/b/bagel/20573831/log.gz
> >
> > |Get:14 http://deb.debian.org/debian unstable/main amd64 libmpich12 amd64 4.0.1-1 [4,924 kB]
> > [...]
> > |running test case 'he3_svp_asd-dmrg'... PASSED.
> >
> > So I'm a bit at a loss about what's going on here, perhaps that test
> > case really is just flakey.
>
> Yes, this test looks flaky (I came here because it was blocking glibc). The
> good news is however, it seems related to the host that runs the test. I.e.
> the test fails on our beefy amd64 host (ci-worker13) with 64 cores and 256GB
> RAM, but seems to pass on the others.
>
> The error on s390x is the same by the way (that has 10 cores and 32GB RAM).
I can reproduce this again on my (amd64) development notebook.
If I downgrade mpich from 4.0.2 to 3.x, it passes fine:
|(unstable-amd64-sbuild)mba at curie:/tmp/autopkgtest.p02Sns/build.Osj/src$ dpkg -l | grep mpich
|ii libmpich12:amd64 3.4.1-5 amd64 Shared libraries for MPICH
|(unstable-amd64-sbuild)mba at curie:/tmp/autopkgtest.p02Sns/build.Osj/src$ ./debian/tests/testsuite.sh
|running test case 'he3_svp_asd-dmrg'... PASSED.
|All tests passed
|(unstable-amd64-sbuild)mba at curie:/tmp/autopkgtest.p02Sns/build.Osj/src$ dpkg -l | grep mpich
|ii libmpich12:amd64 4.0.2-2 amd64 Shared libraries for MPICH
|(unstable-amd64-sbuild)mba at curie:/tmp/autopkgtest.p02Sns/build.Osj/src$ ./debian/tests/testsuite.sh
|running test case 'he3_svp_asd-dmrg'... FAILED.
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * broadcast 0.00
| * dmrg block 0.00
| >> ** .. 0.17
|
| ===== Starting sweeps =====
|
| o convergence threshold: 1.0000e-08
| iter state sweep average sweep range dE average
| ERROR: EXCEPTION RAISED: dsyev/pdsyevd failed in Matrix
|1 tests failed
If I set BAGEL_NUM_THREADS as Graham suggests, it also passes, so I'll
upload that now:
|(unstable-amd64-sbuild)mba at curie:/tmp/autopkgtest.p02Sns/build.Osj/src$ BAGEL_NUM_THREADS=4 ./debian/tests/testsuite.sh
|running test case 'he3_svp_asd-dmrg'... PASSED.
|All tests passed
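For reference, a minimal sketch of how the variable could be set before
the test run. This is only an illustration, not necessarily what the
upload does: the default of 4 is simply the value tried above, and
keeping a value already set by the caller is an assumption of mine.

```shell
# Sketch only: cap the number of BAGEL threads before running the tests.
# The default of 4 is the value tried above; honouring a value that the
# caller has already exported is an assumption, not part of the package.
: "${BAGEL_NUM_THREADS:=4}"
export BAGEL_NUM_THREADS

echo "BAGEL_NUM_THREADS=$BAGEL_NUM_THREADS"

# The test suite would then be started as in the transcript above:
# ./debian/tests/testsuite.sh
```

With the variable unset in the calling environment, this exports
BAGEL_NUM_THREADS=4 to anything started afterwards; a value set by the
caller is left untouched.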
Michael