Bug#1003020: openblas breaks hypre autopkgtest on armhf: test times out after 2:47h
Drew Parsons
dparsons at debian.org
Sun Jan 16 13:10:21 GMT 2022
On 2022-01-14 19:49, Paul Gevers wrote:
> Hi,
>
> On 14-01-2022 18:16, Drew Parsons wrote:
>> The passing machine was ci-worker-armel-01, but it was also the host
>> for failing tests. So it's not as simple as some difference between
>> different CPU variants of armhf.
>
> We only have one host for our armhf testing. (We did change host in
> the middle of 2021).
>
> If I can be of any help, I can try to run explicitly given commands in
> or extract information from the testbed of a (passing) test run (I'm
> not sure if I can do the same from a failing test as that may be
> shortcutted somehow, but I can try).

Hard to know what to suggest. autopkgtest is still passing fine in a
chroot on abel (armhf porterbox).

The failure, when it happens, seems to happen in the superlu test. Hard
to say if that's strictly causal, but it is consistent. superlu
(5.3.0+dfsg1-2) hasn't changed since October. There was an upload of
superlu-dist 7.2.0+dfsg1-2 around the same time as the openblas upload.
But hypre in testing (and unstable) passes fine with that superlu-dist.
(hypre's superlu test actually tests superlu-dist: that's the -dslu_th
flag for the ./ij test program in src/test/TEST_superlu/sludist.jobs.)
And superlu-dist is passing its own tests with openblas/0.3.19. Perhaps
the superlu-dist tests use threading in a different way to the hypre
superlu tests?
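
One way to probe the threading question might be to run the same test
command with OpenBLAS limited to a given number of threads and under a
time limit, then compare runs. A rough sketch in Python, assuming the
installed OpenBLAS honours the OPENBLAS_NUM_THREADS environment
variable; the helper name run_with_threads and the command in the
example are placeholders of mine, not anything from hypre's test suite:

```python
import os
import subprocess
import sys


def run_with_threads(cmd, num_threads, timeout_s):
    """Run cmd with OPENBLAS_NUM_THREADS set, under a time limit.

    Returns ("finished", returncode) or ("timeout", None).
    run_with_threads is a hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    # OpenBLAS reads this variable to decide how many threads to use.
    env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    try:
        proc = subprocess.run(
            cmd,
            env=env,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        return ("timeout", None)
    return ("finished", proc.returncode)


if __name__ == "__main__":
    # Placeholder command: substitute the real test invocation here.
    placeholder_cmd = [sys.executable, "-c", "pass"]
    for n in (1, 4):
        print(n, run_with_threads(placeholder_cmd, n, timeout_s=60))
```

If the single-threaded run finishes while the multi-threaded one hits
the time limit, that would point towards the threading interaction
rather than superlu-dist itself, though it would not prove it.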

There was also an upload of glibc (libc6) 2.33 around the same time,
which provides pthreads. That makes me wonder if there was some change
in pthreads for armhf which leads to the instability.

It would be helpful to get hypre out of the NEW queue so we can test the
actual latest builds.