Bug#1003020: openblas breaks hypre autopkgtest on armhf: test times out after 2:47h

Drew Parsons dparsons at debian.org
Sun Jan 16 13:10:21 GMT 2022


On 2022-01-14 19:49, Paul Gevers wrote:
> Hi,
> 
> On 14-01-2022 18:16, Drew Parsons wrote:
>> The passing machine was ci-worker-armel-01, but it was also the host 
>> for failing tests.  So it's not as simple as some difference between 
>> different CPU variants of armhf.
> 
> We only have one host for our armhf testing. (We did change host in
> the middle of 2021).
> 
> If I can be of any help, I can try to run explicitly given commands in
> or extract information from the testbed of a (passing) test run (I'm
> not sure if I can do the same from a failing test as that may be
> shortcutted somehow, but I can try).


Hard to know what to suggest.  autopkgtest is still passing fine in 
chroot on abel (armhf porterbox).

The failure, when it happens, seems to happen in the superlu test. Hard 
to say if that's strictly causal, but it is consistent.  superlu 
(5.3.0+dfsg1-2) hasn't changed since October. There was an upload of 
superlu-dist 7.2.0+dfsg1-2 around the same time as the openblas upload. 
But hypre in testing (and unstable) passes fine with that superlu-dist 
(hypre's superlu test actually tests superlu-dist. That's the -dslu_th 
flag for the ./ij test program in src/test/TEST_superlu/sludist.jobs). 
And superlu-dist is passing its own tests with openblas/0.3.19. Perhaps 
the superlu-dist tests use threading in a different way to the hypre 
superlu tests?
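
One way to probe the threading question might be to rerun the failing test with the BLAS and OpenMP thread counts pinned to one and see whether the hang goes away. A minimal sketch only: run_single_threaded is a hypothetical helper name, and I'm assuming OPENBLAS_NUM_THREADS and OMP_NUM_THREADS are the relevant knobs here, which has not been confirmed for this failure.

```shell
#!/bin/sh
# Hypothetical helper (not part of hypre or autopkgtest): run the given
# command with OpenBLAS and OpenMP limited to a single thread each.
# Assumption: these two environment variables govern the threading
# involved in the timeout.
run_single_threaded() {
    env OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 "$@"
}

# Usage: prefix the ./ij command line taken from
# src/test/TEST_superlu/sludist.jobs with run_single_threaded.
```

If the test passes reliably single-threaded but still times out with the default thread count, that would point towards threading rather than superlu-dist itself.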

There was also an upload of glibc (libc6) 2.33 around the same time, 
which provides pthreads. That makes me wonder if there was some change 
in pthreads for armhf which leads to the instability.

It would be helpful to get hypre out of the NEW queue so we can test the 
actual latest builds.


