[Python-modules-team] Bug#448530: bug reproduced

Paul Metcalfe paul.metcalfe+debian at gmail.com
Fri Dec 21 12:36:44 UTC 2007


I don't know that the linkage issues are the direct cause of this bug,
but they're pretty dodgy and in the right sort of area.  Or is the
problem something really baroque like stack alignment?

The backtrace on my machine looks similar.

[snip]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7da18c0 (LWP 5659)]
0xb754bfb1 in ATL_dgemvT_a1_x1_b0_y1 () from /usr/lib/sse2/libf77blas.so.3
(gdb) bt
#0  0xb754bfb1 in ATL_dgemvT_a1_x1_b0_y1 () from /usr/lib/sse2/libf77blas.so.3
#1  0xb754463e in ATL_dgemv () from /usr/lib/sse2/libf77blas.so.3

I think we've worked out that the problem is manifesting in _dotblas,
the optimized version of numpy.dot that uses CBLAS.

The obvious question --- why is the crash in the middle of
libf77blas?  _dotblas should go straight to libcblas.so.
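
One way to check the linkage is to look at what the dynamic loader
resolves for the extension.  A rough sketch, assuming a Linux machine
with ldd on the PATH; the helper names (parse_ldd, blas_libs, ldd) are
made up for illustration, and the path to the built _dotblas shared
object has to be supplied by hand:

```python
import re
import subprocess

# Matches ldd lines of the form "soname => /resolved/path (0xaddr)".
# The address part is optional, so "soname => not found" also matches.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$")


def parse_ldd(output):
    """Return {soname: resolved_path} parsed from ldd-style text."""
    libs = {}
    for line in output.splitlines():
        m = _LDD_LINE.match(line)
        if m:
            libs[m.group(1)] = m.group(2)
    return libs


def blas_libs(libs):
    """Keep only entries whose soname mentions blas, atlas or lapack."""
    keys = ("blas", "atlas", "lapack")
    return {name: path for name, path in libs.items()
            if any(k in name.lower() for k in keys)}


def ldd(path):
    """Run ldd on a shared object and return its output as text."""
    return subprocess.check_output(["ldd", path]).decode()
```

Calling blas_libs(parse_ldd(ldd(path))) on the extension lists the BLAS
libraries it resolves to.  ldd also reports indirect dependencies, so
seeing libf77blas.so.3 there would be consistent with the backtrace but
would not by itself show who pulled it in.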

There is a filthy workaround... don't build _dotblas, in which case
numpy will fall back on slow internal code.
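
To see whether a given numpy install actually has the optimized module,
one can simply try to import it.  A minimal sketch, assuming the module
lives at numpy.core._dotblas (true of numpy releases of this era as far
as I know) and a Python new enough to have importlib; has_module is a
hypothetical helper name:

```python
import importlib


def has_module(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True
```

has_module("numpy.core._dotblas") returning False would mean the
extension is not importable, so numpy.dot would be using the internal
fallback.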

But I have yet to work out how to make distutils do anything it
doesn't want to (or not do anything it wants to).

Is there likely to be a new version of ATLAS & lapack any time soon?
Is it still fouled up in the gfortran transition?  ATLAS 3.8 ought to
be considerably easier to handle, since the upstream build process now
does all the shared library stuff on its own.

To the OP: the quick solution is to build your own atlas, lapack & numpy. :-(

Ondrej: did you have a look at the compiler flags that numpy can
choose to use?  Is there another latent bug here (manifesting in
anything that uses numpy.distutils)?

-- 
pdm




