Bug#849150: patch proposal

Ximin Luo infinity0 at debian.org
Sun Jul 23 11:16:00 UTC 2017


On Tue, 11 Jul 2017 13:24:02 +0200 Frédéric Bonnard <frediz at linux.vnet.ibm.com> wrote:
> Tags: patch
> User: debian-powerpc at lists.debian.org
> Usertags: ppc64el 
> 
> --
> 
> Hi,
> it seems that too much space is taken up by different libraries
> in the static TLS space. I contacted some people from the toolchain,
> especially Alan Modra, who seems to confirm that:
> "If sagemath is dlopen'ing libraries, one of which is libgomp or has a
> dependency on libgomp, and the sagemath executable itself does not load
> libgomp at startup, then that would explain the error you're seeing."
> The Python binary has no direct dependency on libgomp:
> $ lddtree /usr/bin/python2.7
> python2.7 => /usr/bin/python2.7 (interpreter => /lib64/ld64.so.2)
>     libpthread.so.0 => /lib/powerpc64le-linux-gnu/libpthread.so.0
>         ld64.so.2 => /lib64/ld64.so.2
>     libdl.so.2 => /lib/powerpc64le-linux-gnu/libdl.so.2
>     libutil.so.1 => /lib/powerpc64le-linux-gnu/libutil.so.1
>     libz.so.1 => /lib/powerpc64le-linux-gnu/libz.so.1
>     libm.so.6 => /lib/powerpc64le-linux-gnu/libm.so.6
>     libc.so.6 => /lib/powerpc64le-linux-gnu/libc.so.6
> 
> And also :
> sagemath-7.6/sage# LD_DEBUG=files
> PYTHONPATH=/build/sagemath-wDWVd1/sagemath-7.6/debian/build/usr/lib/python2.7/dist-packages
> /build/sagemath-wDWVd1/sagemath-7.6/sage/src/bin/sage --docbuild
> --no-pdf-links all html
> ...
>     [..]
> <error is just below>
> ...
> So the failure occurs while importing the Python module matrix_modn_dense_float.so.
> I therefore propose preloading libgomp, which looks good to Alan.
> 
> As Ximin explained, this workaround should not be applied to the
> documentation build only, as the import should trigger the error on the
> CLI as well. I therefore inserted an LD_PRELOAD export in sage-env, for
> ppc64el only. Here is a debdiff for you to review.
> I hope that will help,
> 
> [..]

Hi, thanks very much for the investigation and explanation! However, I am not sure this patch is the best approach. I also don't yet completely understand what is wrong, so I'm still guessing some things based on your explanation:

Firstly, your patch is for ppc64el, but the same error also occurs on arm64 and possibly on other platforms. We'll only know for sure after we get the right Build-Dependencies into Debian on those other platforms.

I don't think hardcoding architecture-specific exceptions is sustainable in the long term. What *aspect* of ppc64el requires this patch? Am I understanding correctly that dlopen(), for some reason, loads stuff into thread-local storage (TLS) instead of an area shared between all threads? And that this space is running out on ppc64el (and arm64)? Why doesn't it happen on amd64 / x86_64? This sounds like a bug in dlopen() or the threading library, or something else?
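
As a way of checking Alan's condition (the executable "does not load libgomp at startup") without assuming anything about the architecture, here is a minimal sketch that inspects the current process's memory map on Linux. The helper names mapped_libraries and is_loaded are hypothetical, and the sketch only reports what is mapped; it does not measure static TLS usage.

```python
import os
import re


def mapped_libraries(maps_text):
    """Return basenames of shared objects listed in /proc/<pid>/maps text."""
    libs = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            base = os.path.basename(fields[5].strip())
            # Matches names such as libc-2.24.so or libgomp.so.1.0.0
            if re.search(r"\.so(\.\d+)*$", base):
                libs.add(base)
    return libs


def is_loaded(name_prefix, maps_text):
    """True if any mapped shared object starts with name_prefix."""
    return any(lib.startswith(name_prefix) for lib in mapped_libraries(maps_text))


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as f:
        print("libgomp mapped:", is_loaded("libgomp", f.read()))
```

Run inside a fresh interpreter, this would show whether libgomp is present before any extension module is imported; running it again after the failing import would show what was pulled in via dlopen().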

Even if not, shouldn't it be possible to predict that the space will run out on any platform in a generic way, in order to raise the limit or to do this LD_PRELOAD workaround, in a cross-platform way?
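
One possible shape for a cross-platform variant of the LD_PRELOAD workaround is sketched below. It locates libgomp through the normal library search instead of keying on an architecture name. The function with_preload is hypothetical, the sketch does not predict whether the static TLS space will actually run out, and whether preloading unconditionally is acceptable remains an open question.

```python
import ctypes.util
import os


def with_preload(env, library):
    """Return a copy of env with library prepended to LD_PRELOAD (idempotent)."""
    new_env = dict(env)
    # ld.so accepts both spaces and colons as LD_PRELOAD separators.
    existing = new_env.get("LD_PRELOAD", "").replace(":", " ").split()
    if library not in existing:
        existing.insert(0, library)
    new_env["LD_PRELOAD"] = ":".join(existing)
    return new_env


def find_gomp():
    """Return the libgomp soname (e.g. 'libgomp.so.1'), or None if not found."""
    return ctypes.util.find_library("gomp")


if __name__ == "__main__":
    gomp = find_gomp()
    env = with_preload(os.environ, gomp) if gomp else dict(os.environ)
    print(env.get("LD_PRELOAD", ""))
```

The resulting environment could then be passed to whatever launches the interpreter, so that libgomp is loaded at startup rather than through a later dlopen().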

(For example, on rustc recently we had a nasty issue on ppc64el but the underlying reason was due to interaction between PAGESIZE and newer Linux kernel stack behaviour, and the workaround I wrote was conditioned on PAGESIZE rather than ppc64el specifically.)
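
To illustrate the idea of conditioning on a runtime property rather than an architecture name, here is a minimal sketch that branches on the page size. The baseline value and the needs_workaround helper are purely illustrative assumptions; the actual condition used in the rustc workaround is not given in this message.

```python
import os

# Illustrative baseline only: 4 KiB is the common page size on amd64.
BASELINE_PAGE_SIZE = 4096


def page_size():
    """Return the page size reported by the running kernel, in bytes."""
    return os.sysconf("SC_PAGESIZE")


def needs_workaround(size, baseline=BASELINE_PAGE_SIZE):
    """Hypothetical predicate: apply the workaround on large-page systems."""
    return size > baseline


if __name__ == "__main__":
    size = page_size()
    print("page size:", size, "workaround:", needs_workaround(size))
```

A check of this kind applies to any platform with the same property, so it keeps working if another architecture later shows the same behaviour.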

Finally, ideally we would push the patch upstream, though testing it out in Debian first would be good, as I think we have easier access to some platforms than upstream does. However, I'd expect the chances of Sage accepting a ppc64el-specific patch to be very slim. And this one has a DEB_* variable in it.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git



More information about the debian-science-maintainers mailing list