Bug#883615: Acknowledgement ([CRITICAL] Stretch p-u 9.3 breaks NVidia driver and X.org)
Aurelien Jarno
aurelien at aurel32.net
Sun Dec 17 13:56:04 UTC 2017
On 2017-12-17 10:10, Andreas Beckmann wrote:
> I dug further. An easier target for debugging is glxinfo, which can be further minimized to:
>
> #include <X11/Xlib.h>
> #include <GL/glx.h>
> #include <pthread.h>
> int main()
> {
> pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
> pthread_mutex_lock(&mutex);
> pthread_mutex_unlock(&mutex);
>
> Display * dpy ;
> dpy = XOpenDisplay ( NULL ) ;
>
> pthread_mutex_lock(&mutex);
> pthread_mutex_unlock(&mutex);
>
> int fbAttribSingle[] = {
> GLX_RENDER_TYPE, GLX_RGBA_BIT,
> GLX_RED_SIZE, 1,
> GLX_GREEN_SIZE, 1,
> GLX_BLUE_SIZE, 1,
> GLX_DOUBLEBUFFER, False,
> None };
> GLXFBConfig * configs ;
> int nConfigs ;
> configs = glXChooseFBConfig ( dpy , 0 , fbAttribSingle , & nConfigs ) ;
>
> pthread_mutex_lock(&mutex);
> pthread_mutex_unlock(&mutex);
> }
>
> (link with -lGL -lX11)
>
> that dies at some point in pthread_mutex_lock after several
> calls succeeded:
>
> (gdb) bt
> #0 0x00007ffff754b1d4 in pthread_mutex_lock (mutex=0x7ffff7001180 <dispatchLock>) at forward.c:192
> #1 0x00007ffff6dab007 in LockDispatch () at ../../../src/GLdispatch/GLdispatch.c:144
> #2 __glDispatchNewVendorID () at ../../../src/GLdispatch/GLdispatch.c:198
> #3 0x00007ffff702c3c2 in ?? () from /usr/lib/x86_64-linux-gnu/libGLX.so.0
> #4 0x00007ffff702d1ac in ?? () from /usr/lib/x86_64-linux-gnu/libGLX.so.0
> #5 0x00007ffff7026251 in glXChooseFBConfig () from /usr/lib/x86_64-linux-gnu/libGLX.so.0
> #6 0x0000555555554964 in main () at mwe.c:25
> (gdb) info shared
> From To Syms Read Shared Object Library
> 0x00007ffff7dd9aa0 0x00007ffff7df5340 Yes /lib64/ld-linux-x86-64.so.2
> 0x00007ffff7b745d0 0x00007ffff7b78c1b Yes (*) /usr/lib/x86_64-linux-gnu/libGL.so.1
> 0x00007ffff7812da0 0x00007ffff789a434 Yes (*) /usr/lib/x86_64-linux-gnu/libX11.so.6
> 0x00007ffff7475910 0x00007ffff759f403 Yes /lib/x86_64-linux-gnu/libc.so.6
> 0x00007ffff7252d80 0x00007ffff725394e Yes /lib/x86_64-linux-gnu/libdl.so.2
> 0x00007ffff7024a20 0x00007ffff702ef9d Yes (*) /usr/lib/x86_64-linux-gnu/libGLX.so.0
> 0x00007ffff6daabb0 0x00007ffff6dada37 Yes /usr/lib/x86_64-linux-gnu/libGLdispatch.so.0
> 0x00007ffff6b4fb40 0x00007ffff6b619f5 Yes (*) /usr/lib/x86_64-linux-gnu/libxcb.so.1
> 0x00007ffff6935700 0x00007ffff693f49f Yes (*) /usr/lib/x86_64-linux-gnu/libXext.so.6
> 0x00007ffff672f010 0x00007ffff672fc8c Yes (*) /usr/lib/x86_64-linux-gnu/libXau.so.6
> 0x00007ffff6529340 0x00007ffff652ac48 Yes (*) /usr/lib/x86_64-linux-gnu/libXdmcp.so.6
> 0x00007ffff63153d0 0x00007ffff63225df Yes (*) /lib/x86_64-linux-gnu/libbsd.so.0
> 0x00007ffff610c0e0 0x00007ffff610eecf Yes /lib/x86_64-linux-gnu/librt.so.1
> 0x00007ffff5ef2ab0 0x00007ffff5eff811 Yes /lib/x86_64-linux-gnu/libpthread.so.0
> 0x00007ffff5c00f00 0x00007ffff5c76291 Yes (*) /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
> 0x00007ffff59ab810 0x00007ffff59ad5a3 Yes (*) /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.375.82
> 0x00007ffff3ed7600 0x00007ffff4fbac77 Yes (*) /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.375.82
> 0x00007ffff38d7680 0x00007ffff39438da Yes /lib/x86_64-linux-gnu/libm.so.6
> (gdb) disassemble
> Dump of assembler code for function pthread_mutex_lock:
> 0x00007ffff754b1b0 <+0>: mov 0x2a957a(%rip),%eax # 0x7ffff77f4730 <__libc_pthread_functions_init>
> 0x00007ffff754b1b6 <+6>: test %eax,%eax
> 0x00007ffff754b1b8 <+8>: jne 0x7ffff754b1c0 <pthread_mutex_lock+16>
> 0x00007ffff754b1ba <+10>: xor %eax,%eax
> 0x00007ffff754b1bc <+12>: retq
> 0x00007ffff754b1bd <+13>: nopl (%rax)
> 0x00007ffff754b1c0 <+16>: mov 0x2a94c1(%rip),%rax # 0x7ffff77f4688 <__libc_pthread_functions+264>
> 0x00007ffff754b1c7 <+23>: ror $0x11,%rax
> 0x00007ffff754b1cb <+27>: xor %fs:0x30,%rax
> => 0x00007ffff754b1d4 <+36>: jmpq *%rax
>
> After finally understanding that the fs segment is used for addressing
> thread-local storage (TLS), I actually saw the difference in the linked libraries:
> /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.375.82 vs.
> /usr/lib/x86_64-linux-gnu/tls/libnvidia-tls.so.375.82
Oh, it's strange this library didn't show up in the ldd tests that
appear earlier in this bug report. I guess it is (indirectly) dlopened.
> From the documentation:
>
> The nvidia-tls libraries (/usr/lib/libnvidia-tls.so.384.98 and /usr/lib/tls/libnvidia-tls.so.384.98); these files provide thread local storage support for the NVIDIA OpenGL libraries (libGL, libnvidia-glcore, and libglx). Each nvidia-tls library provides support for a particular thread local storage model (such as ELF TLS), and the one appropriate for your system will be loaded at run time.
>
> and from the source code of nvidia-installer (which we don't use):
>
> "NVIDIA's OpenGL libraries are compiled with one of two "
> "different thread local storage (TLS) mechanisms: 'classic tls' "
> "which is used on systems with glibc 2.2 or older, and 'new tls' "
> "which is used on systems with tls-enabled glibc 2.3 or newer. "
>
Yes, exactly. The "new" TLS mechanism is implemented in NPTL (as opposed
to LinuxThreads) and requires at least a 2.6 kernel (as opposed to 2.4)
to work. The hardware capabilities mechanism was slightly abused to
export the kernel's ability to support NPTL. It was done that way
because Red Hat had backported all the NPTL support to its 2.4 kernel.
The Debian glibc package therefore provided two different libc builds,
selected according to the running kernel: one in /lib and the other in
/lib/tls. Debian dropped 2.4 kernel support in Lenny, and from then on
only the glibc with the "new" TLS mechanism was provided. As a
consequence, all packages stopped using the tls directory, since the new
mechanism was guaranteed to be supported. IIRC we made sure that all
libraries were moved out of the tls/ directory, but I guess we missed
the nvidia library as it was in non-free.
> So we probably shouldn't ship the classic ones at all and move the new
> ones to the regular library directory (nvidia seems to be the only package
> still shipping stuff in tls/)
Indeed, that is the correct fix. I am actually surprised that NVIDIA
still provides a library built for such an old glibc, and I wonder how
they build it.
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien at aurel32.net http://www.aurel32.net