Bug#883615: Acknowledgement ([CRITICAL] Stretch p-u 9.3 breaks NVidia driver and X.org)

Julien Aubin julien.aubin at gmail.com
Thu Dec 7 17:33:19 UTC 2017


Hi,

ldd output with the file /etc/ld.so.nohwcap present:
julien@pccorei7-4770:~$ ldd $(which glxgears)
       linux-vdso.so.1 (0x00007ffcc49c5000)
       libGLEW.so.2.0 => /usr/lib/x86_64-linux-gnu/libGLEW.so.2.0 (0x00007f9327cc6000)
       libGLU.so.1 => /usr/lib/x86_64-linux-gnu/libGLU.so.1 (0x00007f9327a57000)
       libGL.so.1 => /usr/lib/x86_64-linux-gnu/libGL.so.1 (0x00007f93277b3000)
       libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f93274af000)
       libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f932716f000)
       libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f9326f5d000)
       libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9326bbe000)
       libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f932683c000)
       libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f9326625000)
       libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9326421000)
       libGLX.so.0 => /usr/lib/x86_64-linux-gnu/libGLX.so.0 (0x00007f93261f1000)
       libGLdispatch.so.0 => /usr/lib/x86_64-linux-gnu/libGLdispatch.so.0 (0x00007f9325f23000)
       /lib64/ld-linux-x86-64.so.2 (0x00007f9328161000)
       libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f9325cfb000)
       libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f9325af7000)
       libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f93258f1000)
       libbsd.so.0 => /lib/x86_64-linux-gnu/libbsd.so.0 (0x00007f93256db000)
       librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f93254d3000)
       libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f93252b6000)

ldd output without the file /etc/ld.so.nohwcap:
julien@pccorei7-4770:~$ ldd $(which glxgears)
       linux-vdso.so.1 (0x00007ffce55b0000)
       libGLEW.so.2.0 => /usr/lib/x86_64-linux-gnu/libGLEW.so.2.0 (0x00007f90c7a69000)
       libGLU.so.1 => /usr/lib/x86_64-linux-gnu/libGLU.so.1 (0x00007f90c77fa000)
       libGL.so.1 => /usr/lib/x86_64-linux-gnu/libGL.so.1 (0x00007f90c7556000)
       libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f90c7252000)
       libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f90c6f12000)
       libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f90c6d00000)
       libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f90c6961000)
       libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f90c65df000)
       libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f90c63c8000)
       libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f90c61c4000)
       libGLX.so.0 => /usr/lib/x86_64-linux-gnu/libGLX.so.0 (0x00007f90c5f94000)
       libGLdispatch.so.0 => /usr/lib/x86_64-linux-gnu/libGLdispatch.so.0 (0x00007f90c5cc6000)
       /lib64/ld-linux-x86-64.so.2 (0x00007f90c7f04000)
       libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f90c5a9e000)
       libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f90c589a000)
       libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f90c5694000)
       libbsd.so.0 => /lib/x86_64-linux-gnu/libbsd.so.0 (0x00007f90c547e000)
       librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f90c5276000)
       libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f90c5059000)

I notice that the load addresses of the libraries differ between the two
runs, but that is probably due to ASLR. The resolved library paths are
identical in both cases.
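
To take ASLR out of the comparison, the two ldd outputs can be compared
after stripping the load addresses. The following is only a rough sketch:
the function names are my own, and it assumes the usual
"name => path (address)" layout of ldd output. It also tolerates listings
where the address has been wrapped onto its own line.

```python
import re

# Trailing load address as printed by ldd, e.g. " (0x00007f93274af000)".
_ADDRESS = re.compile(r"\s*\(0x[0-9a-fA-F]+\)\s*$")


def normalize_ldd(output):
    """Return the set of ldd lines with their load addresses removed."""
    entries = set()
    for line in output.splitlines():
        stripped = _ADDRESS.sub("", line.strip())
        # A wrapped line that held only an address becomes empty; skip it.
        if stripped:
            entries.add(stripped)
    return entries


def ldd_difference(with_file, without_file):
    """Return (only_with, only_without), ignoring load addresses."""
    a = normalize_ldd(with_file)
    b = normalize_ldd(without_file)
    return sorted(a - b), sorted(b - a)
```

Feeding it the two listings above should give two empty lists, since only
the addresses differ.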

Another thought: if the issue depends on which code paths are taken, we
should try other CPUs with the NVIDIA blob, because it might be a Haswell
hardware bug. What do you think? Ideally we would test on a non-Haswell
Intel CPU, both with and without the file.
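
So that results from different machines can be compared, each tester could
report the CPU model and feature flags along with the outcome. A small
sketch, assuming the x86 Linux /proc/cpuinfo layout with "model name" and
"flags" fields (the function name is hypothetical):

```python
def cpu_summary(cpuinfo_text):
    """Return (model name, set of flags) for the first CPU listed."""
    model = None
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "model name" and model is None:
            model = value.strip()
        elif key == "flags" and not flags:
            flags = set(value.split())
    return model, flags


# Example use on a live system (assumes /proc/cpuinfo is readable):
#     with open("/proc/cpuinfo") as handle:
#         model, flags = cpu_summary(handle.read())
#     print(model, sorted(flags))
```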

Regards,

2017-12-07 10:06 GMT+01:00 Aurelien Jarno <aurelien at aurel32.net>:

> control: reopen -1
> control: tag -1 - unreproducible
> control: retitle -1 nvidia-driver: crashes with /etc/ld.so.nohwcap
>
> On 2017-12-07 05:43, Julien Aubin wrote:
> > 2017-12-06 21:50 GMT+01:00 Aurelien Jarno <aurelien at aurel32.net>:
> >
> > > On 2017-12-06 19:39, Julien Aubin wrote:
> > > > Weird... this time I re-upgraded libc6 and things work fine... looks like
> > > > something went wrong during the install. And I cannot reproduce the issue
> > > > anymore... :'( WTF ???
> > >
> > > Hmm, a bug has been introduced in libc6 version 2.24-11+deb9u2, which in
> > > some conditions leaves the /etc/ld.so.nohwcap file in place instead of
> > > removing it just after the upgrade (see bug#883394). One of the conditions
> > > is to have libc6-i686 installed (while it can be safely removed), which
> > > seems to be your case.
> > >
> > > I consider this bug harmless as it should not deactivate anything now
> > > that the default libc is already i686 optimized. Also I don't see how it
> > > could trigger the issue you described. Anyway, better safe than sorry:
> > > could you please try to create this file with "touch /etc/ld.so.nohwcap"
> > > as root and see if it makes the issue reappear? Once the test is done
> > > you can then remove it.
> > >
> > > Thanks,
> > > Aurelien
> > >
> >
> >
> > Bingo! That was exactly it!
> >
> > If I re-create the file, glxgears crashes. When I remove it,
> > glxgears works fine.
> >
> > With GDB, here is the stack trace when I run glxgears:
> >
> > #0  0x00007ffff6b311a4 in pthread_mutex_lock (mutex=0x7ffff604e8c0) at forward.c:192
> > #1  0x00007ffff5de1308 in __glDispatchNewVendorID () from /usr/lib/x86_64-linux-gnu/libGLdispatch.so.0
> > #2  0x00007ffff60793c2 in ?? () from /usr/lib/x86_64-linux-gnu/libGLX.so.0
> > #3  0x00007ffff607a1ac in ?? () from /usr/lib/x86_64-linux-gnu/libGLX.so.0
> > #4  0x00007ffff6073170 in glXChooseVisual () from /usr/lib/x86_64-linux-gnu/libGLX.so.0
> > #5  0x000055555555779f in ?? ()
> > #6  0x0000555555555ae7 in ?? ()
> > #7  0x00007ffff6a5c2e1 in __libc_start_main (main=0x555555555970, argc=1, argv=0x7fffffffe638, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe628) at ../csu/libc-start.c:291
> > #8  0x000055555555646a in ?? ()
> >
>
> The libc6 package version 2.24-11+deb9u2 won't be in the next point
> release, so this issue won't be triggered. That said, there is clearly an
> issue in the nvidia package: it should behave the same with and
> without /etc/ld.so.nohwcap. I am therefore reopening this bug.
>
> One of the first steps to debug this issue would be to run ldd
> /usr/bin/glxgears with and without /etc/ld.so.nohwcap and compare the
> output.
>
> Aurelien
>
> --
> Aurelien Jarno                          GPG: 4096R/1DDD8C9B
> aurelien at aurel32.net                 http://www.aurel32.net
>