Bug#807244: libegl1-nvidia: Programs crash due to elision-unlock on skylake processor with nvidia driver 352.63-1 (experimental)

Andreas Beckmann anbe at debian.org
Mon Dec 7 22:26:18 UTC 2015


Dear libc maintainers,

we recently got a bug report regarding the TSX-NI / lock elision bug in
combination with the non-free nvidia driver (#807244). Since that is
supposed to be fixed with the libc in experimental (and now sid as
well), perhaps you could take a look at why this still happens.
Several forum posts report that "compiling glibc without
--enable-lock-elision" works around the issue.

A few ideas from my side, but since I don't have the hardware to test, I
cannot check anything:
* that specific CPU needs to be blacklisted / is incorrectly whitelisted
* nvidia utilizes a code path in libc that is not covered by the current
patch (and that code path is not used by any other application)
* nvidia calls something it shouldn't call directly ... thus
circumventing the runtime-disabling of the specific routines in libc6
* nvidia code does issue the problematic instructions itself (but the
backtrace points to libc, so this sounds unlikely)

Is there some way to check at runtime how lock elision is handled by
libc (on a concrete system)?
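
One partial check, sketched here under assumptions: as far as I know, glibc decides whether to enable elision at startup from the RTM CPU feature bit, and on Linux the same bit appears as the "rtm" flag in /proc/cpuinfo. The helper name tsx_flags is made up for this example. It only shows what the CPU advertises (after any microcode update), not what libc actually decided.

```python
import os


def tsx_flags(cpuinfo_text):
    """Return which of 'rtm' and 'hle' appear in the first 'flags' line.

    cpuinfo_text is the content of /proc/cpuinfo (Linux, x86 only).
    """
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Line format: "flags\t\t: fpu vme ... rtm ..."
            _, _, value = line.partition(":")
            return {"rtm", "hle"} & set(value.split())
    return set()


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        found = tsx_flags(f.read())
    print("TSX flags advertised by the CPU:", sorted(found) or "none")
```

If "rtm" is missing, a libc that keys its decision on that bit should not take the elision path at all; if it is present, whether elision is used still depends on how libc was built and patched.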

Andreas

On 2015-12-06 17:53, Jelle Haandrikman wrote:
> On a system with an Nvidia GTX 970, Intel Skylake i5-6600k running driver
> 352.63-1 (experimental) several programs crash due to TSX-NI / elision unlock.
> This affects sddm, unlocking kscreen, vlc and deleting files using dolphin.
> 
> Other people also have found this issue.
> http://www.phoronix.com/forums/forum/linux-graphics-x-org-drivers/nvidia-linux/825702-nvidia-s-latest-binary-driver-is-causing-problems-for-some-skylake-linux-users
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=800574 #800574
> https://devtalk.nvidia.com/default/topic/893325/newest-and-beta-linux-driver-causing-segmentation-fault-core-dumped-on-all-skylake-platforms/
> 
> Bug #800574 suggests disabling elision-unlock in glibc, which is already
> incorporated in experimental. This does not alleviate the issue. See the "steps
> to reproduce" below. The same bug suggests that the nvidia driver still has
> problems. I also run intel-microcode update, but that doesn't solve anything.

> Step to reproduce: gdb vlc
> output:
> (gdb) run
> Starting program: /usr/bin/vlc
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> VLC media player 2.2.1 Terry Pratchett (Weatherwax) (revision 2.2.1-0-ga425c42)
> 
> Program received signal SIGSEGV, Segmentation fault.
> __lll_unlock_elision (lock=0x7ffff26d0d08, private=0)
>     at ../sysdeps/unix/sysv/linux/x86/elision-unlock.c:29
> 29      ../sysdeps/unix/sysv/linux/x86/elision-unlock.c: No such file or
> directory.
> (gdb) bt
> #0  __lll_unlock_elision (lock=0x7ffff26d0d08, private=0)
>     at ../sysdeps/unix/sysv/linux/x86/elision-unlock.c:29
> #1  0x00007ffff247f26c in ?? () from /usr/lib/x86_64-linux-gnu/libEGL.so.1
> #2  0x00007ffff240fa22 in ?? () from /usr/lib/x86_64-linux-gnu/libEGL.so.1
> #3  0x00007fffffffd960 in ?? ()
> #4  0x00007ffff2493ea1 in ?? () from /usr/lib/x86_64-linux-gnu/libEGL.so.1
> #5  0x00007fffffffd960 in ?? ()
> #6  0x00007ffff7def59e in _dl_close_worker (map=<optimized out>,
> force=<optimized out>)
>     at dl-close.c:291
> Backtrace stopped: previous frame inner to this frame (corrupt stack?)
> 
> /usr/lib/x86_64-linux-gnu/libEGL.so.1 -> /usr/lib/x86_64-linux-gnu/nvidia/libEGL.so.1
> 
> "dmesg|grep pthread" result:
> breetai at mainbak:~$ dmesg |grep pthread
> [73330.105569] traps: vlc[16815] general protection ip:7f47ac388950
> sp:7ffe3908ad98 error:0 in libpthread-2.22.so[7f47ac376000+18000]
> [78860.282876] traps: dolphin[18294] general protection ip:7fc3b0c1b950
> sp:7ffd0a0828d8 error:0 in libpthread-2.22.so[7fc3b0c09000+18000]
> [90812.515421] traps: krunner[20723] general protection ip:7f930fa19950
> sp:7ffc9b5cd988 error:0 in libpthread-2.22.so[7f930fa07000+18000]
> [90826.164341] traps: akonadi_migrati[21161] general protection ip:7f33b7e39950
> sp:7fff9d61bef8 error:0 in libpthread-2.22.so[7f33b7e27000+18000]
> [92621.782318] traps: vlc[21962] general protection ip:7f4241467950
> sp:7ffd8fa98f68 error:0 in libpthread-2.22.so[7f4241455000+18000]
> breetai at mainbak:~$
> 
> 
> installed packages:
> System runs testing.
> 
> libc6:amd64         2.22-0experimental0 from experimental
> nvidia-driver       352.63-1            from experimental
> intel-microcode     3.20151106.1        from testing
> vlc                 2.2.1-5+b1          from testing


