Bug#944874: strange phenomenon with primus + nvidia-tesla

Andreas Beckmann anbe at debian.org
Sat Nov 16 18:22:51 GMT 2019


Source: nvidia-graphics-drivers-tesla
Version: 418.87.01-1
Severity: important
Control: submitter -1 patrice.duroux at igh.cnrs.fr

-------- Forwarded Message --------
Subject: strange phenomenon  with primus +  nvidia-tesla
Date: Fri, 15 Nov 2019 21:23:40 +0000
From: Patrice DUROUX <patrice.duroux at igh.cnrs.fr>
To: pkg-nvidia-devel at alioth-lists.debian.net <pkg-nvidia-devel at alioth-lists.debian.net>

Hi,

Since version 418.88 of the nvidia driver was removed from Debian, my laptop's
GPU, which is:
01:00.0 VGA compatible controller: NVIDIA Corporation GK208GLM [Quadro K610M] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev ff)
has not been supported by either the 390xx or the new 430xx packages.
So I decided to remove the whole primus + nvidia stack and go back to nouveau
(but I could not use primus with it; it may be unsupported for this device).

Recently nvidia-tesla was pushed into experimental, so I am trying to set up
primus + nvidia again. Starting from a fresh boot of Debian Sid without any
nvidia- or primus-related packages and then installing nvidia-tesla + primus,
I am able to get 'optirun glxgears -info' to work fine (after some manual
adjustments, such as setting the glx alternative correctly).

But after a reboot, it no longer works. Here is the log message:
[   56.224448] nvidia-nvlink: Nvlink Core is being initialized, major device number 241
[   56.224824] nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=none:owns=none
[   56.224847] NVRM: The NVIDIA GPU 0000:01:00.0
               NVRM: (PCI ID: 10de:12b9) installed in this system has
               NVRM: fallen off the bus and is not responding to commands.
[   56.224890] nvidia: probe of 0000:01:00.0 failed with error -1
[   56.224906] NVRM: The NVIDIA probe routine failed for 1 device(s).
[   56.224907] NVRM: None of the NVIDIA graphics adapters were initialized!
[   56.250709] nvidia-nvlink: Unregistered the Nvlink Core, major device number 241

Could there be some bad interaction with the HDA HDMI audio driver that the
kernel tries to set up?

nov. 15 21:50:40 hp-dark kernel: snd_hda_codec_hdmi hdaudioC1D0: Unable to sync register 0x4f0100. -5
nov. 15 21:50:40 hp-dark kernel: snd_hda_codec_hdmi hdaudioC1D0: HDMI: invalid ELD buf size -1
nov. 15 21:50:40 hp-dark kernel: snd_hda_codec_hdmi hdaudioC1D0: HDMI: invalid ELD buf size -1
nov. 15 21:50:40 hp-dark kernel: snd_hda_codec_hdmi hdaudioC1D0: HDMI: invalid ELD buf size -1
nov. 15 21:50:40 hp-dark kernel: snd_hda_codec_hdmi hdaudioC1D0: out of range cmd 0:4:707:ffffffff
nov. 15 21:50:40 hp-dark kernel: snd_hda_codec_hdmi hdaudioC1D0: out of range cmd 0:4:707:ffffffbf
nov. 15 21:50:40 hp-dark kernel: snd_hda_codec_hdmi hdaudioC1D0: out of range cmd 0:4:707:ffffffff
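
To check whether the two kinds of errors really show up together on a failing
boot, the kernel log can be tallied per category. The following is only a
minimal sketch: the regular expressions are copied from the messages quoted in
this report, while the category names and the classify() helper are
hypothetical and not part of any package.

```python
import re

# Patterns are taken from the log excerpts quoted in this report.
# The category names are hypothetical labels chosen for this sketch.
PATTERNS = {
    "nvrm_probe_failure": re.compile(
        r"NVRM: .*(fallen off the bus|probe routine failed)"
        r"|nvidia: probe of \S+ failed"
    ),
    "hda_hdmi_error": re.compile(
        r"snd_hda_codec_hdmi \S+: "
        r"(Unable to sync register|HDMI: invalid ELD buf size|out of range cmd)"
    ),
}


def classify(lines):
    """Count log lines per category; lines matching no pattern are ignored."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # a line is counted in at most one category
    return counts
```

The input could be, for example, the lines of 'journalctl -k -b' saved to a
file. Non-zero counts in both categories for the same boot would only show
that the errors coincide, not which one causes the other.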

I am currently able to reproduce this on my system: I remove everything,
reboot, install the necessary packages again, adjust the config a bit, and
get it to work with optirun. But after a reboot I lose this ability again.

I don't know how else I can help.

Thanks,
Patrice

-------- Forwarded Message --------
Subject: re: strange phenomenon with primus + nvidia-tesla
Date: Sat, 16 Nov 2019 15:22:37 +0100
From: Patrice Duroux <duroux.patrice at orange.fr>
To: pkg-nvidia-devel at alioth-lists.debian.net <pkg-nvidia-devel at alioth-lists.debian.net>


Some progress on this problem: 'optirun glxgears' now works even after several
reboots (it is not clear why, unless it is due to an update in sid since
yesterday). But now here is what I see in journalctl each time I close the
glxgears X11 window:

nov. 16 15:07:09 hp-dark systemd[1]: Created slice system-systemd\x2dcoredump.slice.
nov. 16 15:07:09 hp-dark systemd[1]: Started Process Core Dump (PID 3612/UID 0).
nov. 16 15:07:10 hp-dark kernel: nvidia-modeset: Unloading
nov. 16 15:07:10 hp-dark kernel: nvidia-nvlink: Unregistered the Nvlink Core, major device number 241
nov. 16 15:07:10 hp-dark kernel: bbswitch: disabling discrete graphics
nov. 16 15:07:10 hp-dark kernel: pci 0000:01:00.0: Refused to change power state, currently in D0
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129532] [ERROR][XORG] (EE)
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129551] [ERROR][XORG] (EE) Backtrace:
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129556] [ERROR][XORG] (EE) 0: /usr/lib/xorg/Xorg (OsLookupColor+0x139) [0x556c1ac5d2c9]
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129561] [ERROR][XORG] (EE) 1: /lib/x86_64-linux-gnu/libpthread.so.0 (funlockfile+0x50) [0x7f7885c6355f]
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129565] [ERROR][XORG] (EE) 2: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.418.87.01 (_nv015glcore+0x141a0) [0>
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129569] [ERROR][XORG] (EE) 3: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.418.87.01 (_nv015glcore+0x1437e) [0>
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129574] [ERROR][XORG] (EE) 4: /usr/lib/x86_64-linux-gnu/nvidia/libGL.so.1 (glXCreateNewContext+0x9c18) [0x7f788>
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129578] [ERROR][XORG] (EE)
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129582] [ERROR][XORG] (EE) Segmentation fault at address 0x8
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129585] [ERROR][XORG] (EE)
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129589] [ERROR][XORG] (EE) Caught signal 11 (Segmentation fault). Server aborting
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129593] [ERROR][XORG] (EE)
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129596] [ERROR][XORG] (EE)
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129601] [ERROR][XORG] (EE) Please also check the log file at "/var/log/Xorg.8.log" for additional information.
nov. 16 15:07:10 hp-dark bumblebeed[707]: [  231.129605] [ERROR][XORG] (EE)
nov. 16 15:07:10 hp-dark systemd-coredump[3613]: Process 3586 (Xorg) of user 0 dumped core.
                                                                                                  Stack trace of thread 3586:
                                                 #0  0x00007f7885ac8081 __GI_raise (libc.so.6)
                                                 #1  0x00007f7885ab3535 __GI_abort (libc.so.6)
                                                 #2  0x0000556c1ac5fdea OsAbort (Xorg)
                                                 #3  0x0000556c1ac65903 n/a (Xorg)
                                                 #4  0x0000556c1ac66769 FatalError (Xorg)
                                                 #5  0x0000556c1ac5d201 n/a (Xorg)
                                                 #6  0x00007f7885c63510 __restore_rt (libpthread.so.0)
                                                 #7  0x00007f7884189460 n/a (libnvidia-glcore.so.418.87.01)
                                                 #8  0x00007f788418963e n/a (libnvidia-glcore.so.418.87.01)
                                                 #9  0x00007f78852ab588 n/a (libGL.so.1)
                                                 #10 0x00007f78853220d5 n/a (libGL.so.1)
                                                 #11 0x00007f7886bc6965 _dl_fini (ld-linux-x86-64.so.2)
                                                 #12 0x00007f7885aca720 __run_exit_handlers (libc.so.6)
                                                 #13 0x00007f7885aca85a __GI_exit (libc.so.6)
                                                 #14 0x00007f7885ab4bc2 __libc_start_main (libc.so.6)
                                                 #15 0x0000556c1aaec67a _start (Xorg)
nov. 16 15:07:10 hp-dark systemd[1]: systemd-coredump@0-3612-0.service: Succeeded.
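
For reading a trace like the one above, the frame right after the signal
trampoline (__restore_rt) is the code that was running when the fault was
delivered. The following is a minimal sketch that pulls the frames out of the
systemd-coredump text and picks that frame; parse_frames() and
faulting_frame() are hypothetical helper names, and the frame format is
assumed to be exactly the one shown in this log.

```python
import re

# Matches frames as printed by systemd-coredump in the journal, e.g.
#   "#7  0x00007f7884189460 n/a (libnvidia-glcore.so.418.87.01)"
FRAME = re.compile(r"#(\d+)\s+0x([0-9a-f]+)\s+(\S+)\s+\(([^)]+)\)")


def parse_frames(text):
    """Return a list of (index, address, symbol, module) tuples."""
    return [
        (int(index), int(address, 16), symbol, module)
        for index, address, symbol, module in FRAME.findall(text)
    ]


def faulting_frame(frames, trampoline="__restore_rt"):
    """Return the frame that follows the signal trampoline, or None.

    That frame is the code interrupted by the signal (here SIGSEGV).
    """
    for position, frame in enumerate(frames):
        if frame[2] == trampoline and position + 1 < len(frames):
            return frames[position + 1]
    return None
```

Applied to the trace above, this selects frame #7 in
libnvidia-glcore.so.418.87.01, reached from _dl_fini (frame #11), i.e. while
Xorg was already running its exit handlers.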