cuda broken after upgrade (x86_64, Jessie, GTX980)
Alois Schloegl
alois.schloegl at ist.ac.at
Sun Feb 28 23:53:40 UTC 2016
After I tried to upgrade CUDA on a Debian Jessie machine with four GTX 980
cards, CUDA became unusable. nvidia-smi reports this error:
# nvidia-smi
Failed to initialize NVML: GPU access blocked by the operating system
A short CUDA test program also fails, because it does not find the GPU.
Here is some diagnostic information about the system.
First, the installed nvidia packages:
# dpkg -l|grep -i nvidia
ii  glx-alternative-nvidia     0.7.1~bpo8+1       amd64  allows the selection of NVIDIA as GLX provider
ii  libcublas6.5:amd64         6.5.19-3~bpo8+1    amd64  NVIDIA cuBLAS Library
ii  libcuda1:amd64             352.79-1~bpo8+1    amd64  NVIDIA CUDA Driver Library
ii  libcudart6.5:amd64         6.5.19-3~bpo8+1    amd64  NVIDIA CUDA Runtime Library
ii  libcufft6.5:amd64          6.5.19-3~bpo8+1    amd64  NVIDIA cuFFT Library
ii  libcufftw6.5:amd64         6.5.19-3~bpo8+1    amd64  NVIDIA cuFFTW Library
ii  libcuinj64-6.5:amd64       6.5.19-3~bpo8+1    amd64  NVIDIA CUINJ Library (64-bit)
ii  libcurand6.5:amd64         6.5.19-3~bpo8+1    amd64  NVIDIA cuRAND Library
ii  libcusparse6.5:amd64       6.5.19-3~bpo8+1    amd64  NVIDIA cuSPARSE Library
ii  libegl1-nvidia:amd64       352.79-1~bpo8+1    amd64  NVIDIA binary EGL libraries
ii  libgl1-nvidia-glx:amd64    352.79-1~bpo8+1    amd64  NVIDIA binary OpenGL libraries
ii  libgles1-nvidia:amd64      352.79-1~bpo8+1    amd64  NVIDIA binary OpenGL|ES 1.x libraries
ii  libgles2-nvidia:amd64      352.79-1~bpo8+1    amd64  NVIDIA binary OpenGL|ES 2.x libraries
ii  libnppc6.5:amd64           6.5.19-3~bpo8+1    amd64  NVIDIA Performance Primitives core runtime library
ii  libnppi6.5:amd64           6.5.19-3~bpo8+1    amd64  NVIDIA Performance Primitives for image processing runtime library
ii  libnpps6.5:amd64           6.5.19-3~bpo8+1    amd64  NVIDIA Performance Primitives for signal processing runtime library
ii  libnvidia-compiler:amd64   352.79-1~bpo8+1    amd64  NVIDIA runtime compiler library
ii  libnvidia-eglcore:amd64    352.79-1~bpo8+1    amd64  NVIDIA binary EGL core libraries
ii  libnvidia-ml1:amd64        352.79-1~bpo8+1    amd64  NVIDIA Management Library (NVML) runtime library
ii  libnvtoolsext1:amd64       6.5.19-3~bpo8+1    amd64  NVIDIA Tools Extension Library
ii  libnvvm2:amd64             6.5.19-3~bpo8+1    amd64  NVIDIA NVVM Library
ii  nvidia-alternative         352.79-1~bpo8+1    amd64  allows the selection of NVIDIA as GLX provider
ii  nvidia-cuda-dev            6.5.19-3~bpo8+1    amd64  NVIDIA CUDA development files
ii  nvidia-cuda-doc            6.5.19-3~bpo8+1    all    NVIDIA CUDA and OpenCL documentation
ii  nvidia-cuda-gdb            6.5.19-3~bpo8+1    amd64  NVIDIA CUDA Debugger (GDB)
ii  nvidia-cuda-mps            352.79-1~bpo8+1    amd64  NVIDIA CUDA Multi Process Service (MPS)
ii  nvidia-cuda-toolkit        6.5.19-3~bpo8+1    amd64  NVIDIA CUDA development toolkit
ii  nvidia-detect              352.79-1~bpo8+1    amd64  NVIDIA GPU detection utility
ii  nvidia-driver              352.79-1~bpo8+1    amd64  NVIDIA metapackage
ii  nvidia-driver-bin          352.79-1~bpo8+1    amd64  NVIDIA driver support binaries
ii  nvidia-installer-cleanup   20151021+1~bpo8+1  amd64  cleanup after driver installation with the nvidia-installer
ii  nvidia-kernel-common       20151021+1~bpo8+1  amd64  NVIDIA binary kernel module support files
ii  nvidia-kernel-dkms         352.79-1~bpo8+1    amd64  NVIDIA binary kernel module DKMS source
ii  nvidia-kernel-source       352.79-1~bpo8+1    amd64  NVIDIA binary kernel module source
ii  nvidia-kernel-support      352.79-1~bpo8+1    amd64  NVIDIA binary kernel module support files
ii  nvidia-modprobe            358.09-1~bpo8+1    amd64  utility to load NVIDIA kernel modules and create device nodes
ii  nvidia-opencl-common       352.79-1~bpo8+1    amd64  NVIDIA OpenCL driver
ii  nvidia-profiler            6.5.19-3~bpo8+1    amd64  NVIDIA Profiler for CUDA and OpenCL
ii  nvidia-settings            340.93-1~bpo8+1    amd64  tool for configuring the NVIDIA graphics driver
ii  nvidia-smi                 352.79-1~bpo8+1    amd64  NVIDIA System Management Interface
ii  nvidia-support             20151021+1~bpo8+1  amd64  NVIDIA binary graphics driver support files
ii  nvidia-vdpau-driver:amd64  352.79-1~bpo8+1    amd64  Video Decode and Presentation API for Unix - NVIDIA driver
ii  nvidia-xconfig             340.46-1           amd64  X configuration tool for non-free NVIDIA drivers
ii  xserver-xorg-video-nvidia  352.79-1~bpo8+1    amd64  NVIDIA binary Xorg driver
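Not every package in this list is at the same upstream version: the driver
packages are at 352.79, but nvidia-modprobe is at 358.09, nvidia-settings at
340.93 and nvidia-xconfig at 340.46-1. Whether that matters for the error is
not clear. A rough sketch for summarising the versions follows; it assumes
"package version" pairs on stdin, and the function name group_by_upstream is
made up for illustration. It cuts the version at the first "-", which is only
a crude approximation of the Debian upstream version.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): reads "package version"
# pairs on stdin and prints one line per upstream version, with the number
# of packages at that version and their names, most common version first.
# The upstream version is approximated as the text before the first "-";
# a version without "-" is used as a whole.
group_by_upstream() {
    awk '{
        split($2, parts, "-")            # "352.79-1~bpo8+1" -> "352.79"
        count[parts[1]]++
        names[parts[1]] = names[parts[1]] " " $1
    }
    END {
        for (v in count) print count[v], v ":" names[v]
    }' | sort -rn
}

# Usage on a Debian system (assumes dpkg-query is available):
#   dpkg-query -W -f='${Package} ${Version}\n' | grep -i nvidia | group_by_upstream
```

Versions that appear only once or twice at the bottom of the output are the
ones worth a second look.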
# nvidia-detect
Detected NVIDIA GPUs:
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 980] [10de:13c0] (rev a1)
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 980] [10de:13c0] (rev a1)
82:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 980] [10de:13c0] (rev a1)
83:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204 [GeForce GTX 980] [10de:13c0] (rev a1)
Checking card: NVIDIA Corporation GM204 [GeForce GTX 980] (rev a1)
Your card is only supported by the updated drivers from jessie-backports.
See http://backports.debian.org for instructions how to use backports.
It is recommended to install the
nvidia-driver/jessie-backports
package.
Checking card: NVIDIA Corporation GM204 [GeForce GTX 980] (rev a1)
Your card is only supported by the updated drivers from jessie-backports.
See http://backports.debian.org for instructions how to use backports.
It is recommended to install the
nvidia-driver/jessie-backports
package.
Checking card: NVIDIA Corporation GM204 [GeForce GTX 980] (rev a1)
Your card is only supported by the updated drivers from jessie-backports.
See http://backports.debian.org for instructions how to use backports.
It is recommended to install the
nvidia-driver/jessie-backports
package.
Checking card: NVIDIA Corporation GM204 [GeForce GTX 980] (rev a1)
Your card is only supported by the updated drivers from jessie-backports.
See http://backports.debian.org for instructions how to use backports.
It is recommended to install the
nvidia-driver/jessie-backports
package.
Installing the package as suggested changes nothing:
# apt-get install nvidia-driver/jessie-backports
Reading package lists... Done
Building dependency tree
Reading state information... Done
nvidia-driver is already the newest version.
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'nvidia-driver'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'libgl1-nvidia-glx' because of 'nvidia-driver'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'nvidia-alternative' because of 'libgl1-nvidia-glx'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'libegl1-nvidia' because of 'nvidia-driver'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'libnvidia-eglcore' because of 'libegl1-nvidia'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'nvidia-driver-bin' because of 'nvidia-driver'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'libnvidia-ml1' because of 'nvidia-driver-bin'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'xserver-xorg-video-nvidia' because of 'nvidia-driver'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'nvidia-vdpau-driver' because of 'xserver-xorg-video-nvidia'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'libgles1-nvidia' because of 'nvidia-driver'
Selected version '352.79-1~bpo8+1' (Debian Backports:jessie-backports [amd64]) for 'libgles2-nvidia' because of 'nvidia-driver'
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.
Still, nvidia-detect reports the same message as shown above, even though
the recommended package is already installed. This seems like a bug to me.
Do you have any recommendations on how to make CUDA and nvidia-smi usable
again?
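One check that might help narrow this down is whether the kernel module that
is currently loaded has the same version as the user-space libraries (352.79
for libcuda1 in the list above); after an upgrade the two can differ until the
module has been rebuilt and reloaded. This is only a sketch, not a confirmed
diagnosis. It assumes the loaded module reports its version in
/proc/driver/nvidia/version, and the function name first_version is purely
illustrative.

```shell
#!/bin/sh
# Hypothetical helper: print the first dotted version number (for example
# 352.79) found in the text on stdin, or nothing if there is none.
# Relies on "grep -o", which GNU grep on Debian provides.
first_version() {
    grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Usage on the affected machine (file and package are assumed to exist):
#   first_version < /proc/driver/nvidia/version                  # loaded kernel module
#   dpkg-query -W -f='${Version}\n' libcuda1 | first_version     # user-space library
```

If the two commands print different numbers, the module and the libraries
come from different driver releases.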
Best,
Alois
More information about the pkg-nvidia-devel
mailing list