cuda with gtx980 on Jessie

Alois Schloegl alois.schloegl at ist.ac.at
Fri Sep 11 17:03:00 UTC 2015


On 09/11/2015 06:18 PM, Luca Boccassi wrote:
> On Fri, 2015-09-11 at 16:22 +0200, Alois Schloegl wrote:
>> I removed all the foreign packages. The nvidia-cuda-toolkit 6.5 from
>> experimental did not solve the problem, but downgrading
>> nvidia-cuda-toolkit to the stable version and installing the drivers
>> from experimental works.
> 
> Happy that now it works! But it should also work with 6.5 from
> experimental. Have you tried it since removing the foreign packages?



Yes, I tested 6.5 even before I went back to 6.0. I tested a number
of things, and uninstalling the foreign packages was not really
straightforward either. So it is quite possible that something is still
misconfigured. In particular, I still see the following issues:


(1) A short CUDA test program containing
     cudaGetDeviceCount(&ngpu);
 works fine as root, but fails in user space. It can be compiled in
user space, but no GPU is detected.
Only after setting the suid bit does this work as expected.

(2)  When running nvidia-smi (as root), I still see the error:
  Failed to initialize NVML: GPU access blocked by the operating system

(3) When trying to run "nvidia-settings" in user space, I see the error:

    You do not appear to be using the NVIDIA X driver.
    Please edit your X configuration file (just run
    `nvidia-xconfig` as root), and restart the X server.

Running nvidia-xconfig as root and restarting X does not change anything.


Moreover, the configuration of the modules is not completely clear yet.
Specifically, the "rmmod" entry in /etc/modprobe.d/nvidia.conf seems
suspicious. Please also see the contents of the module configuration
files below, which might be relevant here.


/etc/modprobe.d/nvidia-blacklists-nouveau.conf:# You need to run "update-initramfs -u" after editing this file.
/etc/modprobe.d/nvidia-blacklists-nouveau.conf:
/etc/modprobe.d/nvidia-blacklists-nouveau.conf:# see #580894
/etc/modprobe.d/nvidia-blacklists-nouveau.conf:blacklist nouveau

/etc/modprobe.d/nvidia.conf:alias nvidia nvidia-current
/etc/modprobe.d/nvidia.conf:remove nvidia-current rmmod nvidia-uvm nvidia

/etc/modprobe.d/blacklist-nouveau.conf:blacklist nouveau
/etc/modprobe.d/blacklist-nouveau.conf:options nouveau modeset=0

/etc/modprobe.d/nvidia-kernel-common.conf:alias char-major-195* nvidia
/etc/modprobe.d/nvidia-kernel-common.conf:options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=44 NVreg_DeviceFileMode=0660
/etc/modprobe.d/nvidia-kernel-common.conf:# To enable FastWrites and Sidebus addressing, uncomment these lines
/etc/modprobe.d/nvidia-kernel-common.conf:# options nvidia NVreg_EnableAGPSBA=1
/etc/modprobe.d/nvidia-kernel-common.conf:# options nvidia NVreg_EnableAGPFW=1

/etc/modprobe.d/dkms.conf:# modprobe information used for DKMS modules
/etc/modprobe.d/dkms.conf:#
/etc/modprobe.d/dkms.conf:# This is a stub file, should be edited when needed,
/etc/modprobe.d/dkms.conf:# used by default by DKMS.

/etc/modules-load.d/modules.conf:# /etc/modules: kernel modules to load at boot time.
/etc/modules-load.d/modules.conf:#
/etc/modules-load.d/modules.conf:# This file contains the names of kernel modules that should be loaded
/etc/modules-load.d/modules.conf:# at boot time, one per line. Lines beginning with "#" are ignored.
/etc/modules-load.d/modules.conf:
/etc/modules-load.d/modules.conf:mlx4_ib
/etc/modules-load.d/modules.conf:ib_ipoib
/etc/modules-load.d/modules.conf:ib_umad
/etc/modules-load.d/modules.conf:svcrdma
/etc/modules-load.d/modules.conf:rdma_ucm
/etc/modules-load.d/modules.conf:rdma_cm
/etc/modules-load.d/modules.conf:ib_mthca
/etc/modules-load.d/modules.conf:xprtrdma
/etc/modules-load.d/modules.conf:#mlx4_en
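
As a rough illustration of what the NVreg_DeviceFile* options in
nvidia-kernel-common.conf listed above could mean for issue (1), here is
a minimal Python sketch. It assumes that the driver creates the
/dev/nvidia* nodes with exactly the owner, group, and mode given in that
options line, and that ordinary Unix permission checks then decide who
may open them. The helper names parse_nvreg_options and can_open_rw are
made up for this example. On Debian, GID 44 is normally the "video"
group, but that is an assumption worth checking with "getent group 44".

```python
import os


def parse_nvreg_options(line):
    """Parse an 'options nvidia KEY=VALUE ...' modprobe line into a dict."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "options" or tokens[1] != "nvidia":
        raise ValueError("not an 'options nvidia' line")
    opts = {}
    for tok in tokens[2:]:
        key, sep, value = tok.partition("=")
        if sep:  # keep only KEY=VALUE tokens
            opts[key] = value
    return opts


def can_open_rw(opts, uid, gids):
    """Return True if uid/gids would get read+write access to a device node
    created with the owner, group, and mode from the parsed options.

    Assumption: plain Unix owner/group/other checks, no ACLs or cgroups.
    """
    owner = int(opts["NVreg_DeviceFileUID"])
    group = int(opts["NVreg_DeviceFileGID"])
    mode = int(opts["NVreg_DeviceFileMode"], 8)  # mode is written in octal
    if uid == 0:
        return True  # root bypasses the permission bits
    if uid == owner:
        bits = (mode >> 6) & 7
    elif group in gids:
        bits = (mode >> 3) & 7
    else:
        bits = mode & 7
    return bits & 6 == 6  # need both read (4) and write (2)


# The options line as it appears in nvidia-kernel-common.conf above.
OPTIONS_LINE = ("options nvidia NVreg_DeviceFileUID=0 "
                "NVreg_DeviceFileGID=44 NVreg_DeviceFileMode=0660")

if __name__ == "__main__":
    opts = parse_nvreg_options(OPTIONS_LINE)
    ok = can_open_rw(opts, os.getuid(), set(os.getgroups()))
    print("current user may open the device nodes:", ok)
```

Under these assumptions, a non-root user who is not a member of group 44
would be denied access, which would be consistent with the test program
working only as root or with the suid bit set. It does not explain the
nvidia-smi error seen as root in (2).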


I appreciate your support with this.

Best,
  Alois
