Bug#1074350: nvidia-kernel-dkms: Trying to modprobe nvidia-peermem to use NCCL/RDMA/Infiniband with GPUs

Jeffrey Mark Siskind qobi at qobi.org
Thu Jun 27 13:13:39 BST 2024


   Next weekend with the next bookworm point release we will switch to the 
   535 driver series in bookworm. Packages are already available in 
   stable-proposed-updates.
   Please try that new version, and if the problem persists, we will dig 
   deeper.

Enclosed is the output with the 535 driver installed; the problem persists.

    Jeff (http://engineering.purdue.edu/~qobi)
---------------------------------------------------------------------------------
root@sapiencia:~# dmesg|tail
[   61.725203] input: Lenovo Bluetooth Mouse Keyboard as /devices/virtual/misc/uhid/0005:17EF:60EF.0002/input/input23
[   61.725269] input: Lenovo Bluetooth Mouse as /devices/virtual/misc/uhid/0005:17EF:60EF.0002/input/input24
[   61.725323] hid-generic 0005:17EF:60EF.0002: input,hidraw1: BLUETOOTH HID v0.15 Mouse [Lenovo Bluetooth Mouse] on ac:ed:5c:df:10:cd
[   62.841702] Bluetooth: hci0: Ignoring error of Inquiry Cancel command
[   86.493960] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[   86.524275] nvidia-uvm: Loaded the UVM driver, major device number 504.
[   98.932325] input: Lenovo Bluetooth Mouse as /devices/virtual/misc/uhid/0005:17EF:60EF.0003/input/input25
[   98.932492] input: Lenovo Bluetooth Mouse Keyboard as /devices/virtual/misc/uhid/0005:17EF:60EF.0003/input/input26
[   98.932577] input: Lenovo Bluetooth Mouse as /devices/virtual/misc/uhid/0005:17EF:60EF.0003/input/input27
[   98.932725] hid-generic 0005:17EF:60EF.0003: input,hidraw1: BLUETOOTH HID v0.15 Mouse [Lenovo Bluetooth Mouse] on ac:ed:5c:df:10:cd

root@sapiencia:~# modprobe nvidia-peermem
modprobe: ERROR: could not insert 'nvidia_current_peermem': Invalid argument
modprobe: ERROR: ../libkmod/libkmod-module.c:1047 command_do() Error running install command 'modprobe nvidia ; modprobe -i nvidia-current-peermem ' for module nvidia_peermem: retcode 1
modprobe: ERROR: could not insert 'nvidia_peermem': Invalid argument

root@sapiencia:~# dmesg|tail
[   61.725203] input: Lenovo Bluetooth Mouse Keyboard as /devices/virtual/misc/uhid/0005:17EF:60EF.0002/input/input23
[   61.725269] input: Lenovo Bluetooth Mouse as /devices/virtual/misc/uhid/0005:17EF:60EF.0002/input/input24
[   61.725323] hid-generic 0005:17EF:60EF.0002: input,hidraw1: BLUETOOTH HID v0.15 Mouse [Lenovo Bluetooth Mouse] on ac:ed:5c:df:10:cd
[   62.841702] Bluetooth: hci0: Ignoring error of Inquiry Cancel command
[   86.493960] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[   86.524275] nvidia-uvm: Loaded the UVM driver, major device number 504.
[   98.932325] input: Lenovo Bluetooth Mouse as /devices/virtual/misc/uhid/0005:17EF:60EF.0003/input/input25
[   98.932492] input: Lenovo Bluetooth Mouse Keyboard as /devices/virtual/misc/uhid/0005:17EF:60EF.0003/input/input26
[   98.932577] input: Lenovo Bluetooth Mouse as /devices/virtual/misc/uhid/0005:17EF:60EF.0003/input/input27
[   98.932725] hid-generic 0005:17EF:60EF.0003: input,hidraw1: BLUETOOTH HID v0.15 Mouse [Lenovo Bluetooth Mouse] on ac:ed:5c:df:10:cd

root@sapiencia:~# modprobe -v nvidia-peermem
install modprobe nvidia ; modprobe -i nvidia-current-peermem $CMDLINE_OPTS 
insmod /lib/modules/6.1.0-21-amd64/updates/dkms/nvidia-current-peermem.ko 
modprobe: ERROR: could not insert 'nvidia_current_peermem': Invalid argument
modprobe: ERROR: ../libkmod/libkmod-module.c:1047 command_do() Error running install command 'modprobe nvidia ; modprobe -i nvidia-current-peermem ' for module nvidia_peermem: retcode 1
modprobe: ERROR: could not insert 'nvidia_peermem': Invalid argument

root@sapiencia:~# modprobe -i -v nvidia-current-peermem
insmod /lib/modules/6.1.0-21-amd64/updates/dkms/nvidia-current-peermem.ko 
modprobe: ERROR: could not insert 'nvidia_current_peermem': Invalid argument

root@sapiencia:~# dpkg -l|fgrep nvidia-driver
ii  nvidia-driver                                               535.183.01-1~deb12u1                      amd64        NVIDIA metapackage
ii  nvidia-driver-bin                                           535.183.01-1~deb12u1                      amd64        NVIDIA driver support binaries
ii  nvidia-driver-libs:amd64                                    535.183.01-1~deb12u1                      amd64        NVIDIA metapackage (OpenGL/GLX/EGL/GLES libraries)
root@sapiencia:~#


