Bug#1074350: nvidia-kernel-dkms: Trying to modprobe nvidia-peermem to use NCCL/RDMA/Infiniband with GPUs
Jeffrey Mark Siskind
qobi at qobi.org
Thu Jul 4 05:19:28 BST 2024
> So I don't know why the module doesn't load.
> Any ideas?
I figured it out. doca-ofed (aka MLNX_OFED) needs to have
openibd.service running. openibd.service had failed because
opensmd.service was running. For some reason, stopping opensmd.service
hung, so I rebooted, and then nvidia-peermem loaded.
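
For anyone hitting the same thing, here is a minimal sketch of how the state described above could be checked. The unit names and the module name are the ones from this message. The helper names (unit_state, module_loaded, summarize) and the wording of the diagnostics are hypothetical, and the script assumes a systemd host with a readable /proc/modules. It only reports state; it does not stop or start anything.

```python
import subprocess


def unit_state(unit: str) -> str:
    """Return what `systemctl is-active UNIT` prints, e.g. 'active' or 'failed'."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        # systemctl is missing or cannot be executed on this host.
        return "unknown"
    return result.stdout.strip() or "unknown"


def module_loaded(name: str, proc_modules: str = "/proc/modules") -> bool:
    """Check whether a kernel module is listed in /proc/modules."""
    # The kernel lists modules with underscores: nvidia-peermem -> nvidia_peermem.
    wanted = name.replace("-", "_")
    try:
        with open(proc_modules) as fh:
            return any(line.split()[0] == wanted for line in fh if line.strip())
    except OSError:
        return False


def summarize(openibd: str, opensmd: str, peermem_loaded: bool) -> str:
    """Turn the three observations into a one-line (hypothetical) diagnosis."""
    if peermem_loaded:
        return "nvidia-peermem is loaded"
    if openibd != "active" and opensmd == "active":
        return "openibd.service is not active while opensmd.service is running"
    if openibd != "active":
        return "openibd.service is not active"
    return "openibd.service is active; try modprobe nvidia-peermem"


if __name__ == "__main__":
    print(summarize(
        unit_state("openibd.service"),
        unit_state("opensmd.service"),
        module_loaded("nvidia-peermem"),
    ))
```

If it reports that openibd.service is not active while opensmd.service is running, that matches what I saw before the reboot.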
Jeff (http://engineering.purdue.edu/~qobi)