Bug#1074350: nvidia-kernel-dkms: Trying to modprobe nvidia-peermem to use NCCL/RDMA/Infiniband with GPUs
    Jeffrey Mark Siskind 
    qobi at qobi.org
       
    Thu Jul  4 05:19:28 BST 2024
    
    
  
> So I don't know why the module doesn't load.
> Any ideas?
I figured it out. doca-ofed (aka MLNX_OFED) needs openibd.service to be
running. openibd.service failed to start because opensmd.service was
running. For some reason, stopping opensmd.service hung, so I rebooted,
and after that nvidia-peermem loaded.
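
For anyone hitting the same thing, here is a rough sketch of the check described above. It assumes a systemd host where `systemctl is-active` reports the unit state, and it only prints the commands it would suggest rather than running them. The helper names `unit_state` and `plan` are made up for illustration, and the ordering (stop opensmd.service, then start openibd.service, then modprobe) is my reading of the fix, not something documented by doca-ofed. On my machine the stop step hung and a reboot was needed instead.

```python
import subprocess


def unit_state(unit: str) -> str:
    """Return what `systemctl is-active` prints, e.g. 'active' or 'inactive'."""
    result = subprocess.run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
        check=False,  # is-active exits non-zero for units that are not active
    )
    return result.stdout.strip()


def plan(openibd: str, opensmd: str) -> list[str]:
    """Hypothetical helper: suggest commands based on the two unit states."""
    steps = []
    if openibd != "active":
        # Assumption from this thread: openibd.service would not start
        # while opensmd.service was running.
        if opensmd == "active":
            steps.append("systemctl stop opensmd.service")
        steps.append("systemctl start openibd.service")
    # nvidia-peermem loaded only once openibd.service was running.
    steps.append("modprobe nvidia-peermem")
    return steps


if __name__ == "__main__":
    states = (unit_state("openibd.service"), unit_state("opensmd.service"))
    for command in plan(*states):
        print(command)
```

If `systemctl stop opensmd.service` hangs as it did here, rebooting and then checking that openibd.service is active before running modprobe is what worked for me.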
    Jeff (http://engineering.purdue.edu/~qobi)
    
    