Bug#869079: Kernel NULL Pointer Dereference During cudaMalloc()

Alex Richman a.richman at gblabs.co.uk
Thu Jul 20 10:39:35 UTC 2017


Package: nvidia-driver

I've hit this bug only once so far, after running the same code for 
weeks, so it's pretty rare.  I'm not certain, but I believe it happened 
during a large cudaMalloc().
I don't have the hardware to reproduce it on a non-Debian system at the 
moment, so unfortunately I can't confirm whether the problem is specific 
to Debian.


Relevant package versions:
kernel: Linux DEV-AI 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2 (2017-04-30) x86_64 GNU/Linux
nvidia-375: 375.26-0ubuntu1
cuda: 8.0.61-1
libcuda: 375.26-0ubuntu1
cudnn: 8.0-linux-x64-v5.1


Hardware:
motherboard: Supermicro X10SRL-F
cpu: Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz
gpu: MSI GeForce GTX 1080


Syslog:
Jul 20 10:26:19 DEV-AI kernel: [86556.439096] nvidia 0000:03:00.0: irq 
86 for MSI/MSI-X
Jul 20 10:26:20 DEV-AI kernel: [86557.041036] BUG: unable to handle 
kernel NULL pointer dereference at 0000000000000010
Jul 20 10:26:20 DEV-AI kernel: [86557.041062] IP: [<ffffffffa06ef6c5>] 
nv_kthread_q_init+0x85/0xb0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.041091] PGD 463a3a067 PUD 
463bcc067 PMD 0
Jul 20 10:26:20 DEV-AI kernel: [86557.041105] Oops: 0002 [#1] SMP
Jul 20 10:26:20 DEV-AI kernel: [86557.041115] Modules linked in: 
nvidia_uvm(PO) binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs 
lockd fscache sunrpc joydev hid_generic usbhid hid snd_hda_codec_hdmi 
nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) x86_pkg_temp_thermal 
coretemp kvm_intel snd_hda_intel kvm snd_hda_controller snd_hda_codec 
snd_hwdep snd_pcm ast iTCO_wdt crc32_pclmul iTCO_vendor_support ttm 
snd_timer evdev drm_kms_helper snd aesni_intel soundcore aes_x86_64 drm 
lrw gf128mul glue_helper mei_me ablk_helper lpc_ich cryptd pcspkr mei 
mfd_core shpchp wmi tpm_tis tpm ipmi_si ipmi_msghandler processor 
thermal_sys acpi_power_meter button acpi_pad fuse autofs4 ext4 crc16 
mbcache jbd2 sg sd_mod crc_t10dif crct10dif_generic ahci igb libahci 
i2c_algo_bit i2c_i801 dca ptp crct10dif_pclmul crct10dif_common 
crc32c_intel ehci_pci xhci_hcd libata ehci_hcd usbcore i2c_core scsi_mod 
pps_core usb_common
Jul 20 10:26:20 DEV-AI kernel: [86557.041363] CPU: 1 PID: 11827 Comm: 
telescope Tainted: P           O  3.16.0-4-amd64 #1 Debian 3.16.43-2
Jul 20 10:26:20 DEV-AI kernel: [86557.041386] Hardware name: Supermicro 
Super Server/X10SRL-F, BIOS 2.0a 08/01/2016
Jul 20 10:26:20 DEV-AI kernel: [86557.041405] task: ffff88045943a210 ti: 
ffff88046b894000 task.ti: ffff88046b894000
Jul 20 10:26:20 DEV-AI kernel: [86557.041423] RIP: 
0010:[<ffffffffa06ef6c5>]  [<ffffffffa06ef6c5>] 
nv_kthread_q_init+0x85/0xb0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.041451] RSP: 
0018:ffff88046b897c38  EFLAGS: 00010206
Jul 20 10:26:20 DEV-AI kernel: [86557.041464] RAX: fffffffffffffffc RBX: 
ffff8804537564a8 RCX: ffff88046b897bc0
Jul 20 10:26:20 DEV-AI kernel: [86557.041481] RDX: ffff88046b897bb0 RSI: 
0000000000000246 RDI: ffff88046b897bb8
Jul 20 10:26:20 DEV-AI kernel: [86557.041498] RBP: fffffffffffffffc R08: 
0000000000200008 R09: 0000000000000001
Jul 20 10:26:20 DEV-AI kernel: [86557.041515] R10: 000000000000023e R11: 
0000000000000010 R12: ffff880453756024
Jul 20 10:26:20 DEV-AI kernel: [86557.041532] R13: ffff880453756000 R14: 
ffff880453756240 R15: ffff880453756308
Jul 20 10:26:20 DEV-AI kernel: [86557.041550] FS: 00007fc63f0b1a00(0000) 
GS:ffff88047fc40000(0000) knlGS:0000000000000000
Jul 20 10:26:20 DEV-AI kernel: [86557.041569] CS:  0010 DS: 0000 ES: 
0000 CR0: 0000000080050033
Jul 20 10:26:20 DEV-AI kernel: [86557.041583] CR2: 0000000000000010 CR3: 
000000046a8dd000 CR4: 00000000003407e0
Jul 20 10:26:20 DEV-AI kernel: [86557.041601] DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000
Jul 20 10:26:20 DEV-AI kernel: [86557.042124] DR3: 0000000000000000 DR6: 
00000000fffe0ff0 DR7: 0000000000000400
Jul 20 10:26:20 DEV-AI kernel: [86557.042637] Stack:
Jul 20 10:26:20 DEV-AI kernel: [86557.043148]  0000000000000000 
ffff88046b897cf8 ffffffffa0705822 0000000000000890
Jul 20 10:26:20 DEV-AI kernel: [86557.043667]  0000000200000884 
ffff88046e6ab000 0000000000000000 00007fffffffffff
Jul 20 10:26:20 DEV-AI kernel: [86557.044176]  ffffff0031316ed8 
ffffffff811a30f1 472d4d565591b9d0 64323739342d5550
Jul 20 10:26:20 DEV-AI kernel: [86557.044696] Call Trace:
Jul 20 10:26:20 DEV-AI kernel: [86557.045221] [<ffffffffa0705822>] ? 
add_gpu+0x9a2/0xbc0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.045760] [<ffffffff811a30f1>] ? 
mem_cgroup_update_page_stat+0x11/0x40
Jul 20 10:26:20 DEV-AI kernel: [86557.046310] [<ffffffffa0705a83>] ? 
uvm_gpu_retain_by_uuid_locked+0x43/0x50 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.046873] [<ffffffffa0705ab4>] ? 
uvm_gpu_retain_by_uuid+0x24/0x40 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.047440] [<ffffffffa070715d>] ? 
uvm_va_space_register_gpu+0x1d/0x180 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.048016] [<ffffffffa06ff219>] ? 
uvm_unlocked_ioctl+0x499/0xdf0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.048597] [<ffffffff8116c13c>] ? 
handle_mm_fault+0x63c/0x11c0
Jul 20 10:26:20 DEV-AI kernel: [86557.049185] [<ffffffff81058321>] ? 
__do_page_fault+0x1d1/0x4f0
Jul 20 10:26:20 DEV-AI kernel: [86557.049776] [<ffffffff810c9302>] ? 
get_monotonic_boottime+0x42/0xf0
Jul 20 10:26:20 DEV-AI kernel: [86557.050373] [<ffffffff811bd5bf>] ? 
do_vfs_ioctl+0x2cf/0x4b0
Jul 20 10:26:20 DEV-AI kernel: [86557.050975] [<ffffffff8107bcb0>] ? 
SYSC_sysinfo+0x20/0x40
Jul 20 10:26:20 DEV-AI kernel: [86557.051581] [<ffffffff811bd821>] ? 
SyS_ioctl+0x81/0xa0
Jul 20 10:26:20 DEV-AI kernel: [86557.052188] [<ffffffff8151c428>] ? 
page_fault+0x28/0x30
Jul 20 10:26:20 DEV-AI kernel: [86557.052771] [<ffffffff8151a40d>] ? 
system_call_fast_compare_end+0x10/0x15
Jul 20 10:26:20 DEV-AI kernel: [86557.053362] Code: c0 c7 43 18 00 00 00 
00 c7 43 1c 00 00 00 00 e8 42 9c 99 e0 48 3d 00 f0 ff ff 48 89 c5 77 08 
48 89 c7 e8 5f 8c 9a e0 48 89 6b 38 <81> 4d 14 00 80 00 00 31 c0 48 8b 
53 38 48 81 fa 00 f0 ff ff 77
Jul 20 10:26:20 DEV-AI kernel: [86557.054642] RIP [<ffffffffa06ef6c5>] 
nv_kthread_q_init+0x85/0xb0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.055272]  RSP <ffff88046b897c38>
Jul 20 10:26:20 DEV-AI kernel: [86557.055931] CR2: 0000000000000010
Jul 20 10:26:20 DEV-AI kernel: [86557.057013] ------------[ cut here 
]------------
Jul 20 10:26:20 DEV-AI kernel: [86557.057655] kernel BUG at 
/build/linux-2PgYgH/linux-3.16.43/arch/x86/mm/pageattr.c:216!
Jul 20 10:26:20 DEV-AI kernel: [86557.058267] invalid opcode: 0000 [#2] SMP
Jul 20 10:26:20 DEV-AI kernel: [86557.058864] Modules linked in: 
nvidia_uvm(PO) binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs 
lockd fscache sunrpc joydev hid_generic usbhid hid snd_hda_codec_hdmi 
nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) x86_pkg_temp_thermal 
coretemp kvm_intel snd_hda_intel kvm snd_hda_controller snd_hda_codec 
snd_hwdep snd_pcm ast iTCO_wdt crc32_pclmul iTCO_vendor_support ttm 
snd_timer evdev drm_kms_helper snd aesni_intel soundcore aes_x86_64 drm 
lrw gf128mul glue_helper mei_me ablk_helper lpc_ich cryptd pcspkr mei 
mfd_core shpchp wmi tpm_tis tpm ipmi_si ipmi_msghandler processor 
thermal_sys acpi_power_meter button acpi_pad fuse autofs4 ext4 crc16 
mbcache jbd2 sg sd_mod crc_t10dif crct10dif_generic ahci igb libahci 
i2c_algo_bit i2c_i801 dca ptp crct10dif_pclmul crct10dif_common 
crc32c_intel ehci_pci xhci_hcd libata ehci_hcd usbcore i2c_core scsi_mod 
pps_core usb_common
Jul 20 10:26:20 DEV-AI kernel: [86557.063258] CPU: 1 PID: 11827 Comm: 
telescope Tainted: P           O  3.16.0-4-amd64 #1 Debian 3.16.43-2
Jul 20 10:26:20 DEV-AI kernel: [86557.063909] Hardware name: Supermicro 
Super Server/X10SRL-F, BIOS 2.0a 08/01/2016
Jul 20 10:26:20 DEV-AI kernel: [86557.064584] task: ffff88045943a210 ti: 
ffff88046b894000 task.ti: ffff88046b894000
Jul 20 10:26:20 DEV-AI kernel: [86557.065197] RIP: 
0010:[<ffffffff8105b0ab>]  [<ffffffff8105b0ab>] 
change_page_attr_set_clr+0x43b/0x440
Jul 20 10:26:20 DEV-AI kernel: [86557.065823] RSP: 
0018:ffff88046b897118  EFLAGS: 00010046
Jul 20 10:26:20 DEV-AI kernel: [86557.066446] RAX: 0000000000000046 RBX: 
0000000000000000 RCX: ffff88046b897130
Jul 20 10:26:20 DEV-AI kernel: [86557.067077] RDX: 0000000000000000 RSI: 
0000000000000000 RDI: 0000000080000000
Jul 20 10:26:20 DEV-AI kernel: [86557.067724] RBP: 0000000000000000 R08: 
0000000000000001 R09: ffff880000000000
Jul 20 10:26:20 DEV-AI kernel: [86557.068404] R10: ffff88046ee6e7d8 R11: 
0000000000000000 R12: 0000000000000010
Jul 20 10:26:20 DEV-AI kernel: [86557.069039] R13: 0000000000000004 R14: 
0000000000000005 R15: 0000000000000010
Jul 20 10:26:20 DEV-AI kernel: [86557.069647] FS: 00007fc63f0b1a00(0000) 
GS:ffff88047fc40000(0000) knlGS:0000000000000000
Jul 20 10:26:20 DEV-AI kernel: [86557.070260] CS:  0010 DS: 0000 ES: 
0000 CR0: 0000000080050033
Jul 20 10:26:20 DEV-AI kernel: [86557.070876] CR2: 0000000000000010 CR3: 
000000046a8dd000 CR4: 00000000003407e0
Jul 20 10:26:20 DEV-AI kernel: [86557.071508] DR0: 0000000000000000 DR1: 
0000000000000000 DR2: 0000000000000000
Jul 20 10:26:20 DEV-AI kernel: [86557.072127] DR3: 0000000000000000 DR6: 
00000000fffe0ff0 DR7: 0000000000000400
Jul 20 10:26:20 DEV-AI kernel: [86557.072736] Stack:
Jul 20 10:26:20 DEV-AI kernel: [86557.073336]  0000000000000000 
0000000000000200 0000000000000000 0000000000000000
Jul 20 10:26:20 DEV-AI kernel: [86557.073953]  0000000000000000 
0000000000000010 0000000000000000 0000000000000001
Jul 20 10:26:20 DEV-AI kernel: [86557.074558]  0000000000000005 
000000000006c2b0 0000020000000000 ffff88046d875000
Jul 20 10:26:20 DEV-AI kernel: [86557.075158] Call Trace:
Jul 20 10:26:20 DEV-AI kernel: [86557.075767] [<ffffffff8105b3ed>] ? 
_set_pages_array+0xfd/0x150
Jul 20 10:26:20 DEV-AI kernel: [86557.076366] [<ffffffffa035ed77>] ? 
ttm_set_pages_caching+0x27/0x60 [ttm]
Jul 20 10:26:20 DEV-AI kernel: [86557.076967] [<ffffffffa035ee6a>] ? 
ttm_alloc_new_pages.isra.6+0xba/0x190 [ttm]
Jul 20 10:26:20 DEV-AI kernel: [86557.077569] [<ffffffffa035f7c2>] ? 
ttm_pool_populate+0x3e2/0x4f0 [ttm]
Jul 20 10:26:20 DEV-AI kernel: [86557.078167] [<ffffffffa035c527>] ? 
ttm_bo_move_memcpy+0x5b7/0x650 [ttm]
Jul 20 10:26:20 DEV-AI kernel: [86557.078763] [<ffffffff8117877a>] ? 
map_vm_area+0x2a/0x40
Jul 20 10:26:20 DEV-AI kernel: [86557.079360] [<ffffffffa0359cb6>] ? 
ttm_bo_handle_move_mem+0x266/0x5c0 [ttm]
Jul 20 10:26:20 DEV-AI kernel: [86557.080056] [<ffffffffa0b5a053>] ? 
_nv003119rm+0xe3/0x420 [nvidia]
Jul 20 10:26:20 DEV-AI kernel: [86557.080658] [<ffffffffa035a667>] ? 
ttm_bo_mem_space+0x107/0x320 [ttm]
Jul 20 10:26:20 DEV-AI kernel: [86557.081262] [<ffffffffa035ad19>] ? 
ttm_bo_validate+0x1f9/0x210 [ttm]
Jul 20 10:26:20 DEV-AI kernel: [86557.081867] [<ffffffffa03c41f8>] ? 
ast_bo_push_sysram+0x78/0xd0 [ast]
Jul 20 10:26:20 DEV-AI kernel: [86557.082474] [<ffffffffa03c1dfd>] ? 
ast_crtc_do_set_base.isra.14.constprop.24+0x6d/0x310 [ast]
Jul 20 10:26:20 DEV-AI kernel: [86557.083087] [<ffffffffa03c0261>] ? 
ast_set_index_reg_mask+0x41/0x70 [ast]
Jul 20 10:26:20 DEV-AI kernel: [86557.083689] [<ffffffffa03c2b57>] ? 
ast_crtc_mode_set+0xab7/0xc10 [ast]
Jul 20 10:26:20 DEV-AI kernel: [86557.084281] [<ffffffffa02c28d9>] ? 
drm_crtc_helper_set_mode+0x2e9/0x520 [drm_kms_helper]
Jul 20 10:26:20 DEV-AI kernel: [86557.084913] [<ffffffffa02c3678>] ? 
drm_crtc_helper_set_config+0x8a8/0xad0 [drm_kms_helper]
Jul 20 10:26:20 DEV-AI kernel: [86557.085502] [<ffffffffa038fc91>] ? 
drm_mode_set_config_internal+0x61/0xe0 [drm]
Jul 20 10:26:20 DEV-AI kernel: [86557.086091] [<ffffffffa02c616b>] ? 
drm_fb_helper_pan_display+0x8b/0xe0 [drm_kms_helper]
Jul 20 10:26:20 DEV-AI kernel: [86557.086685] [<ffffffff813148e1>] ? 
fb_pan_display+0xb1/0x170
Jul 20 10:26:20 DEV-AI kernel: [86557.087271] [<ffffffff8130f3da>] ? 
bit_update_start+0x1a/0x40
Jul 20 10:26:20 DEV-AI kernel: [86557.087874] [<ffffffff8130eddd>] ? 
fbcon_switch+0x37d/0x520
Jul 20 10:26:20 DEV-AI kernel: [86557.088478] [<ffffffff8137f437>] ? 
redraw_screen+0x177/0x230
Jul 20 10:26:20 DEV-AI kernel: [86557.089048] [<ffffffff81314acf>] ? 
fb_blank+0x9f/0xc0
Jul 20 10:26:20 DEV-AI kernel: [86557.089576] [<ffffffff8130c3da>] ? 
fbcon_blank+0x1fa/0x2b0
Jul 20 10:26:20 DEV-AI kernel: [86557.090088] [<ffffffff810ba718>] ? 
console_unlock+0x268/0x440
Jul 20 10:26:20 DEV-AI kernel: [86557.090598] [<ffffffff810ba4a0>] ? 
wake_up_klogd+0x30/0x40
Jul 20 10:26:20 DEV-AI kernel: [86557.091087] [<ffffffff810740d6>] ? 
lock_timer_base.isra.35+0x26/0x50
Jul 20 10:26:20 DEV-AI kernel: [86557.091554] [<ffffffff8107391a>] ? 
internal_add_timer+0x2a/0x70
Jul 20 10:26:20 DEV-AI kernel: [86557.092007] [<ffffffff81075c25>] ? 
mod_timer+0xf5/0x200
Jul 20 10:26:20 DEV-AI kernel: [86557.092448] [<ffffffff8137ff21>] ? 
do_unblank_screen+0xb1/0x1d0
Jul 20 10:26:20 DEV-AI kernel: [86557.092885] [<ffffffff812bcab5>] ? 
bust_spinlocks+0x15/0x30
Jul 20 10:26:20 DEV-AI kernel: [86557.093310] [<ffffffff8101723f>] ? 
oops_end+0x2f/0xe0
Jul 20 10:26:20 DEV-AI kernel: [86557.093724] [<ffffffff81510ac8>] ? 
no_context+0x2b2/0x2be
Jul 20 10:26:20 DEV-AI kernel: [86557.094134] [<ffffffff810581d0>] ? 
__do_page_fault+0x80/0x4f0
Jul 20 10:26:20 DEV-AI kernel: [86557.094602] [<ffffffffa0d9e498>] ? 
_nv012156rm+0x8/0x40 [nvidia]
Jul 20 10:26:20 DEV-AI kernel: [86557.095018] [<ffffffff8109d410>] ? 
select_task_rq_fair+0x390/0x700
Jul 20 10:26:20 DEV-AI kernel: [86557.095436] [<ffffffff8109fb1f>] ? 
enqueue_task_fair+0x2cf/0xe20
Jul 20 10:26:20 DEV-AI kernel: [86557.095854] [<ffffffff81095bd5>] ? 
check_preempt_curr+0x75/0xa0
Jul 20 10:26:20 DEV-AI kernel: [86557.096267] [<ffffffff81095c14>] ? 
ttwu_do_wakeup+0x14/0xf0
Jul 20 10:26:20 DEV-AI kernel: [86557.096673] [<ffffffff81517b06>] ? 
wait_for_completion_killable+0x26/0x180
Jul 20 10:26:20 DEV-AI kernel: [86557.097079] [<ffffffff8151c428>] ? 
page_fault+0x28/0x30
Jul 20 10:26:20 DEV-AI kernel: [86557.097487] [<ffffffffa06ef6c5>] ? 
nv_kthread_q_init+0x85/0xb0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.097906] [<ffffffffa06ef6ae>] ? 
nv_kthread_q_init+0x6e/0xb0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.098314] [<ffffffffa0705822>] ? 
add_gpu+0x9a2/0xbc0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.098715] [<ffffffff811a30f1>] ? 
mem_cgroup_update_page_stat+0x11/0x40
Jul 20 10:26:20 DEV-AI kernel: [86557.099127] [<ffffffffa0705a83>] ? 
uvm_gpu_retain_by_uuid_locked+0x43/0x50 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.099548] [<ffffffffa0705ab4>] ? 
uvm_gpu_retain_by_uuid+0x24/0x40 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.099963] [<ffffffffa070715d>] ? 
uvm_va_space_register_gpu+0x1d/0x180 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.100377] [<ffffffffa06ff219>] ? 
uvm_unlocked_ioctl+0x499/0xdf0 [nvidia_uvm]
Jul 20 10:26:20 DEV-AI kernel: [86557.100789] [<ffffffff8116c13c>] ? 
handle_mm_fault+0x63c/0x11c0
Jul 20 10:26:20 DEV-AI kernel: [86557.101200] [<ffffffff81058321>] ? 
__do_page_fault+0x1d1/0x4f0
Jul 20 10:26:20 DEV-AI kernel: [86557.101620] [<ffffffff810c9302>] ? 
get_monotonic_boottime+0x42/0xf0
Jul 20 10:26:20 DEV-AI kernel: [86557.102030] [<ffffffff811bd5bf>] ? 
do_vfs_ioctl+0x2cf/0x4b0
Jul 20 10:26:20 DEV-AI kernel: [86557.102435] [<ffffffff8107bcb0>] ? 
SYSC_sysinfo+0x20/0x40
Jul 20 10:26:20 DEV-AI kernel: [86557.102837] [<ffffffff811bd821>] ? 
SyS_ioctl+0x81/0xa0
Jul 20 10:26:20 DEV-AI kernel: [86557.103234] [<ffffffff8151c428>] ? 
page_fault+0x28/0x30
Jul 20 10:26:20 DEV-AI kernel: [86557.103630] [<ffffffff8151a40d>] ? 
system_call_fast_compare_end+0x10/0x15
Jul 20 10:26:20 DEV-AI kernel: [86557.104074] Code: 87 00 01 44 8b 44 24 
0c 48 8b 0c 24 e9 99 fc ff ff 0f 0b 0f 0b be ba 00 00 00 48 c7 c7 90 50 
71 81 e8 8a d8 00 00 e9 1b ff ff ff <0f> 0b 0f 1f 00 0f 1f 44 00 00 41 
57 49 89 ff 41 56 41 55 41 54
Jul 20 10:26:20 DEV-AI kernel: [86557.105016] RIP [<ffffffff8105b0ab>] 
change_page_attr_set_clr+0x43b/0x440
Jul 20 10:26:20 DEV-AI kernel: [86557.105442]  RSP <ffff88046b897118>
Jul 20 10:26:20 DEV-AI kernel: [86557.105862] ---[ end trace 
e3ea8d0c2705a799 ]---
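
As a rough cross-check of the first oops (a sketch, not a diagnosis), the 
register dump can be tied to the faulting address with a little arithmetic. 
Two things here are assumptions rather than anything stated in the log: that 
the bytes at the faulting IP, "81 4d 14 00 80 00 00", decode as a 32-bit OR of 
an immediate into memory at [rbp + 0x14], and that RBP holds a negative errno 
encoded as a pointer. The helper name to_signed64 is made up for illustration.

```python
import errno

MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Interpret a 64-bit register value as a two's-complement integer."""
    return value - (1 << 64) if value & (1 << 63) else value


# Values copied from the first oops in the syslog above.
rbp = 0xFFFFFFFFFFFFFFFC
cr2 = 0x0000000000000010
oops_error_code = 0x0002

# Assumption: opcode 0x81 with ModRM 0x4d is "OR r/m32, imm32" addressed as
# [rbp + disp8], so byte 2 is the displacement and bytes 3..6 the immediate.
fault_bytes = bytes.fromhex("814d1400800000")
displacement = fault_bytes[2]
immediate = int.from_bytes(fault_bytes[3:7], "little")

# Address the instruction would touch, wrapped to 64 bits.
fault_address = (rbp + displacement) & MASK64
rbp_signed = to_signed64(rbp)

# x86 page-fault error code bits: 0 = page was present, 1 = write, 2 = user mode.
page_present = bool(oops_error_code & 0x1)
was_write = bool(oops_error_code & 0x2)
user_mode = bool(oops_error_code & 0x4)

print(f"RBP as signed integer : {rbp_signed}")
print(f"errno name for {-rbp_signed}      : {errno.errorcode.get(-rbp_signed, 'unknown')}")
print(f"RBP + {displacement:#x}            : {fault_address:#018x}")
print(f"matches CR2           : {fault_address == cr2}")
print(f"immediate operand     : {immediate:#x}")
print(f"write={was_write} user={user_mode} present={page_present}")
```

If that decoding is right, the fault at 0x10 is what you would get from 
writing through a pointer whose value is -4, which would point at a missing 
error check inside nv_kthread_q_init rather than at cudaMalloc() itself. 
That is only a reading of the numbers; I have not confirmed it against the 
driver source.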


The only other relevant thing I can think of is that the nvidia-smi 
output froze on this:
+-----------------------------------------------------------------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:03:00.0     Off |                  N/A |
|  0%   25C    P8     8W / 240W |      2MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


Thanks,
- Alex.

-- 
Alex Richman
Software Development
GB Labs
2 Orpheus House,
Calleva park,
Reading
RG7 8TA
Tel:+44 (0)118 455 5000
www.gblabs.com


