Bug#1088372: ppc64el: kernel panic: corrupted stack end detected inside scheduler after upgrade of nvidia-kernel-dkms
Corentin Labbe
clabbe.montjoie at gmail.com
Wed Nov 27 15:07:58 GMT 2024
Package: nvidia-kernel-dkms
Version: 535.183.01-1~deb12u1
Upgrading my bookworm ppc64el machine led to an unsuccessful boot:
[ 4.376878] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[ 4.376927] CPU: 5 PID: 797 Comm: nvidia-persiste Tainted: P O 6.1.0-28-powerpc64le #1 Debian 6.1.119-1
[ 4.376972] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1203 opal:skiboot-9858186 PowerNV
[ 4.377006] Call Trace:
[ 4.377037] [c00020000fa3af10] [c000000000d7a2ec] dump_stack_lvl+0x70/0xa0 (unreliable)
[ 4.377083] [c00020000fa3af40] [c000000000137c68] panic+0x16c/0x444
[ 4.377131] [c00020000fa3afd0] [c000000000daf00c] __schedule+0xbfc/0xc00
[ 4.377179] [c00020000fa3b0b0] [c000000000daf084] schedule+0x74/0x140
[ 4.377212] [c00020000fa3b120] [c000000000db7e34] schedule_timeout+0xb4/0x1c0
[ 4.377271] [c00020000fa3b1f0] [c008000015544b50] os_delay+0x168/0x3e0 [nvidia]
[ 4.377768] [c00020000fa3b2a0] [c0080000164294e0] _nv038220rm+0x20/0x50 [nvidia]
[ 4.378313] [c00020000fa3b2d0] [c0080000155ec704] _nv038438rm+0x264/0x530 [nvidia]
[ 4.378821] [c00020000fa3b3c0] [c00800001643da08] _nv000655rm+0x13a8/0x2170 [nvidia]
[ 4.379309] [c00020000fa3b550] [c008000016432ff4] rm_init_adapter+0x114/0x130 [nvidia]
[ 4.379799] [c00020000fa3b640] [c0080000155308a4] nv_open_device+0x60c/0xa40 [nvidia]
[ 4.380099] [c00020000fa3b6f0] [c0080000155320e0] nvidia_open+0x2e8/0x570 [nvidia]
[ 4.380389] [c00020000fa3b7a0] [c00800001554cde8] nvidia_frontend_open+0xa0/0x120 [nvidia]
[ 4.380705] [c00020000fa3b7f0] [c000000000530ac8] chrdev_open+0x178/0x3b0
[ 4.380734] [c00020000fa3b860] [c000000000521130] do_dentry_open+0x290/0x560
[ 4.380784] [c00020000fa3b8b0] [c000000000544460] path_openat+0xb90/0x15a0
[ 4.380803] [c00020000fa3b9a0] [c000000000545b98] do_filp_open+0xc8/0x1a0
[ 4.380821] [c00020000fa3bad0] [c000000000523e40] do_sys_openat2+0x110/0x230
[ 4.380862] [c00020000fa3bb40] [c000000000524378] sys_openat+0x88/0xe0
[ 4.380887] [c00020000fa3bba0] [c00000000002b038] system_call_exception+0x138/0x260
[ 4.380900] [c00020000fa3be10] [c00000000000c0f0] system_call_vectored_common+0xf0/0x280
[ 4.380938] --- interrupt: 3000 at 0x7fff92b3a948
[ 4.380963] NIP: 00007fff92b3a948 LR: 00007fff92b3a948 CTR: 0000000000000000
[ 4.380980] REGS: c00020000fa3be80 TRAP: 3000 Tainted: P O (6.1.0-28-powerpc64le Debian 6.1.119-1)
[ 4.381020] MSR: 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44004448 XER: 00000000
[ 4.381091] IRQMASK: 0
[ 4.381091] GPR00: 000000000000011e 00007ffffd5dc360 00007fff92c56f00 ffffffffffffff9c
[ 4.381091] GPR04: 00007ffffd5dc400 0000000000080002 0000000000000000 0000000000000030
[ 4.381091] GPR08: 00007fff928e01b8 0000000000000000 0000000000000000 0000000000000000
[ 4.381091] GPR12: 0000000000000000 00007fff92f4dc40 0000000000000000 0000000000000000
[ 4.381091] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 4.381091] GPR20: 0000000000000000 0000000000000000 0000000000000000 000000000000ffff
[ 4.381091] GPR24: 00007fff92903398 0000000000000000 00007fff92902a48 00007fff92902a48
[ 4.381091] GPR28: 00007ffffd5dc53c 0000000000000000 0000000000080002 00007fff92903398
[ 4.381305] NIP [00007fff92b3a948] 0x7fff92b3a948
[ 4.381334] LR [00007fff92b3a948] 0x7fff92b3a948
[ 4.381352] --- interrupt: 3000
[ 4.415728] bridge: filt
This happens with both 6.1.0-28-powerpc64le and 6.1.0-21-powerpc64le.
Removing /lib/modules/6.1.0-28-powerpc64le/updates/dkms/* was needed to recover.
The system was working perfectly with the previous version, 525.147.05-7~deb12u1.
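For triage, here is a minimal sketch of how the first column of the call trace (the stack pointer of each frame) could be parsed to see how much kernel stack the listed frames span. The names `parse_frames` and `stack_span` are hypothetical helpers, and only a subset of the frames from the trace is copied in. The result describes the stack at the moment of the panic only; it does not show where the stack end was actually overwritten.

```python
import re

# A subset of the frame lines from the call trace (timestamps dropped).
TRACE = """\
[c00020000fa3af10] [c000000000d7a2ec] dump_stack_lvl+0x70/0xa0 (unreliable)
[c00020000fa3b1f0] [c008000015544b50] os_delay+0x168/0x3e0 [nvidia]
[c00020000fa3b7a0] [c00800001554cde8] nvidia_frontend_open+0xa0/0x120 [nvidia]
[c00020000fa3be10] [c00000000000c0f0] system_call_vectored_common+0xf0/0x280
"""

# [stack pointer] [return address] symbol+offset/size, optionally followed by [module]
FRAME_RE = re.compile(
    r"\[([0-9a-f]{16})\] \[([0-9a-f]{16})\] (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(\w+)\])?"
)


def parse_frames(text):
    """Return a list of (stack_pointer, symbol, module_or_None) tuples."""
    frames = []
    for match in FRAME_RE.finditer(text):
        sp_hex, _return_addr, symbol, module = match.groups()
        frames.append((int(sp_hex, 16), symbol, module))
    return frames


def stack_span(frames):
    """Bytes between the lowest and highest stack pointer among the frames."""
    pointers = [sp for sp, _symbol, _module in frames]
    return max(pointers) - min(pointers)


if __name__ == "__main__":
    frames = parse_frames(TRACE)
    for sp, symbol, module in frames:
        print(f"{sp:#018x}  {symbol}  {module or 'kernel'}")
    print("span in bytes:", stack_span(frames))  # 3840 for the subset above
```

Feeding the full trace instead of the subset gives the per-frame stack pointers for every nvidia frame, which can then be compared against the stack size the kernel was built with.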
Regards