Bug#1088372: ppc64el: kernel panic: corrupted stack end detected inside scheduler after upgrade of nvidia-kernel-dkms

Corentin Labbe clabbe.montjoie at gmail.com
Wed Nov 27 15:07:58 GMT 2024


Package: nvidia-kernel-dkms
Version: 535.183.01-1~deb12u1

Upgrading my bookworm ppc64el machine led to an unsuccessful boot:
[    4.376878] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[    4.376927] CPU: 5 PID: 797 Comm: nvidia-persiste Tainted: P           O       6.1.0-28-powerpc64le #1  Debian 6.1.119-1
[    4.376972] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1203 opal:skiboot-9858186 PowerNV
[    4.377006] Call Trace:
[    4.377037] [c00020000fa3af10] [c000000000d7a2ec] dump_stack_lvl+0x70/0xa0 (unreliable)
[    4.377083] [c00020000fa3af40] [c000000000137c68] panic+0x16c/0x444
[    4.377131] [c00020000fa3afd0] [c000000000daf00c] __schedule+0xbfc/0xc00
[    4.377179] [c00020000fa3b0b0] [c000000000daf084] schedule+0x74/0x140
[    4.377212] [c00020000fa3b120] [c000000000db7e34] schedule_timeout+0xb4/0x1c0
[    4.377271] [c00020000fa3b1f0] [c008000015544b50] os_delay+0x168/0x3e0 [nvidia]
[    4.377768] [c00020000fa3b2a0] [c0080000164294e0] _nv038220rm+0x20/0x50 [nvidia]
[    4.378313] [c00020000fa3b2d0] [c0080000155ec704] _nv038438rm+0x264/0x530 [nvidia]
[    4.378821] [c00020000fa3b3c0] [c00800001643da08] _nv000655rm+0x13a8/0x2170 [nvidia]
[    4.379309] [c00020000fa3b550] [c008000016432ff4] rm_init_adapter+0x114/0x130 [nvidia]
[    4.379799] [c00020000fa3b640] [c0080000155308a4] nv_open_device+0x60c/0xa40 [nvidia]
[    4.380099] [c00020000fa3b6f0] [c0080000155320e0] nvidia_open+0x2e8/0x570 [nvidia]
[    4.380389] [c00020000fa3b7a0] [c00800001554cde8] nvidia_frontend_open+0xa0/0x120 [nvidia]
[    4.380705] [c00020000fa3b7f0] [c000000000530ac8] chrdev_open+0x178/0x3b0
[    4.380734] [c00020000fa3b860] [c000000000521130] do_dentry_open+0x290/0x560
[    4.380784] [c00020000fa3b8b0] [c000000000544460] path_openat+0xb90/0x15a0
[    4.380803] [c00020000fa3b9a0] [c000000000545b98] do_filp_open+0xc8/0x1a0
[    4.380821] [c00020000fa3bad0] [c000000000523e40] do_sys_openat2+0x110/0x230
[    4.380862] [c00020000fa3bb40] [c000000000524378] sys_openat+0x88/0xe0
[    4.380887] [c00020000fa3bba0] [c00000000002b038] system_call_exception+0x138/0x260
[    4.380900] [c00020000fa3be10] [c00000000000c0f0] system_call_vectored_common+0xf0/0x280
[    4.380938] --- interrupt: 3000 at 0x7fff92b3a948
[    4.380963] NIP:  00007fff92b3a948 LR: 00007fff92b3a948 CTR: 0000000000000000
[    4.380980] REGS: c00020000fa3be80 TRAP: 3000   Tainted: P           O        (6.1.0-28-powerpc64le Debian 6.1.119-1)
[    4.381020] MSR:  900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 44004448  XER: 00000000
[    4.381091] IRQMASK: 0 
[    4.381091] GPR00: 000000000000011e 00007ffffd5dc360 00007fff92c56f00 ffffffffffffff9c 
[    4.381091] GPR04: 00007ffffd5dc400 0000000000080002 0000000000000000 0000000000000030 
[    4.381091] GPR08: 00007fff928e01b8 0000000000000000 0000000000000000 0000000000000000 
[    4.381091] GPR12: 0000000000000000 00007fff92f4dc40 0000000000000000 0000000000000000 
[    4.381091] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[    4.381091] GPR20: 0000000000000000 0000000000000000 0000000000000000 000000000000ffff 
[    4.381091] GPR24: 00007fff92903398 0000000000000000 00007fff92902a48 00007fff92902a48 
[    4.381091] GPR28: 00007ffffd5dc53c 0000000000000000 0000000000080002 00007fff92903398 
[    4.381305] NIP [00007fff92b3a948] 0x7fff92b3a948
[    4.381334] LR [00007fff92b3a948] 0x7fff92b3a948
[    4.381352] --- interrupt: 3000
[    4.415728] bridge: filt

This happens with both 6.1.0-28-powerpc64le and 6.1.0-21-powerpc64le.
Removing /lib/modules/6.1.0-28-powerpc64le/updates/dkms/* was needed to recover.

The system was working perfectly with the previous version, 525.147.05-7~deb12u1.
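
As a triage aid, the stack-pointer column of the call trace above can be turned into approximate per-frame sizes. This is only a sketch and not part of the original analysis: it assumes the default 16 KiB kernel stack for 64-bit powerpc (THREAD_SHIFT=14), the helper names are hypothetical, and attributing each gap to the function named on the same line is an approximation.

```python
import re

# A few frames copied verbatim from the call trace in this report.
TRACE = """\
[    4.377037] [c00020000fa3af10] [c000000000d7a2ec] dump_stack_lvl+0x70/0xa0 (unreliable)
[    4.377083] [c00020000fa3af40] [c000000000137c68] panic+0x16c/0x444
[    4.377131] [c00020000fa3afd0] [c000000000daf00c] __schedule+0xbfc/0xc00
[    4.377179] [c00020000fa3b0b0] [c000000000daf084] schedule+0x74/0x140
[    4.377212] [c00020000fa3b120] [c000000000db7e34] schedule_timeout+0xb4/0x1c0
[    4.377271] [c00020000fa3b1f0] [c008000015544b50] os_delay+0x168/0x3e0 [nvidia]
"""

# Matches "[<stack pointer>] [<return address>] symbol+0x..."
FRAME_RE = re.compile(r"\[([0-9a-f]{16})\] \[([0-9a-f]{16})\] (\S+?)\+0x")

# Assumption: 16 KiB kernel stacks (not stated anywhere in the report).
ASSUMED_THREAD_SIZE = 16 * 1024


def parse_frames(text):
    """Return a list of (stack_pointer, symbol) in trace order."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((int(m.group(1), 16), m.group(3)))
    return frames


def frame_sizes(frames):
    """Gap in bytes between each frame's stack pointer and the next one."""
    return [(sym, nxt_sp - sp)
            for (sp, sym), (nxt_sp, _) in zip(frames, frames[1:])]


def offset_above_stack_end(sp, thread_size=ASSUMED_THREAD_SIZE):
    """Bytes between the stack pointer and the lowest address of its stack."""
    return sp & (thread_size - 1)


if __name__ == "__main__":
    frames = parse_frames(TRACE)
    for sym, size in frame_sizes(frames):
        print(f"{sym:20s} {size:5d} bytes")
    print("lowest sp is", offset_above_stack_end(frames[0][0]),
          "bytes above the assumed stack end")
```

Under that 16 KiB assumption, the lowest stack pointer in the trace is still several kilobytes above the stack end, which may mean the end-of-stack marker was overwritten before the frames visible here. That reading is a guess, not a diagnosis.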

Regards


