Bug#908568: nvidia-driver: build error

Andreas Beckmann anbe at debian.org
Fri Sep 14 12:15:07 BST 2018


On 2018-09-12 10:18, Vincent Lefevre wrote:
> On 2018-09-11 19:15:25 -0700, Russ Allbery wrote:
>> If you set IGNORE_CC_MISMATCH=1 in the environment before installing the
>> package, does everything build and work correctly?
> 
> Yes, everything is fine.

That is probably very fragile. Upgrading from stretch/gcc-6 to
sid/gcc-6 also updates the rest of the toolchain (binutils ...).

I have two 396.45 modules, both built for an older stretch kernel (I
cannot reboot to the current one, but that should not matter): one
built in a mixed sid environment, which cannot be loaded, and a working
one built in a clean stretch.


# insmod ./sid/lib/modules/4.9.0-6-amd64/nvidia/nvidia-current.ko
insmod: ERROR: could not insert module ./sid/lib/modules/4.9.0-6-amd64/nvidia/nvidia-current.ko: Invalid module format

# dmesg | tail -n 1
[12261789.197235] module: nvidia: Unknown rela relocation: 4

# insmod ./stretch/lib/modules/4.9.0-6-amd64/nvidia/nvidia-current.ko

# dmesg | tail -n 3
[12261835.731804] nvidia-nvlink: Nvlink Core is being initialized, major device number 245
[12261835.732157] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=none,decodes=none:owns=io+mem
[12261835.732275] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  396.45  Thu Jul 12 20:49:29 PDT 2018 (using threaded interrupts)

Comparing the two modules with diffoscope shows this interesting
difference:

│ -Relocation section '.rela.text' at offset 0x122def8 contains 138066 entries:
│ +Relocation section '.rela.text' at offset 0x122de38 contains 138066 entries:
│      Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
│ -0000000000000001  00008fdb00000002 R_X86_64_PC32          0000000000000000 __fentry__ - 4
│ +0000000000000001  00008fdb00000004 R_X86_64_PLT32         0000000000000000 __fentry__ - 4
│  0000000000000015  00008f760000000b R_X86_64_32S           0000000000000000 nv_minor_num_table + 0
│ -0000000000000028  00008fed00000002 R_X86_64_PC32          0000000000000000 __x86_indirect_thunk_rax - 4
│ -0000000000000031  00008fdb00000002 R_X86_64_PC32          0000000000000000 __fentry__ - 4
│ +0000000000000028  00008fed00000004 R_X86_64_PLT32         0000000000000000 __x86_indirect_thunk_rax - 4
│ +0000000000000031  00008fdb00000004 R_X86_64_PLT32         0000000000000000 __fentry__ - 4
│  0000000000000048  00008f760000000b R_X86_64_32S           0000000000000000 nv_minor_num_table + 0
│ -000000000000006a  0000919e00000002 R_X86_64_PC32          0000000000000000 __x86_indirect_thunk_r9 - 4
│ -0000000000000081  00008fdb00000002 R_X86_64_PC32          0000000000000000 __fentry__ - 4
│ -0000000000000091  00008fdb00000002 R_X86_64_PC32          0000000000000000 __fentry__ - 4
│ +000000000000006a  0000919e00000004 R_X86_64_PLT32         0000000000000000 __x86_indirect_thunk_r9 - 4
│ +0000000000000081  00008fdb00000004 R_X86_64_PLT32         0000000000000000 __fentry__ - 4
│ +0000000000000091  00008fdb00000004 R_X86_64_PLT32         0000000000000000 __fentry__ - 4
...

arch/x86/include/asm/elf.h says:
#define R_X86_64_PC32		2	/* PC relative 32 bit signed */
#define R_X86_64_PLT32		4	/* 32 bit PLT address */
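
The Info column in the readelf output above packs both the symbol index
and the relocation type. As a minimal sketch (the helper name
split_r_info and the small name table are my own, for illustration
only), the ELF64 layout r_info = (symbol index << 32) | type can be
unpacked like this, which shows that the only change between the two
builds is the type going from 2 to 4:

```python
# ELF64 relocation info: upper 32 bits = symbol index, lower 32 bits = type.
# The name table is illustrative and covers only the types seen in this report.
R_X86_64_NAMES = {2: "R_X86_64_PC32", 4: "R_X86_64_PLT32", 11: "R_X86_64_32S"}


def split_r_info(info_hex):
    """Return (symbol_index, relocation_type) for a hex Info field."""
    info = int(info_hex, 16)
    return info >> 32, info & 0xFFFFFFFF


# Info values copied from the diffoscope output above.
for info_hex in ("00008fdb00000002", "00008fdb00000004", "00008f760000000b"):
    sym, rtype = split_r_info(info_hex)
    name = R_X86_64_NAMES.get(rtype, "unknown")
    print(f"sym=0x{sym:x} type={rtype} ({name})")
```

The type 4 here is the same number that the "Unknown rela relocation: 4"
message in dmesg reports.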

I see the same behavior when comparing modules built for the current
driver and the stretch kernel in different environments. The dmesg
output above shows the kernel rejecting relocation type 4, which is
R_X86_64_PLT32, so I conclude that the modules won't work if built with
a mismatching toolchain. (The mismatching gcc may be just an indicator
of the mismatching toolchain; a gcc-6 6.4.0 built in stretch might
actually result in working modules.)

--- nvidia-kernel-4.9.0-8-amd64_390.87-1+4.9.110-3+deb9u4_amd64_gcc-6.3.0-18+deb9u1.deb
+++ nvidia-kernel-4.9.0-8-amd64_390.87-1+4.9.110-3+deb9u4_amd64_gcc-6.4.0-20.deb

-Relocation section '.rela.text' at offset 0x253768 contains 701 entries:
+Relocation section '.rela.text' at offset 0x253708 contains 701 entries:
     Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
-0000000000000001  000000c200000002 R_X86_64_PC32          0000000000000000 __fentry__ - 4
+0000000000000001  000000c200000004 R_X86_64_PLT32         0000000000000000 __fentry__ - 4
 0000000000000008  0000000f0000000b R_X86_64_32S           0000000000000000 .data + 0
-000000000000000d  0000012500000002 R_X86_64_PC32          0000000000000000 nvKmsKapiGetFunctionsTable - 4
-0000000000000016  000000e500000002 R_X86_64_PC32          0000000000000950 nv_drm_probe_devices - 4
+000000000000000d  0000012500000004 R_X86_64_PLT32         0000000000000000 nvKmsKapiGetFunctionsTable - 4
+0000000000000016  000000e500000004 R_X86_64_PLT32         0000000000000950 nv_drm_probe_devices - 4
...
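
To check a built module without trying to load it, one could tally the
relocation types in its relocation listing. The following is only a
sketch: it assumes the text comes from something like
"readelf -r -W module.ko" in the format shown above, and the function
name count_reloc_types is hypothetical.

```python
import re
from collections import Counter

# One relocation row: 16-hex-digit offset, 16-hex-digit info, then the type.
# Leading diff markers such as "-" or "+" are tolerated.
ROW = re.compile(r"\b[0-9a-f]{16}\s+[0-9a-f]{16}\s+(R_X86_64_\w+)")


def count_reloc_types(readelf_text):
    """Count relocation type names in readelf-style relocation output."""
    return Counter(ROW.findall(readelf_text))


# Rows copied from the gcc-6.4.0-20 side of the comparison above.
sample = """\
0000000000000001  000000c200000004 R_X86_64_PLT32         0000000000000000 __fentry__ - 4
0000000000000008  0000000f0000000b R_X86_64_32S           0000000000000000 .data + 0
000000000000000d  0000012500000004 R_X86_64_PLT32         0000000000000000 nvKmsKapiGetFunctionsTable - 4
"""

counts = count_reloc_types(sample)
print(counts)
if counts["R_X86_64_PLT32"]:
    print("module contains R_X86_64_PLT32 relocations")
```

A non-zero R_X86_64_PLT32 count would indicate a module built with the
newer toolchain, like the one that failed to load above.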

build test environment:
stretch chroot + linux-headers-amd64 module-assistant nvidia-kernel-source
m-a -t -f -l 4.9.0-8-amd64 build nvidia
add sid sources
apt-get install gcc-6/sid
m-a -t -f -l 4.9.0-8-amd64 build nvidia


Andreas


