Bug#889669: nvidia-graphics-drivers: solve the upgrade problem

Philipp Kern pkern at debian.org
Mon Feb 5 15:26:11 UTC 2018


Source: nvidia-graphics-drivers

Since forever, users of NVIDIA on Debian have accepted that package
upgrades break newly spawned binaries, because the interface between the
client library and the kernel driver is strictly versioned. After an
upgrade the still-loaded kernel module emits an API mismatch error into
the kernel log and GLX requests fail. A reboot is required to remediate
this situation.

I would propose the following model:

* All binary packages that require strict versioning with NVRM are
shipped in versioned packages. This means that the library package names
reflect both major and minor version (= the version on which the driver
checks) of the driver. The resulting packages should be co-installable
with each other.
* A script modifies the symlink for the currently active libraries to
point to the version of the currently loaded nvidia module (as fetched
from sysfs's /sys/module/nvidia/version). This script is called on
installation but more crucially on every boot. This will tie the
libraries to the module loaded at boot-up.
* The kernel module itself does not have to be versioned. The kernel
module can be upgraded and it will end up in the initrd automatically.
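
To illustrate the second bullet, here is a minimal sketch in Python of
what such a script could do. It is not an actual implementation: the
library root (e.g. /usr/lib/nvidia), the one-directory-per-version
layout, and the link name "current" are assumptions made up for this
example. Only the sysfs path /sys/module/nvidia/version comes from the
proposal itself.

```python
import os
from pathlib import Path


def read_loaded_version(sysfs_file="/sys/module/nvidia/version"):
    """Return the version of the loaded nvidia module, or None if the
    module is not loaded (the sysfs file is missing or empty)."""
    try:
        return Path(sysfs_file).read_text().strip() or None
    except OSError:
        return None


def activate_libraries(version, lib_root, link_name="current"):
    """Point <lib_root>/<link_name> at <lib_root>/<version>.

    The directory layout is hypothetical: one subdirectory per
    co-installed driver version below lib_root.
    """
    lib_root = Path(lib_root)
    target = lib_root / version
    if not target.is_dir():
        raise FileNotFoundError(f"no libraries installed for {version}")

    link = lib_root / link_name
    tmp = lib_root / (link_name + ".tmp")
    # Clean up a leftover temporary link from an interrupted run.
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    # Create the new link under a temporary name and rename it over the
    # old one, so there is never a moment without a valid symlink.
    os.symlink(version, tmp)
    os.replace(tmp, link)
    return link
```

On boot and on package installation the script would call
read_loaded_version() and, if a version is returned, pass it to
activate_libraries() together with the real library root. If no module
is loaded it would leave the existing link alone.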

Assuming that we have a metapackage that pulls in the most recent driver
(like linux-image does), this model would allow the driver to be
upgraded at any point in time while only going live with the next
reboot. Until then the old libraries stay in place and keep matching the
loaded module, so applications continue to function.

This approach has the drawback that every update from NVIDIA needs to go
through NEW. However, I think this is just a theoretical disadvantage at
this point, as NEW latency for ABI version changes has decreased a lot.

The thing I'm not sure about is how this proposal interacts with the
legacy modules. I suppose they can all use the same mechanism, but the
script would need to be aware of which library stack needs to be chosen.
The NVIDIA kernel shim already checks using rm_is_supported_device
whether the currently installed device is supported. That together with
modalias should presumably already load the correct module, and then the
script could just check which of the modules (a legacy one or the normal
one) is loaded and act accordingly.
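
A rough sketch of that check in Python, under stated assumptions: the
candidate module names below are placeholders that I have not verified
against the actual legacy packages, and I am assuming each variant
exposes a version file in sysfs the same way the normal module does.

```python
from pathlib import Path

# Hypothetical module names; the real legacy module names would have to
# be taken from the packaging.
CANDIDATE_MODULES = ("nvidia", "nvidia_legacy_340xx", "nvidia_legacy_304xx")


def find_loaded_module(candidates=CANDIDATE_MODULES, sysfs_root="/sys/module"):
    """Return (module_name, version) for the first candidate module that
    is loaded and reports a version, or None if none of them is."""
    for name in candidates:
        version_file = Path(sysfs_root) / name / "version"
        try:
            version = version_file.read_text().strip()
        except OSError:
            # Module not loaded, or it does not expose a version file.
            continue
        if version:
            return name, version
    return None
```

The returned module name would tell the script which library stack to
select, and the version which of the co-installed library packages
within that stack to activate.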

Do you think this would be workable? The NVIDIA packaging is quite a
beast to handle, I know (and I'm very grateful for your work!). So we
should reach some consensus on whether this is something you'd be
interested in. :)

Kind regards and thanks
Philipp Kern


