Bug#889669: nvidia-graphics-drivers: solve the upgrade problem

Tue May 29 10:02:13 BST 2018

I've got an answer from NVIDIA:
"Our driver design, based on earlier assumptions according to
use/deployment cases at the time, packages all components together to
ensure integrity is retained as components evolve over the course of
driver development.
We are investigating the ability to enable modest compatibility across
versions, but the time horizon and breadth of that compatibility are
not known at this time.
We are also looking at how to improve the interoperability of CUDA
calls between driver versions—but again, this is a long-term effort.
One suggestion for the near-term was to install in such a way that
updated driver files are latched on next boot so that kernel- and
user- components can be changed on the file system in lock-step."
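
To make the "latched on next boot" suggestion concrete, here is a
minimal sketch in Python. It is an assumption about how such a scheme
could look, not anything NVIDIA or the current Debian packaging
provides: the directory layout (versions/<version>, a "current"
symlink, a "pending" marker file) and the function names are all
hypothetical. Package installation would only stage files and write
the marker; an early-boot hook would flip the symlink before the
kernel module is loaded, so that kernel and user-space components
change together.

```python
import os
from pathlib import Path
from typing import Optional


def stage_version(root: Path, version: str) -> None:
    """Record `version` as pending; the running system is left untouched."""
    # Hypothetical marker file; the files themselves are assumed to have
    # been unpacked into root/versions/<version> already.
    (root / "pending").write_text(version + "\n")


def latch_pending(root: Path) -> Optional[str]:
    """Meant to run once at early boot, before the kernel module loads.

    Points the `current` symlink at the pending version, if any, and
    returns the version that was activated (or None if nothing changed).
    """
    pending = root / "pending"
    if not pending.exists():
        return None
    version = pending.read_text().strip()
    if not (root / "versions" / version).is_dir():
        return None  # staged files are missing; keep the old driver
    tmp = root / "current.new"
    if tmp.is_symlink():
        tmp.unlink()  # leftover from an interrupted earlier attempt
    tmp.symlink_to(Path("versions") / version)
    os.replace(tmp, root / "current")  # atomic rename over the old symlink
    pending.unlink()
    return version
```

With this layout an upgrade during a running session never touches
the files the loaded kernel module expects; the switch only happens in
latch_pending() on the next boot.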

From my point of view, this is pretty much the answer I expected.
They are committed to investigating a solution, but IMHO this doesn't
necessarily mean that there will be one. Even if there is a solution,
we don't know how long it will take NVIDIA to implement it or whether
it will be feasible for us. For instance, if they only promise
compatibility across minor driver version updates, major driver
version updates would still be problematic for us.

That brings me to the question of what is feasible on the Debian side
without making it even more of a nightmare than it already is...
This bug report may not be the best place for that if the discussion
involves a lot of back and forth about options, so maybe it should
happen in an online document (Google Docs or similar).

Thoughts?

On Sat, Mar 31, 2018 at 3:21 PM Philipp Kern <pkern at debian.org> wrote:
>
> On 2018-03-30 20:02, Luca Boccassi wrote:
> > On Mon, 2018-03-26 at 18:45 +0200, Philipp Kern wrote:
> >> I would like to understand better what the current set of packages
> >> helps with, though. It is true that I hadn't considered that you are
> >> shipping so many packages right now. However, you seem to also
> >> hardcode the dependencies between them with a lot of substvars in the
> >> packaging, which is understandable given the non-free nature of them.
> >> But at the same time it makes it more muddy as to what problem that
> >> solves.
> > Well that's the Debian policy - one shared library per package, that's
> > what we follow.
>
> While this is technically true, they are also far from regular
> shared library packages. People generally don't link against these
> shared libraries. Files are not installed into the regular directories.
> Most of the time newer libraries are not actually co-installable. The
> installed file doesn't necessarily follow the SONAME. (I only spot
> checked as I have spotty connectivity right now.)
>
> This is not about "you're doing it wrong" or anything. Instead, these are
> just awkward binary blobs that I think can be treated differently from
> usual shared libraries if needed, especially if you don't get the
> advantages of split packaging with the binaries NVIDIA provides.
>
> I'll try to come up with a longer answer to the remaining bits. I
> suppose we should play this through as an example with the current
> packaging and then check what's acceptable and what's not acceptable.
>
> > Yes, the legacy drivers (340xx and 304xx at the moment, although the
> > latter is out of support so I guess we'll drop it in buster) are co-
> > installable. There are update-alternatives for those too. We have a
> > script to make it easier to manage those and the glx provider (mesa,
> > fglrx, nvidia), it's update-glx from the update-glx package.
> >
> > You can find the scripts and configs in the git repo:
> > https://salsa.debian.org/nvidia-team/glx-alternatives
>
> This means that users are expected to call update-glx on bootup if the
> driver in the installation doesn't match the installed hardware, right?
> My hope would be that if we get it to work consistently for minor
> revisions that we can support legacy drivers with the same mechanism:
> When a legacy module is loadable, we make sure that the GLX bits point
> to the correct library version for the card installed. I know that in
> regular desktop systems card architecture changes are rare and users
> expect to tend to the machine manually in this case. However, in the case
> of bigger pool setups and imaging, modern Linux and X.org just work,
> except for the NVIDIA bits.
>
> Kind regards and thanks a lot for your responses!
> Philipp Kern