[Pkg-utopia-maintainers] Bug#994096: /var/lib/dbus/machine-id breaks reproducible builds of OS images

Simon McVittie smcv at debian.org
Sat Sep 11 21:20:45 BST 2021


Control: severity -1 wishlist
Control: retitle -1 /var/lib/dbus/machine-id breaks reproducible builds of OS images

Retitling because this does not affect reproducibility *of packages*
(as recommended in Policy §4.15), only reproducibility of whole systems
(chroot/container/image).

On Sun, 12 Sep 2021 at 03:04:27 +1000, Trent W. Buck wrote:
> I am not sure how to fix this.
> The references to machine-id in the dbus sources confused me.
> It seems like sometimes it's a link, sometimes it's a symlink, sometimes it's a copy.

The problem here is that traditionally, merely installing the dbus
package - without requiring a reboot or a specific init system - has
been sufficient to get a fully-working D-Bus installation. One of the
properties provided by a fully-working D-Bus installation is that there is
a machine ID (in particular, dbus-launch(1) in the dbus-x11 package
will not work otherwise, but in general, the authors of dbus consider a
missing machine ID to be an incorrect and unsupported installation).

The system bus starts as uid 0 and is able to set up a machine ID for
itself, but the session bus and arbitrary user-defined buses (such
as the one used for AT-SPI) are unprivileged and cannot generate a
machine ID, so they have to rely on something "larger" (like systemd,
or /etc/init.d/dbus, or dbus.postinst) to do that setup.
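
To illustrate the read-only position that an unprivileged bus is in,
here is a minimal Python sketch of the lookup. It is not libdbus code:
the function name read_machine_id, the root parameter, and the order
(D-Bus location first, then /etc/machine-id) are assumptions made for
the example.

```python
from pathlib import Path
from typing import Optional

# Assumed lookup order: the D-Bus location first, then the systemd one.
CANDIDATES = ("var/lib/dbus/machine-id", "etc/machine-id")


def read_machine_id(root: str = "/") -> Optional[str]:
    """Return the first readable, non-empty machine ID under root, or None.

    An unprivileged caller can only read these files; if neither exists,
    it has no way to create one and must report failure.
    """
    for rel in CANDIDATES:
        try:
            text = (Path(root) / rel).read_text().strip()
        except (OSError, UnicodeDecodeError):
            # Missing file, dangling symlink, unreadable or garbage content.
            continue
        if text:
            return text
    return None
```

A None result corresponds to the "incorrect and unsupported
installation" case: nothing "larger" has done the setup yet.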

If I were designing a message-bus system today, I wouldn't include a
machine ID in it; but I don't get to choose the API guarantees that
I've inherited from the original designers of D-Bus, and backwards
compatibility is important to me. I can see why the machine ID was
included, because it's there as a machine-oriented replacement for
the hostname, which has two properties that make it undesirable: it's
non-unique (lots of machines think their name is "debian" or "ubuntu" or
"localhost"), breaking the desirable property that same hostname implies
same machine; and it's human-meaningful, which means sysadmins sometimes
want to change it for cosmetic or administrative reasons, breaking the
desirable property that different hostname means different machine.

Back when D-Bus was designed, NFS-shared home directories and remote
X11 were considered essential to support, such that D-Bus would
not have been adopted if it did not cope with those; but that means it
needs a reliable way to tell apart the multiple machines that can
be sharing a home directory or an X11 display (and no, the hostname is
not enough, for the reasons I mentioned above). Those use-cases are a lot
less important now, and could perhaps even be considered to be deprecated,
but the feature that was necessary to support them remains.

/etc/machine-id is a generalization of the D-Bus machine ID, originating
in systemd. There would be nothing to stop non-systemd machines from
implementing it, but there is a tendency for people who dislike systemd
to reject anything that came from systemd and work against its wider
adoption unless there is absolutely no alternative, so it is not
considered mandatory for Debian systems in general.

As a result, /etc/machine-id is not guaranteed to exist unless/until
the system has been booted successfully with systemd. If the system is
to be used as a "plain" chroot, or a container that will be run without
a full init system (as is conventional with Docker), or a machine that
will boot with sysvinit or some other non-systemd init system, then
there will usually be no /etc/machine-id.

*If* the system is always going to be booted with systemd (and
in particular for systemd-based live-images), then it's safe for
/var/lib/dbus/machine-id to be deleted or replaced with a symlink to
/etc/machine-id; but the dbus package's postinst cannot know whether
this is the case. Even if systemd-sysv happens to be installed already,
that's no guarantee that the system will not be used as a chroot with
no real init system, in which case systemd will be present but dormant,
and nothing will create /etc/machine-id.

If /var/lib/dbus/machine-id is deleted, on systems that boot with systemd,
the tmpfiles snippet /usr/lib/tmpfiles.d/dbus.conf will replace it with a
symlink to /etc/machine-id; or on systems that boot with sysvinit,
a call to dbus-uuidgen in /etc/init.d/dbus will regenerate it. However,
neither of these will happen in a chroot or container that is never
booted with an init system.
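
The systemd half of that behaviour can be modelled roughly as follows.
This is a Python sketch, not the tmpfiles implementation; it assumes the
snippet uses a tmpfiles.d "L" line, which as I understand tmpfiles.d(5)
creates the symlink only if nothing exists at that path. The function
name and the root parameter are made up for the example.

```python
import os


def create_symlink_if_absent(root: str, target: str = "/etc/machine-id") -> bool:
    """Model of a tmpfiles.d 'L' line for var/lib/dbus/machine-id.

    Returns True if a symlink was created, False if something (a file,
    a copy, or an older symlink) was already there and was left alone.
    """
    link = os.path.join(root, "var/lib/dbus/machine-id")
    # lexists() is also true for a dangling symlink, which must not be
    # replaced either.
    if os.path.lexists(link):
        return False
    os.makedirs(os.path.dirname(link), exist_ok=True)
    os.symlink(target, link)
    return True
```

Nothing here runs unless something applies the rule, which is the
point: in a tree that is never booted, no component takes this step.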

When /var/lib/dbus/machine-id is generated by dbus-uuidgen in the dbus
postinst or in /etc/init.d/dbus, if /etc/machine-id exists, dbus-uuidgen
will copy it. In this case it is a copy, not a symlink, because dbus
cannot guarantee that the file /etc/machine-id (which is not conceptually
"owned" by dbus) will not get deleted out from under us.
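
As a sketch of that "ensure" logic (in Python, and not a transcription
of dbus-uuidgen): the function name, the return values, and the use of
uuid4 as a stand-in for however dbus-uuidgen actually produces a new ID
are all assumptions made for illustration.

```python
import re
import uuid
from pathlib import Path

_HEX32 = re.compile(r"[0-9a-f]{32}")


def ensure_dbus_machine_id(root: str) -> str:
    """Create var/lib/dbus/machine-id under root if it is missing.

    Returns "kept", "copied" or "generated" to say what happened.
    """
    dbus_id = Path(root) / "var/lib/dbus/machine-id"
    etc_id = Path(root) / "etc/machine-id"
    # An existing file or symlink (even a dangling one) is left alone.
    if dbus_id.exists() or dbus_id.is_symlink():
        return "kept"
    dbus_id.parent.mkdir(parents=True, exist_ok=True)
    try:
        candidate = etc_id.read_text().strip()
    except (OSError, UnicodeDecodeError):
        candidate = ""
    if _HEX32.fullmatch(candidate):
        # Copy the contents rather than symlinking, so a later deletion
        # of etc/machine-id does not take the D-Bus ID with it.
        dbus_id.write_text(candidate + "\n")
        return "copied"
    # Stand-in for real ID generation: 32 lowercase hex characters.
    dbus_id.write_text(uuid.uuid4().hex + "\n")
    return "generated"
```

In the "copied" case the two files hold the same ID but have
independent lifetimes, which is the property the copy is meant to buy.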

dbus could in principle create /etc/machine-id instead of
/var/lib/dbus/machine-id, and make /var/lib/dbus/machine-id a symlink to
it, but, again, dbus does not conceptually "own" /etc/machine-id, so this
would create a risk that /etc/machine-id will be deleted by some other
component, breaking the guarantees that dbus aims to provide.

If you want mmdebstrap to generate reproducible images, then I think
the best analogue to "echo uninitialized > /etc/machine-id" would be
to delete /var/lib/dbus/machine-id, allowing it to be re-created during
next boot. However, if there is no such thing as the "next boot" because
the tree being bootstrapped is a chroot or a non-init-system container,
that will result in an incomplete and partially non-functional D-Bus
installation. Knowing whether this is an acceptable tradeoff requires more
context than either dbus.postinst or mmdebstrap has available to them.
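
One way to make that tradeoff explicit is to have whoever builds the
image state it. The following Python sketch is a hypothetical
post-bootstrap step, not an mmdebstrap feature; the function name and
the tree_will_be_booted flag are invented for the example.

```python
import os


def scrub_dbus_machine_id(root: str, tree_will_be_booted: bool) -> bool:
    """Delete var/lib/dbus/machine-id under root for reproducibility.

    The caller must state whether the tree will later be booted with an
    init system that re-creates the file. If not, the file is left in
    place, because removing it would leave D-Bus partially
    non-functional. Returns True only if a file was actually removed.
    """
    if not tree_will_be_booted:
        return False
    path = os.path.join(root, "var/lib/dbus/machine-id")
    try:
        os.unlink(path)
    except FileNotFoundError:
        return False
    return True
```

For example, an image builder that knows it is producing a bootable
systemd image could call scrub_dbus_machine_id(root, True) as its last
step, while a builder producing a plain chroot would pass False.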

I think the best solution to this might be to make /etc/machine-id
part of the "specification" for what makes a Debian system, as an
init-system-independent "API", similar to how we handle /run and
/usr/lib/os-release, but I suspect that people who dislike systemd would
oppose that as a point of principle, and I have higher priorities for how
I want to spend the necessary emotional energy to make contentious things
happen in Debian.

> AFAICT there's no mention of dbus's machine-id supporting "uninitialized".

That's because it doesn't. The special keyword "uninitialized" is a
recently-added feature of systemd's handling of /etc/machine-id, which
is newer than the D-Bus machine ID.

*If* the system is booted with systemd, then /etc/machine-id will
be replaced with a real machine ID during early boot, before the
system has booted up far enough for D-Bus components to be running, so
"uninitialized" will never be observable by D-Bus components. However,
if the system is booted with a different init system, or if it is a
chroot/container that is never "booted" at all, then that will not happen.
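
A checker for image trees might classify the file's state along these
lines. This is a Python sketch; the category names are made up, and the
"valid" test assumes the format described in machine-id(5) (32
lowercase hexadecimal characters plus a newline).

```python
import re
from pathlib import Path


def classify_machine_id(path: str) -> str:
    """Return "missing", "empty", "uninitialized", "valid" or "invalid"."""
    try:
        text = Path(path).read_text()
    except FileNotFoundError:
        return "missing"
    token = text.strip()
    if not token:
        return "empty"
    if token == "uninitialized":
        # Meaningful to systemd only; to anything expecting a D-Bus
        # machine ID this is just a string that is not an ID.
        return "uninitialized"
    if re.fullmatch(r"[0-9a-f]{32}", token):
        return "valid"
    return "invalid"
```

Anything other than "valid" in a tree that will never be booted with
systemd means D-Bus components there cannot obtain a machine ID from
that file.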

    smcv


