Bug#896152: systemd/dbus hanging on stretch updates

Simon McVittie smcv at debian.org
Fri Apr 20 11:08:27 BST 2018


On Fri, 20 Apr 2018 at 11:21:38 +0200, Christoph Berg wrote:
> on stretch point release upgrades, I've repeatedly seen systemctl
> running into timeouts/errors from various postinst scripts. It
> happened on different (similarly configured) machines, now on two
> machines, and if I remember correctly twice before on previous point
> releases.

Is it possible to take a snapshot (maybe as a VM) of a
similarly-configured machine, and reproduce this repeatedly by upgrading
from the snapshot?

> Preparing to unpack .../util-linux_2.29.2-1+deb9u1_amd64.deb ...
> Unpacking util-linux (2.29.2-1+deb9u1) over (2.29.2-1) ...
> Setting up util-linux (2.29.2-1+deb9u1) ...
> Failed to reload daemon: Connection timed out

This failure happened before dbus/1.10.26-0+deb9u1 or systemd/232-25+deb9u3
were even unpacked, so it cannot be a regression in those versions.
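
One way to double-check that ordering is dpkg's own log. This is only a
sketch: it assumes the usual /var/log/dpkg.log line layout (date, time,
action, package:arch, versions), and the function name pkg_events is
made up for illustration.

```shell
# Print dpkg.log entries (prefixed with their line number) in which one
# of the named packages was upgraded or configured. Reads the log on stdin.
# Assumed layout: DATE TIME ACTION PKG:ARCH VERSIONS...
# Package names are used as regex fragments, so names containing
# characters such as '+' would need escaping.
pkg_events() {
    # Join the package names with '|' to build a regex alternation.
    pkgs=$(printf '%s|' "$@")
    pkgs=${pkgs%|}
    grep -nE "^[^ ]+ [^ ]+ (upgrade|configure) (${pkgs}):"
}

# Example:
#   pkg_events util-linux dbus systemd < /var/log/dpkg.log
```

If the util-linux "configure" entry has a lower line number than the dbus
and systemd "upgrade" entries, the timeout predates the new versions.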

> There's nothing in the logs which looks like the cause, but maybe I
> haven't seen it among the plethora of messages. (I can provide logs if
> it helps.)

Searching the syslog or systemd Journal for messages involving dbus
(or systemd, or other things that use D-Bus) might be informative.

If you are able to reproduce this on other machines, it might be useful
to reboot the machine immediately before upgrading, so that everything
in the current boot's log is relevant to either boot or the upgrade.
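
As a rough sketch of that search (the helper name bus_related is
hypothetical, and the pattern is only a starting point that will also
match unrelated lines):

```shell
# Keep only log lines that mention D-Bus or systemd (case-insensitive).
# Reads stdin, or the files given as arguments.
bus_related() {
    grep -Ei 'dbus|d-bus|systemd' "$@"
}

# Example use, as root, on the machine being upgraded:
#   journalctl -b --no-pager | bus_related   # current boot only
#   bus_related /var/log/syslog
```

Restricting journalctl to the current boot with -b is what makes the
reboot-before-upgrade suggestion useful: everything it prints then
belongs to either the boot or the upgrade.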

If the machine is one that can be taken out of service, it might also
be useful to take down other services so that the only things happening
are to do with the upgrade.

Is the dbus-daemon responsive (before starting the upgrade) in the
sense that you can connect to it? One easy way to try this is to run
"dbus-monitor --system". If, as non-root, you see "dbus-monitor: unable
to enable new-style monitoring: org.freedesktop.DBus.Error.AccessDenied",
or if you see it logging a NameAcquired message, then everything is
working as it should; use Ctrl+C to exit.
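
If you would rather have a non-interactive check, something like the
following might do. It is a sketch, not a replacement for dbus-monitor:
the wrapper name responds_within is made up, and it relies on timeout(1)
from coreutils. GetId is a standard method on the bus driver.

```shell
# Run a command, but give up (non-zero exit status) if it takes longer
# than the given number of seconds. Output is discarded; only the exit
# status matters.
responds_within() {
    secs=$1
    shift
    timeout "$secs" "$@" >/dev/null 2>&1
}

# Example: ask the bus driver for its ID, waiting at most 5 seconds.
#   responds_within 5 dbus-send --system --dest=org.freedesktop.DBus \
#       --print-reply /org/freedesktop/DBus org.freedesktop.DBus.GetId \
#       && echo "system bus answered" || echo "no answer from system bus"
```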

You might be able to get some useful information from commands like these
(as root):

ls -l /proc/$(pgrep -f "dbus-daemon --system")/fd
cat /proc/$(pgrep -f "dbus-daemon --system")/status
dbus-send --system --dest=org.freedesktop.DBus --print-reply /org/freedesktop/DBus org.freedesktop.DBus.Debug.Stats.GetStats
dbus-send --system --dest=org.freedesktop.DBus --print-reply /org/freedesktop/DBus org.freedesktop.DBus.Debug.Stats.GetConnectionStats string:org.freedesktop.systemd1

This would tell you whether there are excessively many messages or
file descriptors queued up, for instance.
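
For the file descriptor side, a quick count can be compared against the
process's limit. A minimal sketch, assuming a Linux /proc and that pgrep
matches exactly one process; count_fds is a hypothetical helper name:

```shell
# Count the open file descriptors of a process by listing /proc/PID/fd.
# Needs root for processes owned by other users.
count_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Example (as root):
#   count_fds "$(pgrep -f 'dbus-daemon --system')"
#   grep 'Max open files' "/proc/$(pgrep -f 'dbus-daemon --system')/limits"
```

A count close to the "Max open files" soft limit would be suspicious.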

> xymon      360  0.0  0.0      0     0 ?        Z    10:42   0:00 [sh] <defunct>
> sshd       470  0.0  0.0      0     0 ?        Z    10:44   0:00 [sshd] <defunct>

I wonder whether these zombies are relevant?
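
To see which parent has failed to reap them, the PPID column is the
interesting one. A small sketch (only_zombies is a made-up name; it
expects "PID PPID STAT COMMAND" columns on stdin):

```shell
# From "PID PPID STAT COMMAND" lines on stdin, print those whose state
# starts with Z (zombie). The PPID column identifies the parent that
# has not yet reaped the process.
only_zombies() {
    awk '$3 ~ /^Z/'
}

# Example:
#   ps -eo pid,ppid,stat,comm | only_zombies
```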

    smcv



