Bug#918764: udev: "udevadm control --reload-rules" kills all processes except init

Axel Beckert abe at debian.org
Tue Jan 15 15:17:59 GMT 2019


Hi Michael,

Michael Biebl wrote:
> I'm downgrading this to non-RC, as I'm not convinced this is actually a
> bug in udev

Fair enough. (Actually I already thought about this, too.) While it
still fulfils the requirements for "critical" (as it crashes
everything except the kernel, i.e. the whole system), it also seems to
affect only a very small number of users (namely 1) as so far no other
user appeared and said "me too".

Btw., one thing that seems to speak against a hardware issue is that
the kernel logs nothing relevant in dmesg and easily survives the
issue.

> This all sounds, like udevd is not the root cause of all this, tbh,
> especially since you also reproduced it with 239.

*nod*

> I think "udevadm trigger" triggers something which then causes another
> component of your system to act this way.

Yes, in the meantime this is my suspicion, too. The big question is:
What is it?

So if udev isn't the real cause, there's one other obvious package
which could be involved: sysvinit-core.

Let's see if there's any obvious relation between any of the sysvinit
uploads and the start of my workstation becoming unstable (sic!).

Looking at the logs from uptimed, it either started around 25th of
November 2018, 9th of December or 29th of December. While I can't
remember the exact reasons for the reboots before Christmas, this now
looks as if the machine became rather unstable more than a month ago.

[...]
    25    89 days, 01:57:45 | Linux 4.17.0-1-amd64      Mon Jul  9 21:42:02 2018
    26    48 days, 20:51:31 | Linux 4.18.0-2-amd64      Sat Oct  6 23:56:49 2018
    27     1 day , 00:02:04 | Linux 4.18.0-3-amd64      Sun Nov 25 01:34:26 2018
    28    12 days, 20:14:25 | Linux 4.18.0-3-amd64      Mon Nov 26 02:43:26 2018
    29     7 days, 19:01:13 | Linux 4.19.0-trunk-amd64  Sun Dec  9 00:04:33 2018
    30     6 days, 17:01:35 | Linux 4.18.0-3-amd64      Mon Dec 17 00:29:24 2018
    31     0 days, 02:01:22 | Linux 4.19.0-1-amd64      Sun Dec 23 18:13:43 2018
    32     6 days, 14:47:09 | Linux 4.18.0-3-amd64      Sun Dec 23 21:17:11 2018
    33     0 days, 22:25:36 | Linux 4.18.0-3-amd64      Wed Jan  9 07:35:56 2019
    34     1 day , 02:08:31 | Linux 4.19.0-1-amd64      Thu Jan 10 06:23:03 2019
    35     2 days, 07:23:41 | Linux 4.18.0-3-amd64      Sat Jan 12 04:09:30 2019
    36     0 days, 11:01:45 | Linux 4.20.0-trunk-amd64  Mon Jan 14 20:53:17 2019
->  37     0 days, 07:11:47 | Linux 4.20.0-trunk-amd64  Tue Jan 15 08:08:38 2019

Please note that the box didn't log any uptime between 29th of
December (23rd of December + 6 days) and 9th of January (next boot).
29th of December is when I became aware of the issue (the SSH
connection died in the midst of a dist-upgrade, and from then on I just
got "connection refused" although the machine still answered pings,
because sshd had died, too) and 9th of January is when I was back home
and started to dig into the issue.

Now comparing with the upload times of sysvinit:

sysvinit (2.93-3) unstable; urgency=medium
 -- Dmitry Bogatov <KAction at debian.org>  Sat, 05 Jan 2019 11:21:53 +0000

sysvinit (2.93-2) unstable; urgency=medium
 -- Dmitry Bogatov <KAction at debian.org>  Thu, 27 Dec 2018 09:49:41 +0000

sysvinit (2.93-1) unstable; urgency=medium
 -- Dmitry Bogatov <KAction at debian.org>  Tue, 04 Dec 2018 04:23:18 +0000

sysvinit (2.92~beta-2) unstable; urgency=medium
 -- Dmitry Bogatov <KAction at debian.org>  Fri, 23 Nov 2018 16:45:40 +0000

sysvinit (2.92~beta-1) unstable; urgency=medium
 -- Dmitry Bogatov <KAction at debian.org>  Thu, 22 Nov 2018 16:13:55 +0000

sysvinit (2.91-1) experimental; urgency=medium
 -- Dmitry Bogatov <KAction at debian.org>  Thu, 15 Nov 2018 15:43:24 +0000

(IIRC I installed sysvinit 2.91-1 from Debian Experimental back then,
too.)

At least I don't see an obvious correlation to e.g. the new upstream
releases (or even uploads) of sysvinit.
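
For reference, the comparison above can be redone mechanically. The
following is only a minimal sketch: the dates are copied from the
uptimed records and the changelog entries quoted above, time zones are
ignored (the boot times are local time, the upload times are +0000),
and the helper name latest_upload_before is made up for illustration.
It shows which sysvinit version had most recently been uploaded before
each boot, which is not necessarily the version that was installed at
that time.

```python
from bisect import bisect_right
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# sysvinit upload times from the changelog entries above, oldest first.
UPLOADS = [
    ("2.91-1",      "2018-11-15 15:43:24"),
    ("2.92~beta-1", "2018-11-22 16:13:55"),
    ("2.92~beta-2", "2018-11-23 16:45:40"),
    ("2.93-1",      "2018-12-04 04:23:18"),
    ("2.93-2",      "2018-12-27 09:49:41"),
    ("2.93-3",      "2019-01-05 11:21:53"),
]

# Boot times from the uptimed records above (records 27 to 37).
BOOTS = [
    "2018-11-25 01:34:26",
    "2018-11-26 02:43:26",
    "2018-12-09 00:04:33",
    "2018-12-17 00:29:24",
    "2018-12-23 18:13:43",
    "2018-12-23 21:17:11",
    "2019-01-09 07:35:56",
    "2019-01-10 06:23:03",
    "2019-01-12 04:09:30",
    "2019-01-14 20:53:17",
    "2019-01-15 08:08:38",
]


def latest_upload_before(boot):
    """Return the newest version uploaded before `boot`, or None."""
    times = [datetime.strptime(ts, FMT) for _, ts in UPLOADS]
    idx = bisect_right(times, datetime.strptime(boot, FMT))
    return UPLOADS[idx - 1][0] if idx else None


if __name__ == "__main__":
    for boot in BOOTS:
        print(boot, latest_upload_before(boot))
```

Upload time is only an upper bound here; the actual install times would
have to come from the local package logs.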

Then again, this issue doesn't need to relate exactly to the upload or
install times of sysvinit (I can get the exact upgrade times of
sysvinit or udev from the logs, if interested), but only appears if a
package maintainer script calls "udevadm control --reload-rules", like
e.g. fuse does.
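
Extracting those upgrade times could look roughly like the sketch
below. It assumes the usual dpkg log line layout ("date time action
package:arch old-version new-version") as found in /var/log/dpkg.log.
The function name upgrade_events and the sample lines are hypothetical
and only there for illustration; they are not taken from the affected
machine.

```python
def upgrade_events(lines, packages=("sysvinit-core", "udev")):
    """Return (timestamp, package, old, new) for install/upgrade lines."""
    events = []
    for line in lines:
        fields = line.split()
        # Install/upgrade lines have exactly six fields; status lines
        # and similar entries are skipped.
        if len(fields) != 6 or fields[2] not in ("install", "upgrade"):
            continue
        date, time, _action, pkg, old, new = fields
        name = pkg.split(":", 1)[0]  # strip the ":amd64" arch suffix
        if name in packages:
            events.append((f"{date} {time}", name, old, new))
    return events


# Made-up sample lines in dpkg.log layout, NOT real data from this bug.
SAMPLE = [
    "2019-01-06 10:00:00 upgrade sysvinit-core:amd64 2.93-2 2.93-3",
    "2019-01-06 10:00:01 status unpacked sysvinit-core:amd64 2.93-3",
    "2019-01-06 10:00:05 upgrade fuse:amd64 2.9.8-2 2.9.9-1",
]

if __name__ == "__main__":
    # On a real system one would pass the log file instead of SAMPLE:
    #   with open("/var/log/dpkg.log") as log:
    #       events = upgrade_events(log)
    for event in upgrade_events(SAMPLE):
        print(*event)
```

Rotated logs (dpkg.log.1, dpkg.log.2.gz, ...) would have to be read as
well to cover the whole period in question.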

Anyway, I'm taking Dmitry into Cc since sysvinit-core's init is the
only process which survives this issue and hence might be involved.
(Dmitry: Please tell me if I should rather send this to the
mailing-list.)

I will probably also check if an earlier sysvinit version, like e.g.
2.88dsf-59.11 (as 2.88dsf-60 IIRC had some issues of its own), makes
the issue go away, just to be sure (like with udev 239-15).

> >> Can you also try to run udevd in debug mode to get a log from udevd (see
> >> /etc/udev/udev.conf) and also an strace of the udevadm command.
> > 
> > I think I already sent the strace, but forgot the debug mode. Enabled
> > that when starting to write this mail, but it's currently caught by
> > rsyslogd's rate-limiting, see above.
> 
> It's the kernel, which does the rate limiting. Add
> log_buf_len=1M printk.devkmsg=on
> to the kernel command line to turn off the ratelimiting and increase the
> ring buffer.

Thanks for that hint! Added it to /etc/default/grub, but also set
kernel.printk_devkmsg=on and kernel.printk_ratelimit_burst=10000 for
the running kernel via sysctl. (The first alone didn't seem to
suffice.)

I've now also started a second sshd listening on a different port via
/etc/inittab so that the system stays accessible once this happens
again while I'm away or working under X. Just tried it, works well.
;-/

Anyway, now I've got the debug output you (Michael) wanted from dmesg.
Unfortunately there isn't much:

[25270.470683] systemd-udevd[31587]: udevd message (RELOAD) received
[25270.470711] systemd-udevd[31587]: Unload module index
[25270.470731] systemd-udevd[31587]: Unloaded link configuration context.

Based on the time stamps, any earlier dmesg log entry clearly stems
from my previous "service udev restart" to see if the ratelimiting was
indeed disabled.
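
To relate such dmesg time stamps to wall-clock time, something like the
following sketch could be used. It assumes that the bracketed value is
seconds since boot and takes the boot time from the last uptimed record
above; the kernel clock can drift from wall-clock time, so the result
is only approximate. The helper name parse_dmesg is made up for
illustration.

```python
import re
from datetime import datetime, timedelta

# "[seconds.micros] process[pid]: message"
LINE_RE = re.compile(r"^\[\s*(\d+)\.(\d{6})\]\s+(\S+?)\[(\d+)\]:\s+(.*)$")

# Boot time of the current uptimed record (local time).
BOOT = datetime(2019, 1, 15, 8, 8, 38)

# The three udevd debug lines quoted above.
DMESG = [
    "[25270.470683] systemd-udevd[31587]: udevd message (RELOAD) received",
    "[25270.470711] systemd-udevd[31587]: Unload module index",
    "[25270.470731] systemd-udevd[31587]: Unloaded link configuration context.",
]


def parse_dmesg(line, boot=BOOT):
    """Return (approx. wall-clock time, process, pid, message) or None."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    secs, micros, process, pid, message = match.groups()
    offset = timedelta(seconds=int(secs), microseconds=int(micros))
    return boot + offset, process, int(pid), message


if __name__ == "__main__":
    for line in DMESG:
        when, process, pid, message = parse_dmesg(line)
        print(when.isoformat(sep=" "), process, pid, message)
```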

P.S. to Michael: Thanks for all your effort so far as well as the hints
and suggestions on this.

		Regards, Axel
-- 
 ,''`.  |  Axel Beckert <abe at debian.org>, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE


