Bug#918764: udev: "udevadm control --reload-rules" kills all processes except init
Michael Biebl
biebl at debian.org
Tue Jan 15 13:28:12 GMT 2019
Control: severity -1 important
I'm downgrading this to non-RC, as I'm not convinced this is actually a
bug in udev.
On 15.01.19 at 09:08, Axel Beckert wrote:
> Control: found -1 239-15
>
> Hi Michael,
>
> Michael Biebl wrote:
> >> On 12.01.19 at 01:02, Axel Beckert wrote:
>>> Control: reopen -1
>>> Control: found -1 240-3
>>
>>> *sigh* I'm sorry to say, but it just happened again with udev 240-3
>>> and kernel 4.20-1~exp1.
>>
>> Would be good to know, if it also happens with 239-15 or if it's caused
>> by some other update.
>
> I just downloaded udev and libudev1 239-15 from
> https://snapshot.debian.org/, installed it and immediately afterwards
> ran "udevadm control --reload-rules" and everything was gone again.
> (Was running under kernel 4.20-1~exp1.)
>
> After that I rebooted into 4.20-1~exp1 with the downgraded udev again,
> ran "udevadm control --reload-rules" directly after reboot and nothing
> (unexpected) happened.
>
> Then I started to write this mail, doing not much more than logging in
> on the text console (although X was running) using zsh, running
> ssh-agent, ssh-add and then ssh via autossh to connect to a screen
> session on some other host to write the mail with mutt.
>
> After about 20 minutes of uptime I was about to send that mail and
> thought, I should just try running "udevadm control --reload-rules"
> once again: And it killed all processes again -- systemd-udevd has
> been started again, too:
>
> ~ # ps auxwww | fgrep udev
> root 10696 0.0 0.0 13152 3592 ? S 08:27 0:00 /lib/systemd/systemd-udevd
> root 11582 0.0 0.0 8144 892 tty1 R+ 08:33 0:00 grep -F --color=auto udev
> ~ # uptime
> 08:33:49 up 25 min, 2 users, load average: 0.07, 0.15, 0.34
>
> Unfortunately there is not much in the syslog (had to start rsyslog
> first again, too):
>
> Jan 15 08:37:16 c6 kernel: [ 1717.930890] printk: systemd-udevd: 159 output lines suppressed due to ratelimiting
>
> I didn't get rsyslog to drop the rate-limiting, though, and I'm in a
> bit of a hurry at the moment.
>
> Will report back later.
>
> I must admit that I also had one crash/process killing yesterday where
> I can't say what triggered it. aptitude just finished starting up in
> TUI mode (inside screen started via ssh from remote) and I was
> starting to browse through the package list while the connection
> suddenly was lost (likely due to a killed sshd).
This all sounds like udevd is not the root cause, tbh, especially since
you also reproduced it with 239.
I think "udevadm trigger" triggers something which then causes another
component of your system to act this way.
> Some other facts gathered recently:
>
> * With udev 239-15 the bootup lag is gone even without the "sleep 5".
>
> * The "sleep 5" helped on another box (EeePC 900A with sysvinit) where
> drivers weren't loaded anymore and had to be specified manually in
> /etc/modules until the "sleep 5" was added.
> * memtest86 and memtest86+ just show an empty screen. Will try again
> with grub's graphical mode disabled just to make sure the issue is
> not triggered by some memory fault. Question would be then why I
> could (within a reboot where it happened) reliably reproduce the
> issue again and again. Will report any findings on this front.
> * As alternative I ran memtester for one night, no issues found. (Not
> sure if it was able to test everything as the affected box has 64 GB
> of RAM.)
> * If the issue happens while using X, there's no chance to switch back
> to the text console with the getty to login again. The machine needs
> a hard reboot via reset or power button then.
>
>> Tbh, udevd or udevadm control --reload killing all processes, sounds
>> pretty strange.
>
> Definitely.
>
>> Can you also try to run udevd in debug mode to get a log from udevd (see
>> /etc/udev/udev.conf) and also an strace of the udevadm command.
>
> I think I already sent the strace, but forgot the debug mode. Enabled
> that when starting to write this mail, but it's currently caught by
> rsyslogd's rate-limiting, see above.
It's the kernel that does the rate limiting, not rsyslogd. Add

  log_buf_len=1M printk.devkmsg=on

to the kernel command line to turn off the ratelimiting and increase the
ring buffer.
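
In case it helps, here is a rough sketch of one way to make that change
persistent. It assumes GRUB is the bootloader, that /etc/default/grub
sets GRUB_CMDLINE_LINUX_DEFAULT on a single double-quoted line, and that
GNU sed is available. add_kernel_params is a hypothetical helper name
used only for illustration; it is not part of any package. Adjust as
needed for a different bootloader or quoting style.

```shell
#!/bin/sh
# Sketch only: append the two parameters to GRUB_CMDLINE_LINUX_DEFAULT.
# add_kernel_params is a hypothetical helper, not an existing tool.
add_kernel_params() {
    file="$1"    # normally /etc/default/grub
    params="log_buf_len=1M printk.devkmsg=on"

    # Do nothing if the parameters are already there, so that running
    # the helper twice does not duplicate them.
    if grep -q 'printk\.devkmsg=on' "$file"; then
        return 0
    fi

    # Insert the parameters just before the closing double quote.
    # Assumption: the value is double-quoted and sits on one line.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $params\"/" "$file"
}

# Usage (as root), followed by regenerating the GRUB config and a reboot:
#   add_kernel_params /etc/default/grub && update-grub
# After the reboot, the result can be checked with:
#   cat /proc/cmdline
#   cat /proc/sys/kernel/printk_devkmsg
```

For a one-off test it is probably simpler to press "e" in the GRUB menu
and append the two parameters to the "linux" line by hand.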
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?