Bug#1010599: "org.gnome.Shell@x11.service: Stop job pending for unit, delaying automatic restart." filling disk
Tomas Forsman
stric at cs.umu.se
Thu May 5 09:41:06 BST 2022
Package: gnome-shell-common
Version: 3.38.6-1~deb11u1
Severity: normal
Tags: patch
Dear Maintainer,
TL;DR: org.gnome.Shell@x11.service tries to restart after 0ms but fails,
filling the disk with log messages. Please increase RestartSec to at least a
few milliseconds.
*** Reporter, please consider answering these questions, where appropriate ***
* What led up to the situation?
I think the X server was killed; pulseaudio logged:
pulseaudio[350117]: X connection to :1 broken (explicit kill or server shutdown).
Then org.gnome.Shell@x11.service tried to restart, and RestartSec=0ms in
/lib/systemd/user/org.gnome.Shell@x11.service caused the logs to fill up.
This happens "sometimes" in our computer labs (we have about 100
student computers).
* What exactly did you do (or not do) that was effective (or
ineffective)?
I don't know exactly what happened, but the resulting log spam is a DoS.
* What was the outcome of this action?
This time, over roughly 2 minutes, the following log message was repeated
4 million times (half a gigabyte) before systemd gave up:
2022-05-05T09:39:17+02:00 ochre systemd[350085]: org.gnome.Shell@x11.service: Stop job pending for unit, delaying automatic restart.
2022-05-05T09:39:17+02:00 ochre systemd[350085]: org.gnome.Shell@x11.service: Stop job pending for unit, delaying automatic restart.
... 4 million identical lines later ...
2022-05-05T09:41:29+02:00 ochre systemd[350085]: org.gnome.Shell@x11.service: Stop job pending for unit, delaying automatic restart.
2022-05-05T09:41:29+02:00 ochre systemd[350085]: org.gnome.Shell@x11.service: Stop job pending for unit, delaying automatic restart.
2022-05-05T09:41:29+02:00 ochre systemd[350085]: Looping too fast. Throttling execution a little.
2022-05-05T09:41:29+02:00 ochre systemd[1]: user@{user-id}.service: State 'stop-sigterm' timed out. Killing.
2022-05-05T09:41:29+02:00 ochre systemd[1]: user@{user-id}.service: Killing process 350085 (systemd) with signal SIGKILL.
2022-05-05T09:41:29+02:00 ochre systemd[1]: user@{user-id}.service: Killing process 350506 (gsd-smartcard) with signal SIGKILL.
... more processes killed ...
/lib/systemd/user/org.gnome.Shell@x11.service has:
# Do not wait before restarting the shell
RestartSec=0ms
# Kill any stubborn child processes after this long
TimeoutStopSec=5
My guess is that it took 2 minutes to give up instead of 5 seconds, because it
was busy spamming these log messages.
* What outcome did you expect instead?
"A few" log messages before giving up.
Attached is a patch to change RestartSec from 0ms to 10ms, which should cause
no user-noticeable slowdown but would in this case reduce the number of log
messages from 4 million to either about 13k (100/s for just over 2 minutes) or
500 (100/s for 5s, if it gives up when it should).
The patch does not fix the original problem, but should reduce problems caused by it.
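As a local stopgap until the packaged unit changes, the same value can
probably be set with a systemd drop-in instead of editing the file under /lib.
This is offered as an unverified sketch: it assumes the standard drop-in
directory for user units under /etc/systemd/user/, and the file name
override.conf is an arbitrary choice.

```ini
# /etc/systemd/user/org.gnome.Shell@x11.service.d/override.conf
# (hypothetical file name; any *.conf file in this directory is read)
[Service]
# Same value as the attached patch: wait 10ms between restart attempts
RestartSec=10ms
```

Already-running user managers would need "systemctl --user daemon-reload" (or
a fresh login) before the new value takes effect.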
-- System Information:
Debian Release: 11.3
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 5.10.0-14-amd64 (SMP w/12 CPU threads)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_CPU_OUT_OF_SPEC, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/bash
Init: systemd (via /run/systemd/system)
Versions of packages gnome-shell-common depends on:
ii dconf-gsettings-backend [gsettings-backend] 0.38.0-2
gnome-shell-common recommends no packages.
gnome-shell-common suggests no packages.
-- no debconf information
-------------- next part --------------
--- org.gnome.Shell@x11.service-orig 2022-05-05 10:24:01.996927942 +0200
+++ org.gnome.Shell@x11.service 2022-05-05 10:24:21.300877802 +0200
@@ -34,6 +34,6 @@
# On X11 we want to restart on-success (Alt+F2 + r) and on-failure.
Restart=always
# Do not wait before restarting the shell
-RestartSec=0ms
+RestartSec=10ms
# Kill any stubborn child processes after this long
TimeoutStopSec=5