Bug#788303: systemd: Hangs indefinitely on >90% of reboot attempts
Frank Heckenbach
f.heckenbach at fh-soft.de
Wed Jan 6 05:42:26 GMT 2016
I did some more debugging, and found out some things:
- In a debug shell after the failed shutdown, I did:
systemctl status `systemctl | grep failed | grep swap | awk '{print $2}'`
and found an error message like this:
swapoff: /dev/sdxx: swapoff failed: Cannot allocate memory
- Manually running "swapoff -a" while my system is up also often
fails with this error (especially if KDE and Firefox are running
and overcommit is disabled, i.e. overcommit_memory=2), even though
there seems to be enough free memory. This is explained here
(including the links in the comments):
http://unix.stackexchange.com/questions/89514/swapoff-fails-when-overcommit-memory-2
This may be the root cause of our problems, not a systemd bug (but
don't close the issue yet, read on).
- I also wondered, why do swapoff at all before shutdown/restart?
This is apparently answered in:
https://bugzilla.redhat.com/show_bug.cgi?id=1031158
Though I don't really agree with that reasoning (breaking the
shutdown process for many people, as indicated in this, that, and
all the merged bug reports, just for the sake of a rather unlikely
IMHO niche case), that seems to be the way it is and it's not
strictly a bug, either.
- However, to answer the question asked there (I don't have an RH
account to post there, and the report there is closed already, so
I'll write it here; feel free to forward it to RH and/or systemd):
"What is using the memory in your case?"
At least in one case I debugged, it seems to be a tmpfs which
had indeed grown to several GB in size (which I'd normally expect
to be discarded on shutdown, so that's fine).
Apparently, systemd tries to swapoff before unmounting the tmpfs.
It does so, apparently, because tmpfs was mounted before swapon,
and shutdown order is reverse startup order.
So swapoff fails because tmpfs uses too much memory. Then tmpfs is
unmounted (successfully), but swapoff is never retried, shutdown
is considered failed, and the system is basically dead.
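A rough way to check for this situation on a running system is sketched below. The function names are made up for illustration, and comparing swap in use against MemAvailable is only a heuristic; the check the kernel performs in swapoff is more involved. Shmem is printed because it includes the pages held by tmpfs.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of systemd or util-linux.

# Print swap in use, MemAvailable and Shmem (which includes tmpfs pages),
# all in kB, from a meminfo-style file (default: /proc/meminfo).
# Returns 0 if swap in use is no larger than MemAvailable, 1 otherwise.
# This is only a rough heuristic, not the check the kernel performs.
swap_fits_in_ram() {
    awk '
        $1 == "SwapTotal:"    { total = $2 }
        $1 == "SwapFree:"     { free  = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "Shmem:"        { shmem = $2 }
        END {
            used = total - free
            printf "swap used: %d kB, MemAvailable: %d kB, Shmem: %d kB\n", used, avail, shmem
            exit ((used <= avail) ? 0 : 1)
        }
    ' "${1:-/proc/meminfo}"
}

# List the mount points of all tmpfs file systems from a mounts-style
# file (default: /proc/mounts), one per line.
tmpfs_mounts() {
    awk '$3 == "tmpfs" { print $2 }' "${1:-/proc/mounts}"
}
```

Calling swap_fits_in_ram without an argument reads /proc/meminfo, and "tmpfs_mounts | xargs df -h" shows how much each tmpfs currently holds.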
So AFAICS there are at least two issues to fix in systemd:
1. Do swapoff *after* umount.
I know this might be difficult given systemd's concepts of
dependencies and ordering, but I don't care. I didn't ask for it,
it was systemd that introduced it (and Debian who threw systemd
on us), so it's your job to sort out the mess. If a proper
solution is not feasible easily, at least do an additional
swapoff after umounting.
2. Handle shutdown failures better (i.e., at all).
Currently, all you get is a message which (unlike in sysvinit) is
not even shown on the console, but only in its journal (which you
normally can't even read, because you have no shell at this
point, so the message might as well not exist at all).
Then the system just hangs (I wasn't as patient as Will Aoki and
didn't wait for 27 minutes, but I assume it will hang forever),
unless you take harder measures ...
That's not really acceptable behaviour. systemd must know that
shutdown has failed and nothing can progress anymore. If so, at
least give me a maintenance shell or something! And tell me about
the error (see above)!
(I know this might be difficult etc., see above.)
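The fallback from point 1 (an additional swapoff after umounting) can be illustrated with a small retry wrapper. This is only a sketch of the idea under my own assumptions, with a made-up function name; it is not how systemd implements shutdown.

```shell
#!/bin/sh
# Hypothetical helper for illustration; this is not how systemd works.

# Run the command string $1. If it fails, run the cleanup command
# string $2 (ignoring its exit status) and try $1 exactly once more.
# The exit status is that of the last attempt at $1.
retry_after() {
    sh -c "$1" && return 0
    sh -c "$2" || :
    sh -c "$1"
}
```

As root, late in a manual shutdown, this could be called as retry_after 'swapoff -a' 'umount -a -t tmpfs'. Any tmpfs that is still busy will simply fail to unmount, so the second swapoff can still fail.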