Bug#788303: systemd: Hangs indefinitely on >90% of reboot attempts

Frank Heckenbach f.heckenbach at fh-soft.de
Wed Jan 6 05:42:26 GMT 2016


I did some more debugging, and found out some things:

- In a debug shell after the failed shutdown, I did:

  systemctl status `systemctl | grep failed | grep swap | awk '{print $2}'`

  and found an error message like this:

  swapoff: /dev/sdxx: swapoff failed: Cannot allocate memory

- Manually running "swapoff -a" while my system is up also often
  fails with this error (especially if KDE and Firefox are running
  and memory overcommit is disabled, i.e. vm.overcommit_memory=2),
  even though there seems to be enough free memory. This is
  explained here (including the links in the comments):

  http://unix.stackexchange.com/questions/89514/swapoff-fails-when-overcommit-memory-2

  This may be the root cause of our problems, not a systemd bug (but
  don't close the issue yet, read on).

- I also wondered, why do swapoff at all before shutdown/restart?
  This is apparently answered in:

  https://bugzilla.redhat.com/show_bug.cgi?id=1031158

  Though I don't really agree with that reasoning (it breaks the
  shutdown process for many people, as indicated in this, that, and
  all the merged bug reports, just for the sake of what is IMHO a
  rather unlikely niche case), that seems to be the way it is, and
  it's not strictly a bug either.

- However, to answer the question asked there (I don't have an RH
  account to post there, and the report there is closed already, so
  I'll write it here; feel free to forward it to RH and/or systemd):

  "What is using the memory in your case?"

  At least in one case I debugged, it seems to be a tmpfs which had
  indeed grown to several GB (I'd normally expect its contents to
  be discarded on shutdown, and that's fine).

  Apparently, systemd tries to swapoff before unmounting the tmpfs,
  presumably because the tmpfs was mounted before swapon and
  shutdown order is the reverse of startup order.

  So swapoff fails because tmpfs uses too much memory. Then tmpfs is
  unmounted (successfully), but swapoff is never retried, shutdown
  is considered failed, and the system is basically dead.
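
  A rough way to check this theory on a running system before
  shutting down is sketched below. It is only a heuristic of mine,
  not what the kernel or systemd actually computes: it compares the
  amount of data currently in swap with MemAvailable and also prints
  Shmem (which includes tmpfs contents). The function name
  swap_precheck is made up for illustration; the field names are the
  ones found in /proc/meminfo.

```shell
#!/bin/sh
# Heuristic pre-check (assumption: "swap in use <= MemAvailable" is a
# reasonable proxy; the kernel's real accounting, especially with
# vm.overcommit_memory=2, may be stricter than this).
swap_precheck() {
    # Optional argument: a meminfo-style file (defaults to the live one).
    meminfo=${1:-/proc/meminfo}
    awk '
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { total = $2 }
        $1 == "SwapFree:"     { free  = $2 }
        $1 == "Shmem:"        { shmem = $2 }   # includes tmpfs pages
        END {
            used = total - free
            verdict = (used <= avail) ? "fits" : "too-big"
            printf "swap_used_kB=%d mem_available_kB=%d shmem_kB=%d %s\n",
                   used, avail, shmem, verdict
        }
    ' "$meminfo"
}
```

  Calling swap_precheck without arguments reads /proc/meminfo. A
  large shmem_kB value together with "too-big" would point at a
  tmpfs as the culprit; "df -h -t tmpfs" then shows which one, and
  "cat /proc/sys/vm/overcommit_memory" shows the overcommit mode.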

So AFAICS there are at least two issues to fix in systemd:

1. Do swapoff *after* umount.

   I know this might be difficult given systemd's concepts of
   dependencies and ordering, but I don't care. I didn't ask for it,
   it was systemd that introduced it (and Debian who threw systemd
   on us), so it's your job to sort out the mess. If a proper
   solution is not easily feasible, at least do an additional
   swapoff after umounting.

2. Handle shutdown failures better (i.e., at all).

   Currently, all you get is a message which (unlike in sysvinit) is
   not even shown on the console, but only in its journal (which you
   normally can't even read, because you have no shell at this
   point, so the message might as well not exist at all).

   Then the system just hangs (I wasn't as patient as Will Aoki and
   didn't wait for 27 minutes, but I assume it will hang forever),
   unless you take harder measures ...

   That's not really acceptable behaviour. systemd must know that
   shutdown has failed and nothing can progress anymore. If so, at
   least give me a maintenance shell or something! And tell me about
   the error (see above)!

   (I know this might be difficult etc., see above.)
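
To illustrate what I mean by both points (retry swapoff once the
tmpfs is gone, and report a failure where it can be seen), here is a
minimal sketch. It is not systemd code and I make no claim about
where in the shutdown sequence it would have to be hooked in. The
function name swapoff_with_retry and the variables SWAPOFF_CMD and
RETRY_DELAY are hypothetical, introduced only so the logic can be
tried out without root.

```shell
#!/bin/sh
# Hypothetical helper: retry swapoff a few times (e.g. after tmpfs
# mounts have been unmounted) and complain loudly if it keeps failing.
# SWAPOFF_CMD and RETRY_DELAY are illustrative knobs, not real
# systemd or util-linux settings.
swapoff_with_retry() {
    attempts=${1:-3}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Unquoted on purpose so "swapoff -a" splits into command + flag.
        if ${SWAPOFF_CMD:-swapoff -a}; then
            echo "swapoff succeeded on attempt $i"
            return 0
        fi
        # Report on stderr so the failure is visible, not just logged.
        echo "swapoff failed (attempt $i of $attempts)" >&2
        if [ "$i" -lt "$attempts" ]; then
            sleep "${RETRY_DELAY:-1}"
        fi
        i=$((i + 1))
    done
    return 1
}
```

A caller could then do something like
"swapoff_with_retry 3 || exec /bin/sh" to end up in a shell instead
of a silent hang, which is all I'm asking for in point 2.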



