Bug#788303: State of the bug?

Thu Mar 24 21:47:36 GMT 2016

Hi,

> I am experiencing this bug on 20 low-memory Jessie VMs, but only in 20%
> of reboot attempts. Adding memory might perhaps solve the problem, but
> it also raises the budget, and can I be sure? It's a real problem anyway.
> 
> It took me some time to find out this might be a bug at all, because it
> happens only at some of the reboots. I also searched for other
> causes, until I found this bug report and the corresponding log entry:
> 
> Mar 23 17:21:22 xxxxx swapoff[4187]: swapoff:
> /dev/disk/by-uuid/38cd3812-d2ae-4d79-971a-1ed9b8b08505: swapoff failed:
> Cannot allocate memory [German locale: "Nicht genügend Hauptspeicher verfügbar"]
> 
> at shutdown time.
> 
> Thanks for the reporting and analysis so far!
> 
> @Frank Heckenbach: I would love to give the "workaround" a try, but I
> don't fully understand what will be achieved by
> 
> > ExecStop=/bin/sh -c 'while mount | grep -q "on /tmp "; do umount /tmp; sleep 1; done; while ! swapoff -a; do sleep 1; done'
> 
> in the systemd shutdown phase. This only works with /tmp as a mount, or
> am I wrong? I don't think this can work for me.

Yes, only for /tmp. That's why it's only a workaround. ;)

It works for me (tm), because /tmp is my only tmpfs that ever
carries significant amounts of data. (Of course, there's /run etc.,
but there's not much in it.)

In those cases where I had problems shutting down, swapoff failed
because /tmp used too much space, more than would fit into RAM
without swap. So my workaround was to umount /tmp to free this
space, then swapoff would work.
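
For other setups, the same idea can be written as a small script. This is only a sketch, not the exact workaround quoted above: the function names free_swap and is_mounted and the MOUNTS_FILE variable are made up for illustration, and it checks /proc/mounts instead of grepping the output of mount. Like the original one-liner, it retries forever, so it should only be used where the umount is expected to succeed eventually.

```shell
#!/bin/sh
# Sketch: free the space held by tmpfs mounts first, then disable swap.
# MOUNTS_FILE is overridable (an assumption for testing), default /proc/mounts.
MOUNTS_FILE=${MOUNTS_FILE:-/proc/mounts}

# Succeeds if the given directory is listed as a mount point in $MOUNTS_FILE.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

# Unmount each given mount point (retrying once per second),
# then retry "swapoff -a" until it succeeds.
free_swap() {
    for mp in "$@"; do
        while is_mounted "$mp"; do
            umount "$mp" || true
            sleep 1
        done
    done
    while ! swapoff -a; do
        sleep 1
    done
}
```

Called as root during shutdown, "free_swap /tmp" would correspond to the original workaround; further tmpfs mount points can be passed as additional arguments.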

Your situation may be different. You could try to find out what uses
the memory. There should not be many processes left running at this
point, unless you explicitly prevent them from stopping (or there
are other bugs ...). If you have another large tmpfs, you can try
umounting that instead (or in addition). If that's not it, you'll
need to find out what it is ...
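
One way to look for the culprit is sketched below, assuming GNU df and procps (as on a standard Debian system). The function names are invented for this example. It prints how much data sits in all tmpfs mounts, the RAM and swap totals, and the processes with the largest resident memory.

```shell
#!/bin/sh
# Sum the "Used" column (in KiB) of "df -kP" output read from stdin.
sum_used_kb() {
    awk 'NR > 1 { sum += $3 } END { print sum + 0 }'
}

# Show tmpfs usage next to RAM/swap totals and the biggest processes.
report_memory_pressure() {
    echo "tmpfs used (KiB): $(df -kP -t tmpfs | sum_used_kb)"
    df -kP -t tmpfs                                # per-mount breakdown
    free -k                                        # RAM and swap in KiB
    ps -eo rss,pid,comm --sort=-rss | head -n 10   # largest resident sets first
}
```

If the tmpfs total plus the resident processes exceed the RAM reported by free, swapoff has nowhere to move the swapped pages, which would match the failure in the log.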

In the worst case (before I'd waste money on more RAM just for
shutting down, which seems quite ridiculous ;), there's a really
hackish solution:

You can forcibly shut down a system with the "magic sysrq key",
using S(ync), U(mount), O(ff). You can also script this using
/proc/sys/kernel/sysrq and /proc/sysrq-trigger. You can google for
that stuff. If necessary, ask me again.

But that should really be a last resort. It will completely bypass
systemd (of course, it hasn't deserved better, if it causes us such
problems ... ;)
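
A scripted version of the S, U, O sequence might look roughly like this. It is a sketch only: the function name and the SYSRQ_* variables are hypothetical and exist so the function can be tried against scratch files first. Note that the kernel's "u" action remounts filesystems read-only rather than really unmounting them, and that the enable file mainly governs the keyboard combination, so writing to it here should be harmless.

```shell
#!/bin/sh
# Sketch: forced power-off via the magic SysRq interface.
# Defining the function does nothing; calling it as root powers the machine off.
SYSRQ_ENABLE=${SYSRQ_ENABLE:-/proc/sys/kernel/sysrq}
SYSRQ_TRIGGER=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}

sysrq_poweroff() {
    echo 1 > "$SYSRQ_ENABLE" || return 1     # enable all SysRq functions
    for key in s u o; do                     # s = sync, u = remount read-only, o = power off
        echo "$key" > "$SYSRQ_TRIGGER" || return 1
        sleep "${SYSRQ_DELAY:-2}"            # give sync/remount a moment to finish
    done
}
```

Since this skips every remaining shutdown step, it only makes sense as the very last action, after everything that can be stopped cleanly has been stopped.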

> Is there a general solution on the way or in work?

They said they fixed it upstream, but it doesn't look like it will
be backported to jessie, and therefore I haven't tried it. So from
my point of view, I can only hope it will be fixed in the next
stable release. Until then, I hope my workaround will work for me.

Regards,
Frank