Bug#792761: UX issue, handling of endless shutdown loops

Eduard Bloch edi at gmx.de
Mon Jul 20 07:48:21 BST 2015


Hello,
* Felipe Sateler [Sun, Jul 19 2015, 03:14:11PM]:
> Control: tags -1 moreinfo
> 
> Hi Eduard,
> 
> On 18 July 2015 at 05:24, Eduard Bloch <edi at gmx.de> wrote:
> > Package: systemd
> > Version: 222-1
> > Severity: normal
> >
> > Hello systemd maintainers,
> >
> > foreword for systemd haters:
> > I want this problem to be fixed IN systemd and not by removing systemd.
> > Move along, those are not the droids you are looking for.
> >
> > There is a thing in systemd that bothers me (and has for a while) and it
> > looks like upstream is not moving to fix this. I say BUG and FIX because
> > this is not cosmetic (i.e. a minor user experience issue): the UX might
> > affect how the user acts in order to solve the problem, and they might
> > eventually lose or damage their data because of it.
> >
> > What happens is, sometimes systemd gets into an endless loop at
> > shutdown, right when the devices are being unmounted, at the most
> > vulnerable moment. And then this BS happens, see
> > https://www.unix-ag.uni-kl.de/~bloch/part2.m4v .
> >
> > Obvious questions and events so far (and during/after the video):
> >
> >  - what does it wait for?
> >  - why don't you let me see any useful detail of those running tasks?
> >  - why do I have to wait for 90s and then it tells me: HA HA, now you
> >    wait another two minutes.
> >  - And when I waited another two minutes, again, HA HA, wait until 4:36...
> >  - or maybe until "no limit" which is blinking again and again? Waiting
> >    forever? Are you kidding me?
> >  - ok, I was pissed now and pressed Ctrl-Alt-Del, expecting it to do
> >    something useful. Now it showed me some
> >    Stopped... messages and then immediately three "Starting..."
> >    messages. And huh... Starting? Starting something? I am trying to
> >    shut down, why do you restart some sh.. instead?
> >  - Now I had enough of that sh... and used the Magic SysRq sequence to
> >    sync and reboot.
> >
> > I see some room for improvement:
> >  - give the user usable information in this case!
> >  - AND/OR tell the user how to retrieve more information. Maybe there is
> >    some "secret" shortcut to dump information (I haven't checked the
> >    docs yet but I expect upstream to be sane enough to have implemented
> >    such a thing) but that information needs to be revealed NOW. Having
> >    it in some wiki on the internet does not help.
> >  - give the user a way to interrupt this. I guess it's either a systemd
> >    bug (closed dependency loop?) or one of the outstanding tasks is
> >    blocked for some reason (might be a kernel driver issue with
> >    devmapper) but in any case, I want to be able to investigate and
> >    apply the most harmless fix (kill or ignore the hanging task). Right
> >    now I just feel stupid and it's not my fault.
> 
> So, we have 2 issues:
> 
> 1. Your system is not shutting down
> 2. Systemd is not telling you enough to discover what is wrong.
> 
> I'm afraid 2 is really an upstream issue and not an integration issue.
> Could you please file that bug upstream?

Maybe... I will give it a try tonight. However I am sceptical regarding
"productive" communication with upstream.

> We can try to help debug number 1, though, and there may be a real
> integration bug. Please see the shutdown section in the upstream
> wiki[1]; in particular, starting the debug shell before shutting down

The thing is, the issue is not reproducible. It might happen once a
month, and then not appear for many months. And you don't want to enable
all the debug machinery all the time just in case. What is needed is a
minimal control mechanism that is in place right when the problem occurs.

And when it happens for the first time, it's too late to activate the
debug shell.

> and switching to it when the shutdown hangs should let you query the
> journal to find out issues.
> 
> Note that if you enabled persistent journal logging you can maybe
> still get the info from the journal: journalctl -b -<number of boots
> since the last hang>

It says that it cannot find an ID. I checked:

$ sudo journalctl --list-boots
-1 39f59f8ebdd644f39aeb46b67eef9bff Sa 2015-07-18 09:12:05 CEST—Mo 2015-07-20 08
 0 39f59f8ebdd644f39aeb46b67eef9bff Mo 2015-07-20 08:39:43 CEST—Mo 2015-07-20 08

No idea what happened to the logs.
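
For reference, the persistent journal logging mentioned in the quoted
text is controlled by the Storage= option of journald. The fragment
below is only a minimal sketch. It assumes the stock configuration path
/etc/systemd/journald.conf, and nothing in this report says whether the
option was set on the affected system.

```
# /etc/systemd/journald.conf (assumed stock location)
[Journal]
# Keep the journal on disk under /var/log/journal so that logs from
# earlier boots can still be read after a reboot.
Storage=persistent
```

With the default Storage=auto, journald writes to disk only if the
directory /var/log/journal already exists.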

Regards,
Eduard.

> [1] http://freedesktop.org/wiki/Software/systemd/Debugging/
> 
> 
> -- 
> 
> Saludos,
> Felipe Sateler
> 

-- 
The last voice one hears before the world explodes will be the
voice of an expert saying: 'That is technically impossible!'
		-- Sir Peter Ustinov



