Bug#806291: systemd: systemctl status ignores -n argument
Zbigniew Jędrzejewski-Szmek
zbyszek at in.waw.pl
Sun Dec 20 15:16:47 GMT 2015
On Sun, Dec 20, 2015 at 02:52:49PM +0100, Eduard Bloch wrote:
> Hallo,
> * Zbigniew Jędrzejewski-Szmek [Mon, Nov 30 2015, 01:25:49AM]:
>
> > > [Service]
> > > Type=notify
> > > Restart=on-failure
> > > ExecStart=/bin/sh -c "echo OMG THEY KILLED KENNY >&2; exit 123"
> > >
> > > and check the result:
> > >
> > > foo.service - SomeThing
> > > Loaded: loaded (/lib/systemd/system/foo.service; disabled; vendor preset: enabled)
> > > Active: failed (Result: start-limit) since Sa 2015-11-28 11:28:39 CET; 13s ago
> > > Process: 6982 ExecStart=/bin/sh -c echo OMG THEY KILLED KENNY >&2; exit 123 (code=exited, status=123)
> > > Main PID: 6982 (code=exited, status=123)
> > >
> > > Nov 28 11:28:39 idefix systemd[1]: Failed to start SomeThing.
> > > Nov 28 11:28:39 idefix systemd[1]: foo.service: Unit entered failed state.
> > > Nov 28 11:28:39 idefix systemd[1]: foo.service: Failed with result 'exit-code'.
> > > Nov 28 11:28:39 idefix sh[6982]: OMG THEY KILLED KENNY
> > > Nov 28 11:28:39 idefix systemd[1]: foo.service: Service hold-off time over, scheduling restart.
> > > Nov 28 11:28:39 idefix systemd[1]: Stopped SomeThing.
> > > Nov 28 11:28:39 idefix systemd[1]: foo.service: Start request repeated too quickly.
> > > Nov 28 11:28:39 idefix systemd[1]: Failed to start SomeThing.
> > > Nov 28 11:28:39 idefix systemd[1]: foo.service: Unit entered failed state.
> > > Nov 28 11:28:39 idefix systemd[1]: foo.service: Failed with result 'start-limit'.
> > >
> > >
> > > Looks good, huh? So what is the major difference to the script in the original report?
> > > Is it maybe this?
> > > Type=notify
> >
> > There's a race for short-lived programs - if the process has exited by
> > the time journald looks the sender up in /proc, it will not be able to
> > assign the message the right systemd unit. This means that the message
> > will be in logs, but it will not be shown by
> > 'systemctl status unit'/'journalctl -u unit'. This race is currently
> > unfixable, until kernel provides additional functionality to attach more
> > metadata to messages.
>
> OMG. I was assuming that sd_notify is a mechanism designed to avoid such
> kind of BS. Is there a public kernel bug report anywhere to check for
> more details?
https://bugzilla.redhat.com/show_bug.cgi?id=963620, links to lkml discussions
start at comment 33.
kdbus would also solve this, you can easily search for that.
> Can I flush the log processing somehow from inside the unit process
> itself?
Not that I'm aware of.
> > > But anyhow, it's a secondary issue and probably deserves a second bug report.
> > > The thing that botters me, see subject, is the broken -n option, adding -n20 or
> > > -n2000 to systemctl-status call does not change anything. It keeps displaying 10
> > > lines.
> > That's really strange.
> > What is the exact command line that you are using?
> > Does 'systemctl -n0 status ...' show any log lines?
>
> Basically any command. Let's pick the next example with useful logs from
> "journalctl -b0" output...
>
> $ systemctl status virtualbox
> ● virtualbox.service - LSB: VirtualBox Linux kernel module
> Loaded: loaded (/etc/init.d/virtualbox; bad; vendor preset: enabled)
> Active: failed (Result: exit-code) since Sa 2015-12-19 21:11:20 CET; 17h ago
> Docs: man:systemd-sysv-generator(8)
> Process: 1312 ExecStart=/etc/init.d/virtualbox start (code=exited, status=1/FAILURE)
> $ systemctl status -n200 virtualbox
> ● virtualbox.service - LSB: VirtualBox Linux kernel module
> Loaded: loaded (/etc/init.d/virtualbox; bad; vendor preset: enabled)
> Active: failed (Result: exit-code) since Sa 2015-12-19 21:11:20 CET; 17h ago
> Docs: man:systemd-sysv-generator(8)
> Process: 1312 ExecStart=/etc/init.d/virtualbox start (code=exited, status=1/FAILURE)
> $ journalctl -b0 -u virtualbox
> Hint: You are currently not seeing messages from other users and the system.
> Users in the 'systemd-journal' group can see all messages. Pass -q to
> turn off this notice.
> No journal files were opened due to insufficient permissions.
>
> (same thing as root)
> $ journalctl -b0 -u virtualbox
> -- Logs begin at Mo 2015-07-27 19:58:37 CEST, end at So 2015-12-20 14:46:29 CET. --
> Dez 19 21:11:19 zombie systemd[1]: Starting LSB: VirtualBox Linux kernel module...
> Dez 19 21:11:20 zombie virtualbox[1312]: Starting VirtualBox kernel modulesNo suitable module for running kernel found ... failed!
> Dez 19 21:11:20 zombie virtualbox[1312]: failed!
> Dez 19 21:11:20 zombie systemd[1]: virtualbox.service: Control process exited, code=exited status=1
> Dez 19 21:11:20 zombie systemd[1]: Failed to start LSB: VirtualBox Linux kernel module.
> Dez 19 21:11:20 zombie systemd[1]: virtualbox.service: Unit entered failed state.
> Dez 19 21:11:20 zombie systemd[1]: virtualbox.service: Failed with result 'exit-code'.
>
> Ok, only NOW we see the cause. The only chance to get the information
> about the problem is to become root and that information explicitly from
> the system journal. And -n was not effective in this context.
But this is a different than what you were describing. You said that you always
get 10 lines, and -n is ignored. But getting 0 lines is a different issue.
It seems that for some reason you don't have permissions to access the right
files. Please check which files that are (when running as root), and then
see why you cannot read them: is your use in the system-journal group, are
the files in /run/log or in /var/log, are the acls set on the journal files, etc.
Zbyszek
More information about the Pkg-systemd-maintainers
mailing list