Bug#770135: Possible cause?
meijersebastiaan at gmail.com
Fri Sep 9 22:00:16 BST 2016
Okay, CentOS user here, please don't kill me right away.
I believe I'm experiencing this same bug on CentOS 7. After months of
bashing my head into the wall, during which I switched servers, server
providers and even moved from bare metal to virtualized, I think I've
found the cause, as well as a workaround.
First off, at least on my systems, the issue seems to occur on servers with
lots of users and/or logins/login attempts.
In my case I have about 20 users running 2 cron jobs each every minute,
resulting in ~40 simultaneous logins every minute.
I heard from someone else that they experienced the same problem when
receiving many external (unsuccessful?) login attempts.
For me, the bug seems to be triggered shortly after systemd is updated.
When it happens, systemd-logind appears to end up in a restart loop, which
slows down logins and consumes 100% CPU while it initializes.
It used to be sufficient to restart a load of services, as mentioned
earlier in this thread, but eventually this stopped working too.
Looking deeper into the logs, this line stands out:
systemd-logind.service watchdog timeout (limit 1min)!
It seems the restart loops are caused by the systemd watchdog killing
systemd-logind because it takes too long to start.
I've increased the WatchdogSec limit in the systemd-logind.service file to
5 minutes and reloaded/restarted systemd-logind. It consumed 100% CPU for
slightly over a minute and then started behaving as it should again.
Not sure why it would take over a minute to start, but at least for me this
allows me to work around the issue.
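For anyone wanting to try the same workaround, here is a rough sketch of
what the change could look like. Note that I edited the
systemd-logind.service file itself; the drop-in file below is an
alternative I'm assuming is equivalent, and the file name override.conf is
just an example. The advantage of a drop-in would be that changes made
directly to the packaged unit file can be overwritten by a systemd update.

```ini
# /etc/systemd/system/systemd-logind.service.d/override.conf
# Hypothetical drop-in: raises the watchdog limit from the 1min seen in
# the log line above to the 5 minutes described in this post.
[Service]
WatchdogSec=5min
```

After creating the file, `systemctl daemon-reload` followed by
`systemctl restart systemd-logind` should pick up the new limit.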
I don't know for sure if this is the same issue as on Debian, but as this
thread seems to have come to a standstill with no fix since 2014, and as it
helped me pinpoint the issue, I thought I'd share my findings here.