Bug#840056: shibboleth-sp2-utils: upgrade attempt of shibboleth-sp2-utils gets hung at restart of shibd service
banerian at u.washington.edu
Tue Oct 11 18:03:10 UTC 2016
On 10/11/2016 03:22 AM, Ferenc Wágner wrote:
> "S. Banerian" <banerian at u.washington.edu> writes:
>> On 10/09/2016 05:25 PM, Ferenc Wágner wrote:
>>> "S. Banerian" <banerian at u.washington.edu> writes:
>>>> On 10/07/2016 02:04 PM, Ferenc Wágner wrote:
>>>>> Could you please make sure shibd isn't running
>>>>> then show me the output of
>>>>> # sudo -u _shibd strace shibd -f -F
>> after some 12 hours of trying to start, failing, it finally started,
>> created shibd.sock, and under a test, worked.
> Was this the doing of a single invocation of the above, or do you refer
> to systemd continuously trying to restart it and succeeding eventually?
this was systemd continually trying. i ensured no spurious shibd procs
>>> Can you provide a full GDB backtrace (after installing
>>> shibboleth-sp2-utils-dbgsym; please yell if you need precise
>> does not appear to be in stretch. so i need the instructions.
> It is in a separate archive, see
> https://wiki.debian.org/AutomaticDebugPackages. But let's exclude the
> simple timeout problem beforehand.
>>>> Note: prior to the upgrade, shibboleth was working.
>>> Which version of shibboleth was working for you?
>> the version just prior to this one 2.6.0+dfsg1-3+b1 on stretch.
> Do you mean 2.5.6+dfsg1-2? Your dpkg or apt logs should reveal the
> upgraded version.
>>> Can you share your shibboleth2.xml?
>> I'm a bit reluctant to provide some of the information in the
>> RequestMapper sections.
> If configuring a longer timeout (below) does not help, please check if
> you can reproduce the issue without the sensitive parts.
>> When I force a restart, systemctl restart shibd.service I get the issue
>> as before, where
>> \_ /bin/systemd-tty-ask-password-agent --watch
>> stays there for a looong time, and is not returning, systemctl says it
>> is started, but journalctl -xe gives:
>> Oct 10 14:00:35 epics systemd: shibd.service: Killing process 30980
>> (shibd) with signal SIGKILL.
>> Oct 10 14:00:35 epics systemd: shibd.service: Main process exited,
>> code=killed, status=9/KILL
>> Oct 10 14:00:35 epics systemd: Failed to start Shibboleth Service
>> Provider Daemon.
>> -- Subject: Unit shibd.service has failed
> This really does not make much sense together... And I can't see any
> systemd-tty-ask-password-agent processes at all for some reason.
we agree. no reason to be seeing this.
>> there is a shibd -f -F process running, but no shibd.sock file
> Are you sure that process isn't from some manual start attempt? Also,
> if you start an instance manually while systemd's still trying to
> occasionally restart shibd in the background, the socket may get lost.
> So, first of all, tell systemd to stop shibd and wait for it:
> # systemctl stop shibd
> Then you should see something like:
> # systemctl status shibd
> Active: inactive (dead) [...]
> Main PID: 360 (code=exited, status=0/SUCCESS)
> Oct 11 11:34:39 elm systemd: Stopped Shibboleth Service Provider Daemon.
actually, after doing that, I got:
systemctl status shibd.service
● shibd.service - Shibboleth Service Provider Daemon
Loaded: loaded (/lib/systemd/system/shibd.service; disabled; vendor
Active: inactive (dead)
Oct 11 10:35:18 epics systemd: Stopped Shibboleth Service Provider
Oct 11 10:35:18 epics systemd: Starting Shibboleth Service Provider
Oct 11 10:36:48 epics systemd: shibd.service: Start operation timed
Oct 11 10:36:54 epics systemd: shibd.service: State
'stop-final-sigterm' timed out. Killing.
Oct 11 10:36:54 epics systemd: shibd.service: Killing process 5523
(shibd) with signal SIGKILL.
Oct 11 10:36:54 epics systemd: shibd.service: Main process exited,
Oct 11 10:36:54 epics systemd: Failed to start Shibboleth Service
Oct 11 10:36:54 epics systemd: shibd.service: Unit entered failed state.
Oct 11 10:36:54 epics systemd: shibd.service: Failed with result
Oct 11 10:36:58 epics systemd: Stopped Shibboleth Service Provider
> Then start it manually:
> # date; sudo -u _shibd /usr/sbin/shibd -f -F
> Meanwhile check /var/log/shibboleth/shibd.log for progress; the
> timestamps should tell you where time was spent.
Did this, and after a while, it started.
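
For reference, the places where startup time is spent can be located by looking for large gaps between consecutive log timestamps. The following is only a minimal sketch: it assumes every relevant line in shibd.log begins with a "YYYY-MM-DD HH:MM:SS" timestamp (check your logger configuration), and the function name log_gaps is hypothetical, not part of any Shibboleth tooling.

```shell
#!/bin/sh
# log_gaps FILE [MIN_GAP]
# Print each timestamped line that was preceded by a pause of at least
# MIN_GAP seconds (default 10). Lines without a leading
# "YYYY-MM-DD HH:MM:SS" timestamp are ignored (assumed log layout).
log_gaps() {
    awk -v min="${2:-10}" '
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
            split($2, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
            if (seen) {
                gap = now - prev
                if (gap < 0) gap += 86400          # crossed midnight (gaps under 24h assumed)
                if (gap >= min) printf "%d s before: %s\n", gap, $0
            }
            prev = now
            seen = 1
        }
    ' "$1"
}
```

It could be called as `log_gaps /var/log/shibboleth/shibd.log 30` while or after shibd starts; the lines it prints are the first ones logged after each long pause.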
>> I'm not convinced that systemd is behaving well.
> Maybe it is, just the default start timeout (90s) is too short for your
> metadata setup. Try setting it longer like:
> # mkdir /etc/systemd/system/shibd.service.d
> # printf '[Service]\nTimeoutStartSec=5min\n' >/etc/systemd/system/shibd.service.d/timeout.conf
> # systemctl daemon-reload
> # systemctl cat shibd
> [you should see the result at the end of the output]
> Make sure to Ctrl-C your manually started shibd process if it's still
> running before starting the systemd shibd service.
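
The drop-in steps quoted above can also be wrapped in a small helper that is safe to re-run. This is a sketch under stated assumptions: the function name write_timeout_dropin and its optional third argument (an alternative root directory, handy for trying it out somewhere other than /etc) are hypothetical and introduced only for illustration.

```shell
#!/bin/sh
# write_timeout_dropin UNIT TIMEOUT [ROOT]
# Create ROOT/UNIT.d/timeout.conf setting TimeoutStartSec=TIMEOUT.
# ROOT defaults to /etc/systemd/system; the ROOT argument is an
# illustrative addition, not something systemd itself provides.
write_timeout_dropin() {
    unit=$1
    timeout=$2
    root=${3:-/etc/systemd/system}
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1     # unlike plain mkdir, succeeds if it already exists
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" > "$dir/timeout.conf"
}
```

As root, `write_timeout_dropin shibd.service 5min` followed by `systemctl daemon-reload` should have the same effect as the commands above; `systemctl show -p TimeoutStartUSec shibd` can then be used to confirm the value systemd actually picked up.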
>> with the attempt to perform
>> systemctl restart shibd.service
>> I'm now seeing the CPU at 100% and memory (but not yet swap) near 100% also.
>> and no shibd.sock.
> Yes, the startup phase of shibd can consume lots of resources (Dynamic
> MetadataProvider can help with this). And the default timeout changed
> from 5min to 1.5min in this upgrade, which might cause your problems.
After adding the timeout.conf file and running systemctl daemon-reload
and then systemctl start shibd, the shibd process started after
approximately two minutes. I was then able to use apache2 normally.
I was also able to systemctl stop shibd and start it again normally;
after two minutes or so, it was running.
I have been able to reproduce this now. Two minutes seems to be the
UW Clinical Cyclotron www.uwmcf.org
UW School of Medicine
UW Box 356043
gpg key 6642E7EE
fingerprint = BD13 875D 2D03 5E1D 1E3B 8BF7 F4B8 63AD 6642 E7EE