Bug#805623: systemd: weird behaviour using 'Restart=on-watchdog'
Arturo Borrero Gonzalez
arturo.borrero.glez at gmail.com
Fri Nov 20 11:39:31 GMT 2015
Package: systemd
Version: 227-2
Severity: normal
Dear systemd maintainers,
thanks for your work with this package, it's really appreciated.
I've noticed some inconsistend behaviour when a service is using
Restart=on-watchdog.
TL;DR: systemd restart the daemon but no more than 3 times.
The behaviour described below is unexpected and I was unable to find
any documentation about it.
====== 8< ======
[Unit]
Description=Conntrack Daemon
[Service]
Type=notify
NotifyAccess=all
EnvironmentFile=-/usr/share/conntrackd/environmentfile
EnvironmentFile=-/etc/default/conntrackd
ExecStartPre=-/bin/rm -f /var/lock/conntrackd.lock
ExecStart=/usr/sbin/conntrackd -d -C "$CONFIG" $OPTIONS
ProtectSystem=full
ProtectHome=true
WatchdogSec=10
Restart=on-watchdog
[Install]
WantedBy=multi-user.target
====== 8< ======
This is how I trigger this situation.
Get more verbose information:
% sudo systemd-analyze set-log-level debug
I start the service by:
% sudo systemctl start conntrackd.service
At some point, I will manually stop the daemon:
% sudo kill -SIGSTOP <pid>
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Trying to
enqueue job conntrackd.service/start/replace
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Installed new
job conntrackd.service/start as 2065
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Enqueued job
conntrackd.service/start as 2065
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: About to
execute: /bin/rm -f /var/lock/conntrackd.lock
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Forked /bin/rm as 7847
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Changed failed
-> start-pre
nov 20 12:15:12 cfdev02 systemd[1]: Starting Conntrack Daemon...
nov 20 12:15:12 cfdev02 systemd[7847]: conntrackd.service: Executing:
/bin/rm -f /var/lock/conntrackd.lock
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Child 7847
belongs to conntrackd.service
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Control
process exited, code=exited status=0
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Got final
SIGCHLD for state start-pre.
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: About to
execute: /usr/sbin/conntrackd -d -C $CONFIG $OPTIONS
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Forked
/usr/sbin/conntrackd as 7850
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Changed
start-pre -> start
nov 20 12:15:12 cfdev02 systemd[7850]: conntrackd.service: Executing:
/usr/sbin/conntrackd -d -C /etc/conntrackd/conntrackd.conf
nov 20 12:15:12 cfdev02 conntrack-tools[7850]: disabling internal cache
nov 20 12:15:12 cfdev02 conntrack-tools[7850]: disabling external cache
nov 20 12:15:12 cfdev02 conntrack-tools[7850]: using kernel-space
event filtering
nov 20 12:15:12 cfdev02 conntrack-tools[7850]: netlink event socket
buffer size has been set to 8000000 bytes
nov 20 12:15:12 cfdev02 conntrack-tools[7850]: initialization completed
nov 20 12:15:12 cfdev02 conntrack-tools[7875]: -- starting in daemon mode --
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7850 (MAINPID=7875)
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Supervising
process 7875 which is not our child. We'll most likely not notice when
it exits.
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7875 (READY=1)
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Changed start -> running
nov 20 12:15:12 cfdev02 systemd[1]: conntrackd.service: Job
conntrackd.service/start finished, result=done
nov 20 12:15:12 cfdev02 systemd[1]: Started Conntrack Daemon.
nov 20 12:15:17 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7875 (WATCHDOG=1)
nov 20 12:15:19 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7875 (WATCHDOG=1)
nov 20 12:15:24 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7875 (WATCHDOG=1)
[ 1) send SIGSTOP to the daemon ]
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Watchdog
timeout (limit 10s)!
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Changed
running -> failed
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Unit entered
failed state.
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Failed with
result 'watchdog'.
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Changed failed
-> auto-restart
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: cgroup is empty
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Service
hold-off time over, scheduling restart.
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Trying to
enqueue job conntrackd.service/restart/fail
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Installed new
job conntrackd.service/restart as 2172
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Enqueued job
conntrackd.service/restart as 2172
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Scheduled restart job.
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Changed
auto-restart -> dead
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Job
conntrackd.service/restart finished, result=done
nov 20 12:15:34 cfdev02 systemd[1]: Stopped Conntrack Daemon.
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Converting job
conntrackd.service/restart -> conntrackd.service/start
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: About to
execute: /bin/rm -f /var/lock/conntrackd.lock
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Forked /bin/rm as 7931
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Changed dead
-> start-pre
nov 20 12:15:34 cfdev02 systemd[1]: Starting Conntrack Daemon...
nov 20 12:15:34 cfdev02 systemd[7931]: conntrackd.service: Executing:
/bin/rm -f /var/lock/conntrackd.lock
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Child 7931
belongs to conntrackd.service
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Control
process exited, code=exited status=0
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Got final
SIGCHLD for state start-pre.
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: About to
execute: /usr/sbin/conntrackd -d -C $CONFIG $OPTIONS
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Forked
/usr/sbin/conntrackd as 7933
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Changed
start-pre -> start
nov 20 12:15:34 cfdev02 systemd[7933]: conntrackd.service: Executing:
/usr/sbin/conntrackd -d -C /etc/conntrackd/conntrackd.conf
nov 20 12:15:34 cfdev02 conntrack-tools[7933]: disabling internal cache
nov 20 12:15:34 cfdev02 conntrack-tools[7933]: disabling external cache
nov 20 12:15:34 cfdev02 conntrack-tools[7933]: using kernel-space
event filtering
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7933 (MAINPID=7959)
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Supervising
process 7959 which is not our child. We'll most likely not notice when
it exits.
nov 20 12:15:34 cfdev02 conntrack-tools[7959]: -- starting in daemon mode --
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7959 (READY=1)
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Changed start -> running
nov 20 12:15:34 cfdev02 systemd[1]: conntrackd.service: Job
conntrackd.service/start finished, result=done
nov 20 12:15:34 cfdev02 systemd[1]: Started Conntrack Daemon.
nov 20 12:15:38 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7959 (WATCHDOG=1)
nov 20 12:15:40 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7959 (WATCHDOG=1)
[ 2) send SIGSTOP to the daemon ]
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Watchdog
timeout (limit 10s)!
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Changed
running -> failed
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Unit entered
failed state.
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Failed with
result 'watchdog'.
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Changed failed
-> auto-restart
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: cgroup is empty
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Service
hold-off time over, scheduling restart.
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Trying to
enqueue job conntrackd.service/restart/fail
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Installed new
job conntrackd.service/restart as 2279
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Enqueued job
conntrackd.service/restart as 2279
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Scheduled restart job.
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Changed
auto-restart -> dead
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Job
conntrackd.service/restart finished, result=done
nov 20 12:15:50 cfdev02 systemd[1]: Stopped Conntrack Daemon.
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Converting job
conntrackd.service/restart -> conntrackd.service/start
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: About to
execute: /bin/rm -f /var/lock/conntrackd.lock
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Forked /bin/rm as 7989
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Changed dead
-> start-pre
nov 20 12:15:50 cfdev02 systemd[1]: Starting Conntrack Daemon...
nov 20 12:15:50 cfdev02 systemd[7989]: conntrackd.service: Executing:
/bin/rm -f /var/lock/conntrackd.lock
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Child 7989
belongs to conntrackd.service
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Control
process exited, code=exited status=0
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Got final
SIGCHLD for state start-pre.
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: About to
execute: /usr/sbin/conntrackd -d -C $CONFIG $OPTIONS
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Forked
/usr/sbin/conntrackd as 7991
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Changed
start-pre -> start
nov 20 12:15:50 cfdev02 systemd[7991]: conntrackd.service: Executing:
/usr/sbin/conntrackd -d -C /etc/conntrackd/conntrackd.conf
nov 20 12:15:50 cfdev02 conntrack-tools[7991]: disabling internal cache
nov 20 12:15:50 cfdev02 conntrack-tools[7991]: disabling external cache
nov 20 12:15:50 cfdev02 conntrack-tools[7991]: using kernel-space
event filtering
nov 20 12:15:50 cfdev02 conntrack-tools[7991]: netlink event socket
buffer size has been set to 8000000 bytes
nov 20 12:15:50 cfdev02 conntrack-tools[7991]: initialization completed
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 7991 (MAINPID=8017)
nov 20 12:15:50 cfdev02 conntrack-tools[8017]: -- starting in daemon mode --
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Supervising
process 8017 which is not our child. We'll most likely not notice when
it exits.
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 8017 (READY=1)
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Changed start -> running
nov 20 12:15:50 cfdev02 systemd[1]: conntrackd.service: Job
conntrackd.service/start finished, result=done
nov 20 12:15:50 cfdev02 systemd[1]: Started Conntrack Daemon.
nov 20 12:15:55 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 8017 (WATCHDOG=1)
nov 20 12:16:00 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 8017 (WATCHDOG=1)
[ 3) send SIGSTOP ]
nov 20 12:16:10 cfdev02 systemd[1]: conntrackd.service: Watchdog
timeout (limit 10s)!
nov 20 12:16:10 cfdev02 systemd[1]: conntrackd.service: Changed
running -> failed
nov 20 12:16:10 cfdev02 systemd[1]: conntrackd.service: Unit entered
failed state.
nov 20 12:16:10 cfdev02 systemd[1]: conntrackd.service: Failed with
result 'watchdog'.
nov 20 12:16:10 cfdev02 systemd[1]: conntrackd.service: Changed failed
-> auto-restart
nov 20 12:16:10 cfdev02 systemd[1]: conntrackd.service: cgroup is empty
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Service
hold-off time over, scheduling restart.
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Trying to
enqueue job conntrackd.service/restart/fail
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Installed new
job conntrackd.service/restart as 2386
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Enqueued job
conntrackd.service/restart as 2386
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Scheduled restart job.
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Changed
auto-restart -> dead
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Job
conntrackd.service/restart finished, result=done
nov 20 12:16:11 cfdev02 systemd[1]: Stopped Conntrack Daemon.
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Converting job
conntrackd.service/restart -> conntrackd.service/start
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: About to
execute: /bin/rm -f /var/lock/conntrackd.lock
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Forked /bin/rm as 8047
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Changed dead
-> start-pre
nov 20 12:16:11 cfdev02 systemd[1]: Starting Conntrack Daemon...
nov 20 12:16:11 cfdev02 systemd[8047]: conntrackd.service: Executing:
/bin/rm -f /var/lock/conntrackd.lock
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Child 8047
belongs to conntrackd.service
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Control
process exited, code=exited status=0
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Got final
SIGCHLD for state start-pre.
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: About to
execute: /usr/sbin/conntrackd -d -C $CONFIG $OPTIONS
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Forked
/usr/sbin/conntrackd as 8049
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Changed
start-pre -> start
nov 20 12:16:11 cfdev02 systemd[8049]: conntrackd.service: Executing:
/usr/sbin/conntrackd -d -C /etc/conntrackd/conntrackd.conf
nov 20 12:16:11 cfdev02 conntrack-tools[8049]: disabling internal cache
nov 20 12:16:11 cfdev02 conntrack-tools[8049]: disabling external cache
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 8049 (MAINPID=8075)
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 8075 (READY=1)
nov 20 12:16:11 cfdev02 conntrack-tools[8075]: -- starting in daemon mode --
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Changed start -> running
nov 20 12:16:11 cfdev02 systemd[1]: conntrackd.service: Job
conntrackd.service/start finished, result=done
nov 20 12:16:11 cfdev02 systemd[1]: Started Conntrack Daemon.
nov 20 12:16:16 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 8075 (WATCHDOG=1)
nov 20 12:16:21 cfdev02 systemd[1]: conntrackd.service: Got
notification message from PID 8075 (WATCHDOG=1)
[ 4) send SIGSTOP ]
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: Watchdog
timeout (limit 10s)!
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: Changed
running -> stop-sigabrt
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: Child 8075
belongs to conntrackd.service
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: Main process
exited, code=killed, status=6/ABRT
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: Changed
stop-sigabrt -> failed
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: Unit entered
failed state.
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: Failed with
result 'signal'.
nov 20 12:16:31 cfdev02 systemd[1]: conntrackd.service: cgroup is empty
-- Package-specific info:
-- System Information:
Debian Release: stretch/sid
APT prefers testing
APT policy: (500, 'testing')
Architecture: amd64 (x86_64)
Kernel: Linux 4.2.0-1-amd64 (SMP w/8 CPU cores)
Locale: LANG=es_ES.UTF-8, LC_CTYPE=es_ES.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages systemd depends on:
ii adduser 3.113+nmu3
ii libacl1 2.2.52-2
ii libapparmor1 2.10-2+b1
ii libaudit1 1:2.4.4-4
ii libblkid1 2.27.1-1
ii libc6 2.19-22
ii libcap2 1:2.24-12
ii libcap2-bin 1:2.24-12
ii libcryptsetup4 2:1.6.6-5
ii libgcrypt20 1.6.4-3
ii libkmod2 21-1
ii liblzma5 5.1.1alpha+20120614-2.1
ii libmount1 2.27.1-1
ii libpam0g 1.1.8-3.1
ii libseccomp2 2.2.3-2
ii libselinux1 2.3-2+b1
ii libsystemd0 227-2
ii mount 2.27.1-1
ii sysv-rc 2.88dsf-59.2
ii udev 227-2
ii util-linux 2.27.1-1
Versions of packages systemd recommends:
ii dbus 1.10.2-1
pn libpam-systemd <none>
Versions of packages systemd suggests:
pn systemd-container <none>
pn systemd-ui <none>
-- no debconf information
--
Arturo Borrero González
More information about the Pkg-systemd-maintainers
mailing list