Bug#813879: systemd: Assertion 's->exec_command[SERVICE_EXEC_START]' failed service_enter_start()

Yuriy M. Kaminskiy yumkam at gmail.com
Mon Feb 8 15:18:20 GMT 2016


On 08.02.2016 02:15, Yuriy M. Kaminskiy wrote:
>> Package: systemd
>> Version: 215-17+deb8u3
>> Severity: important
>>
> Probably related:
> cron-update.service is triggered by changes in some /etc/cron* directories
> and invokes `systemctl daemon-reload` and `systemctl try-restart
> cron.target`. Maybe there is a race when it is triggered right
> when cron.target is being stopped?
>
> Probably related upstream commit:
> 96fb8242cc1ef6b0e28f6c86a4f57950095dd7f1
> (aka v216-30-g96fb824); however, it likely fixes only the symptoms [assert()
> and abort], not the underlying issue [a race or whatever].

I've looked at the core file, and after musing a bit over the sources, I don't
think this commit will fix or hide the issue.

Backtrace:

#6  0x00007f08e081124f in service_enter_start (s=s@entry=0x7f08e21c7a10)
     at ../src/core/service.c:1312
#7  0x00007f08e0813341 in service_sigchld_event.lto_priv.377 (u=0x7f08e21c7a10,
     pid=<optimized out>, code=<optimized out>, status=0)
     at ../src/core/service.c:2338
#8  0x00007f08e084b887 in manager_dispatch_sigchld (m=0x7f08e20fc350)
     at ../src/core/manager.c:1639

(gdb) p s->type
$14 = _SERVICE_TYPE_INVALID
(gdb) p s->state
$15 = SERVICE_START_PRE
(gdb)  p s->meta.load_state
$16 = UNIT_NOT_FOUND
(gdb) p s->exec_command
$18 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0}

The problem is that we started executing the unit and spawned its
ExecStartPre command; then the unit file was removed and systemctl
daemon-reload was issued, so the unit structure became a half-ghost; then
we got SIGCHLD for that ExecStartPre command from the already-removed
unit. Oops.
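
The sequence above can be sketched as a toy state model. This is only an
illustration: every Toy*/toy_* name is hypothetical, the types are not
systemd's real definitions, and the guard at the end is an assumption about
what a fix could check, not the content of the attached patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the fields printed in gdb; not systemd's real types. */
typedef enum { TOY_LOADED, TOY_NOT_FOUND } ToyLoadState;
typedef enum { TOY_DEAD, TOY_START_PRE, TOY_START } ToyState;

typedef struct {
    ToyLoadState load_state;
    ToyState state;
    const char *exec_start_pre; /* models exec_command[SERVICE_EXEC_START_PRE] */
    const char *exec_start;     /* NULL models exec_command[SERVICE_EXEC_START] == 0x0 */
} ToyService;

/* The unit is started: the ExecStartPre child is spawned. */
static void toy_start(ToyService *s) {
    s->state = TOY_START_PRE;
}

/* daemon-reload after the unit file was removed: the configuration is
 * dropped, but the runtime state is left as it was (as the dump suggests). */
static void toy_reload_after_removal(ToyService *s) {
    s->load_state = TOY_NOT_FOUND;
    s->exec_start_pre = NULL;
    s->exec_start = NULL;
}

/* True if a SIGCHLD arriving now would move on to ExecStart with nothing to run. */
static bool toy_sigchld_hits_missing_start(const ToyService *s) {
    return s->state == TOY_START_PRE && s->exec_start == NULL;
}

/* Hypothetical guard: continue the start sequence only for a loaded unit. */
static bool toy_may_continue_start(const ToyService *s) {
    return s->load_state == TOY_LOADED && s->exec_start != NULL;
}
```

Calling toy_start() and then toy_reload_after_removal() on a loaded unit
leaves it in the START_PRE state with no commands, which matches the values
printed from the core file.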

With 96fb824 applied, the end result would be the same:

@@ -1332,6 +1345,12 @@ static void service_enter_start(Service *s) {
                  c = s->main_command = s->exec_command[SERVICE_EXEC_START];
          }

+        if (!c) {
+                assert(s->type == SERVICE_ONESHOT);
+                service_enter_start_post(s);
+                return;
+        }
+

c is NULL and s->type here is _SERVICE_TYPE_INVALID, so we'll die in the
assert anyway :-\
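
To make that concrete, here is a minimal model of the decision the hunk adds.
It is a sketch only: ToyType, ToyOutcome and toy_enter_start_outcome() are
made-up names that mirror the quoted condition, not systemd code, and the
model ignores everything else service_enter_start() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the service type values. */
typedef enum {
    TOY_TYPE_INVALID = -1,  /* models _SERVICE_TYPE_INVALID */
    TOY_TYPE_SIMPLE,
    TOY_TYPE_ONESHOT        /* models SERVICE_ONESHOT */
} ToyType;

typedef enum {
    TOY_RUN_COMMAND,         /* c != NULL: the start command is spawned */
    TOY_SKIP_TO_START_POST,  /* c == NULL and type is oneshot */
    TOY_ASSERT_FAILS         /* c == NULL and type is anything else */
} ToyOutcome;

/* Mirrors the quoted hunk: with no command to run, only a oneshot
 * service passes the assert and goes on to the start-post step. */
static ToyOutcome toy_enter_start_outcome(ToyType type, const char *c) {
    if (c != NULL)
        return TOY_RUN_COMMAND;
    if (type == TOY_TYPE_ONESHOT)
        return TOY_SKIP_TO_START_POST;
    return TOY_ASSERT_FAILS;
}
```

With the values from the core file (invalid type, no command),
toy_enter_start_outcome(TOY_TYPE_INVALID, NULL) yields TOY_ASSERT_FAILS.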

It is possible that the upstream systemd version is still affected; you may
want to try installing jessie's systemd-cron 1.3.* into sid and
installing/removing it in a loop.
Completely untested patches for systemd master, and a backport to v215, are
attached.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Fix-sigchld-handling-for-invalid-unit.patch
Type: text/x-patch
Size: 1202 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/attachments/20160208/5d029016/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: backport-v215-Fix-sigchld-handling-for-invalid-unit.patch
Type: text/x-patch
Size: 1277 bytes
Desc: not available
URL: <http://alioth-lists.debian.net/pipermail/pkg-systemd-maintainers/attachments/20160208/5d029016/attachment-0005.bin>

