[Pkg-sysvinit-devel] Bug#742822: Bug#713135: startpar-bridge causes rc to hang with a variety of job types and situations

Steve Langasek vorlon at debian.org
Sun Mar 30 20:22:28 UTC 2014


Hi Cameron,

On Sun, Mar 30, 2014 at 05:56:01PM -0007, Cameron Norman wrote:
> > Problem probably is in startpar-bridge. Since last mail also
> > samba-ad-dc service/job is affected. Both services upstart starts
> > properly:

> Upstart starts those jobs properly *according to Upstart*. The
> problem here is that startpar has a different definition of starting
> properly, and a lot of different service semantics.

> > But both jobs ends quickly after start what probably causes
> > startpar waits like this:

> Exactly. binfmt-support is an apparent incompatibility. In Upstart,
> binfmt-support is a task. It runs once and is considered to be
> successfully run by Upstart. startpar has something different to
> say. Although startpar sees that the job has been run, it ignores
> that part because the job has also been stopped. So it considers the
> fact that the job is last stopped to mean that services that start
> after binfmt-support must not be started, when in reality, they can
> do so easily.

I don't think this is accurate.  The problem is almost certainly that the
startpar-bridge events for binfmt-support are triggering while startpar is
not running, causing startpar itself to be unaware that the binfmt-support
job has ever started.

The startpar-bridge exists to pass information to startpar whenever an
upstart job starts or stops.  But when it starts up, startpar itself reads
the current state of the upstart jobs so that it has an accurate view of
those jobs that are already started or already stopped.  The problem is
that, with a 'task' job, "already started" is indistinguishable from
"already stopped".  And since binfmt-support is 'start on filesystem', at
best it races the initialization of startpar; at worst it always completes
before startpar because you have network devices that are slow to initialize
(holding up rc but not holding up binfmt-support).

So the fix is to make the binfmt-support job not a task.
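
To illustrate why a 'task' job is a problem for an observer that starts late,
here is a toy model in Python.  It is only a sketch: the status strings mimic
upstart's "goal/state" output, but the function names are hypothetical and
none of this is startpar's or upstart's actual code.  It assumes that a
finished task reports the same status as a job that never ran, and that a
non-task job stays "running" once started.

```python
# Toy model only; names are hypothetical, not startpar or upstart code.

def status_at_snapshot(is_task: bool, has_run: bool) -> str:
    """Status a late-starting observer would read for a job."""
    if not has_run:
        return "stop/waiting"   # job was never started
    if is_task:
        return "stop/waiting"   # task already finished: looks stopped again
    return "start/running"      # non-task job remains in the running state


def observer_thinks_started(status: str) -> bool:
    """The observer can only go by the status it reads at startup."""
    return status == "start/running"


# A task that already completed looks exactly like one that never ran.
finished_task = status_at_snapshot(is_task=True, has_run=True)
never_run_task = status_at_snapshot(is_task=True, has_run=False)
print(finished_task == never_run_task)

# The same job declared as a non-task is visible to the late observer.
print(observer_thinks_started(status_at_snapshot(is_task=False, has_run=True)))
```

Under these assumptions, the observer has no way to tell the two task cases
apart, which is why dropping 'task' from the job makes the earlier start
visible.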

> Another situation is where the daemon detects that it is not enabled
> and exits early. This is what I think is happening in samba-ad-dc.
> Although the startpar started init script would do the same thing,
> startpar ignores the init script's stop because that is just how
> startpar works!

What does startpar do when an init script exits non-zero (indicating that
the service did not start successfully)?  That's effectively the error
condition we're talking about here.  If some other init script has a
dependency on samba-ad-dc, and samba-ad-dc doesn't start, I think the only
correct thing to do is to treat that as a "permanent failure" to satisfy the
dependency and refuse to start any of the services that depend on it.  But
this doesn't seem to be implemented in startpar.
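
The "permanent failure" behaviour described above can be sketched roughly as
follows.  This is a hypothetical illustration in Python, not something
startpar implements: `start_all`, the `run` callback and the service name
"hypothetical-dependent" are all made up for the example, and the dependency
graph is assumed to be acyclic.

```python
# Hypothetical sketch; startpar does not currently behave this way.

def start_all(deps, run):
    """Start services in dependency order.

    deps: dict mapping each service to the services it depends on.
    run:  callable taking a service name and returning its exit status.
    Returns a dict mapping each service to True (started) or False.
    """
    ok = {}

    def start(name):
        if name not in ok:
            # Attempt every dependency first (list, so no short-circuit).
            deps_ok = all([start(d) for d in deps.get(name, ())])
            # A failed dependency is permanent: never run the dependent.
            ok[name] = deps_ok and run(name) == 0
        return ok[name]

    for name in deps:
        start(name)
    return ok


deps = {
    "samba-ad-dc": [],
    "hypothetical-dependent": ["samba-ad-dc"],
    "cron": [],
}
attempted = []


def run(name):
    attempted.append(name)
    return 1 if name == "samba-ad-dc" else 0   # pretend samba-ad-dc fails


result = start_all(deps, run)
print(result)
print(attempted)
```

In this sketch the dependent service is never attempted once samba-ad-dc has
exited non-zero, while unrelated services such as cron still start.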

Please note, btw, that startpar-bridge was always intended to be a temporary
solution on the path to a full migration to upstart.  Given that such a
migration is now exceedingly unlikely to happen in Debian, I don't expect to
be putting much effort into resolving bugs like this.  But if someone wanted
to do so, the way to go about it would be to first ensure that startpar
does something sensible when a depended-on service doesn't start.  (Of
course, in practice lots of init scripts undermine this by exiting zero when
configured not to start, but at least for upstart we could in principle
handle this differently.)

> Steve, could you give your thoughts on how to go about improving
> this situation? Furthermore, could you point me to the
> startpar-upstart-inject source code? I could not find it in
> Upstart's or sysvinit util's source trees.

In unstable, this has been moved to the 'startpar' package.  It's also
apparently been renamed, without discussion with me, to
'startpar-injector'...

Petter, please revert this change of the startpar-bridge name.  Upstart
systems absolutely *must not* have two versions of the startpar bridge job
installed on the system at the same time; this will cause busy loops between
the two!

-- 
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
slangasek at ubuntu.com                                     vorlon at debian.org