[request-tracker-maintainers] Bug#595054: request-tracker3.8: Race condition between RT3.8+apache2 and MySQL when booting by insserv

Mon Sep 13 01:05:06 UTC 2010

On Sun, Sep 12, 2010 at 10:23:09PM +0100, Dominic Hargreaves wrote:

[...]
> I've raised this upstream:
> <http://lists.bestpractical.com/pipermail/rt-devel/2010-September/011169.html>
> and then at the mod_perl users' list (copy attached; it hasn't made it
> through to the list yet).
Thanks for dealing with this issue.

> In the absence of any better suggestions I think I will try and prepare
> a rough and ready patch whereby the webmux.pl script retries for a short
> while (1 minute) to connect to the database, and then gives up.

> If anyone reading has any other good ideas, please let me know.
Could the retry timeout be made tweakable via RT_SiteConfig.pm?
Presumably RT has some internal API for reading the settings
recorded by the Set() function in that module.
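
A rough sketch of a bounded retry with a tweakable timeout, written in
Python rather than RT's Perl and purely for illustration. Every name in
it (connect_with_retry, timeout, interval) is hypothetical; in RT the
timeout would presumably come from whatever setting ends up in
RT_SiteConfig.pm, and the connect callable stands in for the real DB
connection code:

```python
import time


def connect_with_retry(connect, timeout=60.0, interval=2.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or `timeout` seconds have passed.

    `connect` is a hypothetical callable that returns a DB handle or
    raises ConnectionError. `timeout` and `interval` are the knobs a
    site configuration could expose.
    """
    deadline = clock() + timeout
    while True:
        try:
            return connect()
        except ConnectionError:
            if clock() >= deadline:
                raise  # give up and re-raise the last connection error
            sleep(interval)
```

With timeout=60 this matches the "retry for 1 minute, then give up"
behaviour proposed above; a site with a slow-starting DB could simply
raise the value.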

[...]

After reading the thread on rt-devel, I'd like to summarise my thoughts
on this issue.
We generally have two classes of problems here:

The first one is with RT.
RT assumes the DB backend is always available throughout the lifetime of
its instance.
Whether this assumption is right or wrong is an open question. I think
most apps make the same assumption. On the other hand, I think RT may be
somewhat unique among web-driven apps in that it has a distinct
initialization phase which effectively turns it into a kind of daemon.
Hence, RT has two distinct periods of its lifetime affected by the
availability of the DB backend: startup and requests coming via HTTP.
I assume requests simply fail if the DB is not available, and this is
perfectly OK. The startup phase, by contrast, is harder to deal with:
if there's no DB backend during startup, RT can:
1) Crash (whether or not it takes the server down with it), see below.
2) Retry the connection, either indefinitely or a bounded number of
   times (the latter is what you propose).
3) Try to connect; if that fails, remember the failure but proceed.

I think the best option for RT would be to eventually implement (3).
It could try to fetch the configuration just before serving the first
request, that is, implement lazy initialization. This would not
remove the problem being discussed completely but would alleviate it.
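
A minimal sketch of what lazy initialization along the lines of (3)
could look like, again in Python rather than Perl and not based on RT's
actual code. LazyApp, load_config and the status codes are all
assumptions made up for the example:

```python
class LazyApp:
    """Defer DB-dependent setup until the first request that needs it."""

    def __init__(self, load_config):
        # Hypothetical callable: returns the configuration, or raises
        # ConnectionError while the DB backend is unavailable.
        self._load_config = load_config
        self._config = None  # nothing is fetched at startup

    def handle(self, request):
        if self._config is None:
            try:
                self._config = self._load_config()
            except ConnectionError:
                # DB still down: fail this request only and try again
                # on the next one instead of crashing at startup.
                return 503, "database unavailable"
        return 200, "handled %s with %s" % (request, self._config)
```

Startup never touches the DB here, so the server comes up regardless of
boot ordering; requests arriving before the DB is ready fail one by one,
which is the same behaviour assumed above for ordinary requests.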

And even if we mitigate (1) by deprecating mod_perl (and switching to
FastCGI), this would merely turn the problem into (2). That is not a
good solution either: to make it bullet-proof, RT would have to retry
the connection indefinitely, which would just render the HTTP server
unusable until the DB backend becomes available.

The second problem is with Debian.
You rightly say that the hack you propose is quite rough.
RT expects the DB backend to be available at startup, and having the DB
backend on the same host is possibly the most common setup.
I don't think upstream will implement anything like solution (3) above,
and even if it does, it won't be in the near future, I suppose.
Hence it would still be cool to have a way to enforce a certain startup
ordering in Debian, ideally not touching the init scripts.
So maybe we should contact the insserv maintainer(s) directly and ask
for their cooperation?