Bug#711213: libapache2-mod-perl2: occasional core dumps after the test suite
Stefan Fritsch
sf at sfritsch.de
Fri Jun 14 08:24:07 UTC 2013
On Friday 14 June 2013, Niko Tyni wrote:
> On Sun, Jun 09, 2013 at 11:23:01PM +0300, Niko Tyni wrote:
> > On Fri, Jun 07, 2013 at 02:23:43PM +0300, Niko Tyni wrote:
> > > I can reproduce the SIGSEGV at the end of the main test suite
> > > (#711213) on amd64. The armel problem might well be related,
> > > as the log ends at the same point.
> >
> > I'm somewhat further now: what happens is that
> > register_auth_provider() in modperl_util.c calls
> >
> > apr_pool_pre_cleanup_register(pool, NULL,
> > cleanup_perl_global_providers);
> >
> > once in the parent process, then another time in a child. For
> > some reason that I do not understand yet, the
> > cleanup_perl_global_providers() function resides at a different
> > memory location (with a 0x2c000 offset or so) on the second
> > time. The first location has at that point become an invalid
> > memory address, resulting in a SIGSEGV when libapr calls the
> > registered cleanup functions and jumps into the old location.
>
> Another progress report. I now mostly understand what's happening.
> Contrary to the above, all the interesting stuff happens inside the
> parent process.
>
> Cc'ing the apache2 maintainers; any ideas? See below.
> (The jump to an invalid address is crashing armel buildds so it's a
> rather big problem ATM. See #711167, where this has diverged.)
>
> First, apache2 main() calls read_config() (from main.c:624), which
> loads all the modules. Loading mod_perl installs the pre_cleanup
> hook cleanup_perl_global_providers() as above.
>
> Then, there's a loop starting at main.c:704 that has this comment:
>
> /* This is a hack until we finish the code so that it only reads
>  * the config file once and just operates on the tree already in
>  * memory. rbb
>  */
>
> and calls apr_pool_clear(pconf), which unloads the modules and
> should do all the cleanup AIUI. A bit later, at main.c:724,
> ap_read_config() is called again, and under some conditions (when
> stack limit is 'unlimited' and the number of modules is
> suitable?), mod_perl gets loaded at a different place than the
> first time. However, the earlier installed pre_cleanup hook is
> still in place, so we jump into an out-of-bounds location (where
> cleanup_perl_global_providers() used to reside) in the end when
> the cleanups are actually called.

The problem is that MP_CMD_SRV_DECLARE2(authz_provider) and
MP_CMD_SRV_DECLARE2(authn_provider) register the cleanup against
parms->server->process->pool, which outlives the pconf pool and
therefore also outlives the loaded mod_perl shared object. They should
probably use parms->pool (which is pconf) instead.

In general, everything mod_perl does should be undone by the
clearing/destruction of pconf, because the .so will be unloaded
after that. server->process->pool can be used to store things that
need to be preserved across the unloading/reloading of the .so; however,
there is now also a higher-level API for that
(ap_retained_data_create). Registering a cleanup with
server->process->pool from a module is always bad, because the code
may move.

Now, if there is a good reason that the above functions use
server->process->pool, we need to figure out a way to fix that. But the
original commit of that code has no comment with respect to the pool
requirement. Therefore I think it may simply be a bug, and you should
first test it with a cleanup against pconf.

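The lifetime mismatch described above can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: toy_pool, toy_register, toy_clear and stale_cleanups_after_reload are hypothetical names, not APR or mod_perl API, and the "generation" counter merely stands in for the address at which the .so happens to be mapped. Shutdown ordering is simplified as well.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an APR pool: a fixed list of pending cleanups.
 * This is NOT the real apr_pool_t. */
#define TOY_MAX_CLEANUPS 8

typedef struct {
    int generation[TOY_MAX_CLEANUPS]; /* module load that registered each cleanup */
    size_t count;
} toy_pool;

/* Which load of the module is currently mapped. A cleanup is only safe
 * to call while the generation that registered it is still loaded. */
static int current_generation = 0;

static void toy_register(toy_pool *p)
{
    assert(p->count < TOY_MAX_CLEANUPS);
    p->generation[p->count++] = current_generation;
}

/* Run and drop all cleanups; return how many pointed into an unloaded module. */
static int toy_clear(toy_pool *p)
{
    int stale = 0;
    for (size_t i = 0; i < p->count; i++) {
        if (p->generation[i] != current_generation)
            stale++; /* real code would jump to an unmapped address here */
    }
    p->count = 0;
    return stale;
}

/* Replay the two config passes and count the stale cleanups that get called. */
int stale_cleanups_after_reload(int register_on_pconf)
{
    toy_pool process_pool = { {0}, 0 };
    toy_pool pconf = { {0}, 0 };
    int stale = 0;

    current_generation = 1;      /* first config pass loads the module */
    toy_register(register_on_pconf ? &pconf : &process_pool);

    stale += toy_clear(&pconf);  /* pconf cleared while generation 1 is still mapped */
    current_generation = 2;      /* module unloaded, then reloaded elsewhere */
    toy_register(register_on_pconf ? &pconf : &process_pool);

    /* Shutdown (simplified): both pools are cleared while generation 2 is mapped. */
    stale += toy_clear(&pconf);
    stale += toy_clear(&process_pool);
    return stale;
}
```

In this model, stale_cleanups_after_reload(0) (cleanup on the long-lived pool) reports one stale cleanup, while stale_cleanups_after_reload(1) (cleanup on pconf) reports none.
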
> So I suppose mod_perl should somehow register a "module uninstall
> hook" that calls apr_pool_cleanup_run(...,
> cleanup_perl_global_providers, ...) [or apr_pool_cleanup_kill(),
> not sure] to remove the to-be-unloaded pre_cleanup hook. I haven't
> found a way to do that yet.

If you register a pool cleanup with pconf, it will be called before
the .so is unloaded.

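One way to picture why the ordering works out, assuming cleanups on a pool run in reverse order of registration and that the loader registers its unload step on pconf before the module gets a chance to register anything: the toy stack below uses hypothetical names (toy_cleanup_stack, toy_push, toy_run, pconf_cleanup_order) and is not APR or httpd code.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX 4

/* Toy cleanup list: entries are run last-registered-first. */
typedef struct {
    const char *names[TOY_MAX];
    int count;
} toy_cleanup_stack;

static void toy_push(toy_cleanup_stack *s, const char *name)
{
    assert(s->count < TOY_MAX);
    s->names[s->count++] = name;
}

/* Pop every cleanup and write the names, space-separated, into out. */
static void toy_run(toy_cleanup_stack *s, char *out, size_t outlen)
{
    out[0] = '\0';
    while (s->count > 0) {
        const char *name = s->names[--s->count];
        strncat(out, name, outlen - strlen(out) - 1);
        if (s->count > 0)
            strncat(out, " ", outlen - strlen(out) - 1);
    }
}

/* Returns the order in which the two cleanups on the toy pconf would run. */
const char *pconf_cleanup_order(void)
{
    static char order[64];
    toy_cleanup_stack pconf = { {0}, 0 };

    toy_push(&pconf, "unload_so");      /* registered when the loader maps the .so */
    toy_push(&pconf, "module_cleanup"); /* registered later by the module itself */
    toy_run(&pconf, order, sizeof order);
    return order;
}
```

Here pconf_cleanup_order() yields "module_cleanup unload_so": the module's own cleanup runs while its code is still mapped, and only then is the .so unloaded.
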
Cheers,
Stefan