Bug#355637: [Pkg-mailman-hackers] Bug#355637: mailman: Stale lock
files break administrative web interface
Shannon C. Dealy
dealy at deatech.com
Tue Mar 7 19:20:10 UTC 2006
On Tue, 7 Mar 2006, Lionel Elie Mamane wrote:
> tags 355637 +upstream
>
> thank you for your bug report.
>
> On Mon, Mar 06, 2006 at 12:55:44PM -0800, Shannon Dealy wrote:
>
>> Under some circumstances (presumably mailman software or system
>> crashes), list-specific stale lock files are left in the directory
>> /var/lib/mailman/locks; this can permanently prevent administrative
>> login for that specific list until the lock file(s) are
>> removed. There appears to be no mechanism to clean up these stale
>> lock files, and restarting mailman or even rebooting the system does
>> not clean things up. At the very least restarting mailman should
>> clean up these stale lock files,
>
> What do you mean with "restarting mailman"? The only interpretation I
> can find is restarting the queue daemon (the effect of
> "/etc/init.d/mailman restart"). But there is still the Apache (or
This is what I meant.
> other http server) running mailman CGIs. I don't think that merely
> restarting the mailman queue daemon should summarily remove the lock
> files: Apache is still running, and may be running a Mailman CGI
> genuinely holding that lock for an operation.
I wasn't sure whether the CGI scripts manipulate things directly or do
their work through socket connections to the many daemon processes that
always seem to be running. However, there is presumably some form of lock
manager in use, and a restart (as you described above) seems an
appropriate time for it to run and assess the validity of all of the
locks. It could either kill all CGIs in progress and wipe the locks, or
merely check that each lock belongs to an existing process and is not too
old (a hung CGI process), and then clean up accordingly. The main point is
that, in the worst case, if something is messed up, restarting mailman
should clean it up. The better solution is of course active monitoring
that fixes problems as they occur rather than requiring manual
intervention.
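As a rough illustration of the kind of validity check described above, here
is a minimal Python sketch. It is not Mailman's code: the names
find_stale_locks, pid_alive, and max_age_seconds are hypothetical, and it
assumes each lock file stores the holder's PID on its first line, which may
not match Mailman's real lock file format. It also assumes a POSIX system
and that all lock holders run on the local host.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this PID exists on the local host."""
    if pid <= 0:
        return False  # 0 and negative values address process groups
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def find_stale_locks(lock_dir, max_age_seconds, now=None):
    """Return lock files whose holder is gone or which are too old.

    Assumption (hypothetical format): the first line of each lock file
    is the PID of the process holding the lock.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        path = os.path.join(lock_dir, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                pid = int(f.readline().strip())
        except (OSError, ValueError):
            pid = None  # unreadable or malformed: holder unknown
        age = now - os.path.getmtime(path)
        if pid is None or not pid_alive(pid) or age > max_age_seconds:
            stale.append(path)
    return stale
```

A restart script or a periodic job could call
find_stale_locks("/var/lib/mailman/locks", 3600) and report or remove the
results. The sketch only lists candidates and deletes nothing, because a
PID can be reused by an unrelated process and a long-running operation may
legitimately exceed any fixed age limit.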
The important thing here is that, simple as the problem was to fix once I
figured out what was going on, recognising, finding, and fixing a problem
of this sort is completely beyond the capabilities of the overwhelming
majority of people running this software, though hopefully this bug report
may help some people sort it out if this doesn't get fixed. I did
shut down mailman and kill all active mailman CGI processes before
deleting everything in the /var/lib/mailman/locks directory. Due to the
bug, however, one mailman process refused to shut down and had to be taken
out with a kill -9.
>> in particular what I assume is the master lock: listname.lock and
>> probably the actual source of my problems. A better solution would
>> probably include actually checking the lock files periodically to
>> make sure they are still valid.
>
> Yes.
>
> You may be hit by something like
> http://mail.python.org/pipermail/mailman-developers/2006-January/018506.html
>
> Upstream doesn't seem very eager to track down that kind of issues :-(
This posting seems to imply that there is no central lock management code
(or that it is incompletely implemented). A properly designed locking
scheme would normally release the lock automatically when the thread
terminates, unless it is released earlier or explicitly requested to be
held longer (perhaps for a daemon to clean up later, but this is usually a
bad idea). Even if that were implemented properly, there must still be
some recovery mechanism for power failures at inconvenient times and other
"hard" crashes of the software.
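The idea of a lock that is released automatically when its holder finishes
can be sketched in Python with a context manager. This is only an
illustration under stated assumptions, not Mailman's implementation: the
name file_lock is hypothetical, the lock is a plain file created
atomically with O_CREAT | O_EXCL, and the PID written into it is an
assumed format.

```python
import os
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive lock file for the duration of a with-block.

    Raises FileExistsError if the lock is already held.
    """
    # O_CREAT | O_EXCL makes creation atomic: it fails if the file exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    try:
        with os.fdopen(fd, "w") as f:
            f.write("%d\n" % os.getpid())  # record the holder (assumed format)
        yield path
    finally:
        # Runs on normal exit and when an exception propagates, but not
        # after kill -9 or a power failure: the file is then left behind.
        os.unlink(path)
```

Code that does its work inside "with file_lock(path):" releases the lock
however the block exits, short of a hard crash. That remaining case is why
some separate recovery mechanism for leftover lock files is still needed.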
Unfortunately, I don't really know Python yet or have the time to look
into this further.
FWIW.
Shannon C. Dealy | DeaTech Research Inc.
dealy at deatech.com | - Custom Software Development -
| Embedded Systems, Real-time, Device Drivers
Phone: (800) 467-5820 | Networking, Scientific & Engineering Applications
or: (541) 929-4089 | www.deatech.com