Bug#360696: exim4-daemon-heavy: Failed to get write lock for retry.lockfile / exim process hangs with 100% CPU

Andreas Metzler ametzler at downhill.at.eu.org
Thu Apr 20 17:46:21 UTC 2006


On 2006-04-04 Michel Meyers <steltek at tcnnet.dyndns.org> wrote:
[...]
> (meaning that exim has to retry on my side), the respective exim process
> gets stuck with 100% CPU usage and the only way to get rid of it is to
> kill it with signal 9.
[...]

Hello,
I have forwarded this upstream
http://news.gmane.org/find-root.php?message_id=%3codr6h3%2dm06.ln1%40argenau.downhill.at.eu.org%3e
and got the following response from Phil Hazel
http://news.gmane.org/find-root.php?message_id=%3cPine.LNX.4.64.0604201644510.16046%40xoanon.csi.cam.ac.uk%3e
| I found one bug when I first looked at this, but it isn't a processing 
| bug. It is just that it would always say "Failed to get write lock", 
| even when the failure was for a read lock. That was easily fixed.
| 
| I tried to simulate this problem by patching the code to pretend it had
| failed to get a lock when trying to update the retry database while
| tidying up after a 451 failure. (It is, in fact, a write lock here.)
| Needless to say, I did not get a 100% loop. It just did what it is
| supposed to do - that is, failed to update the hints. But of course I
| was using release 4.61, not 4.60.
| 
| I suppose we'll have to look at the configuration that was being used. 
| The given log had this:
| 
|> 2006-04-04 09:13:48 1FQfhL-00035E-Ay Failed to get write lock for
|> /var/spool/exim4/db/retry.lockfile: timed out
|> 2006-04-04 09:14:48 1FQfhL-00035E-Ay Failed to get write lock for
|> /var/spool/exim4/db/retry.lockfile: timed out
| 
| which suggests two tries for the same message, one minute apart. How
| often was the OP starting queue runners? I have a feeling this is going 
| to be a long haul...

Could you upgrade to 4.61 (it is now even in etch) and check whether
the problem still happens?

Could you also answer Phil's questions, i.e. show the configuration and
how often you are starting queue runners?

You can either send this configuration to the bug tracking system and
I'll continue to act as proxy, or follow up directly on exim-users.

thanks, cu andreas
-- 
The 'Galactic Cleaning' policy undertaken by Emperor Zhark is a personal
vision of the emperor's, and its inclusion in this work does not constitute
tacit approval by the author or the publisher for any such projects,
howsoever undertaken.                                (c) Jasper Fforde



