Bug#837535: When a relay host lacks an IPv6 interface, mail heisenbounces
Thaddeus H. Black
thb at debian.org
Mon Sep 12 09:43:40 UTC 2016
Package: exim4-config
Version: 4.84.2-1
I doubt that this bug, which affects only some users, will be
your most urgent bug to fix. However, the fix is subtle.
Since I can now describe the fix (though I have not attached a
patch), I report the fix here. The fix can be implemented and
packaged whenever you have time, whether before or after
stretch's release.
When an Exim4 relay host lacks an IPv6 interface, mail
heisenbounces -- that is, it bounces sporadically, apparently
due to an obscure timeout or a race condition.
Full details are discussed here
[http://unix.stackexchange.com/q/308283/18202], where a user named
Rui F. Ribeiro cleverly discovers the fix.
LOGS AND CLUES
Here is a typical excerpt from my relay host's
/var/log/exim4/mainlog:
2016-09-11 18:31:30 H=dpc6935235115.direcpc.com (localhost) [69.35.235.115] X=TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128 F=<thb at b-tk.org> rejected RCPT <thaddeus.h.black at gmail.com>: relay not permitted
2016-09-11 18:32:46 1bj9Ya-0006aq-Jt <= thb at b-tk.org H=dpc6935235115.direcpc.com (localhost) [69.35.235.115] P=esmtpsa X=TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128 A=plain_server:thb S=762 id=20160911183227.GA1596 at b-tk.org
2016-09-11 18:32:46 1bj9Ya-0006aq-Jt gmail-smtp-in.l.google.com [2607:f8b0:400d:c04::1b] Network is unreachable
2016-09-11 18:32:47 1bj9Ya-0006aq-Jt => thaddeus.h.black at gmail.com R=dnslookup T=remote_smtp H=gmail-smtp-in.l.google.com [74.125.29.26] X=TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128 DN="C=US,ST=California,L=Mountain View,O=Google Inc,CN=mx.google.com" C="250 2.0.0 OK 1473618767 u126si8954672qkc.66 - gsmtp"
2016-09-11 18:32:47 1bj9Ya-0006aq-Jt Completed
In this log excerpt, two emails are sent. The first email
bounces at 18:31:30. The second email, sent to the same
recipient at 18:32:46, passes through. The recipient (which
in this instance happens to be a Gmail account I control)
receives the second email promptly. The first email, however,
never arrives.
The clue in this log is obscure. The aforementioned Mr.
Ribeiro is a fine detective to find it. Look at the third of
the five log lines, "Network is unreachable". The IP address
the line mentions is an IPv6 address that belongs to the relay
recipient.
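Readers who want to search their own mainlog for the same clue
might use something like the following Python sketch. It assumes
the log line layout shown in the excerpt above; the name
find_unreachable_ipv6 is hypothetical and is not part of Exim.

```python
import re

# Matches lines shaped like the third line of the excerpt above:
#   <date> <time> <message-id> <host> [<IPv6 address>] Network is unreachable
# The address pattern requires at least one colon, so IPv4 addresses
# (which contain dots) do not match.
UNREACHABLE_V6 = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<msgid>\S+) (?P<host>\S+) "
    r"\[(?P<addr>[0-9A-Fa-f:]*:[0-9A-Fa-f:]*)\] Network is unreachable"
)


def find_unreachable_ipv6(lines):
    """Return (message id, host, address) for each matching log line."""
    hits = []
    for line in lines:
        m = UNREACHABLE_V6.match(line)
        if m:
            hits.append((m.group("msgid"), m.group("host"), m.group("addr")))
    return hits


# The sample line is copied from the excerpt in this report.
SAMPLE = [
    "2016-09-11 18:32:46 1bj9Ya-0006aq-Jt gmail-smtp-in.l.google.com "
    "[2607:f8b0:400d:c04::1b] Network is unreachable",
    "2016-09-11 18:32:47 1bj9Ya-0006aq-Jt Completed",
]

if __name__ == "__main__":
    for msgid, host, addr in find_unreachable_ipv6(SAMPLE):
        print(msgid, host, addr)
```

To scan a real log, pass an open file instead of SAMPLE, for
example open("/var/log/exim4/mainlog"). A match shows only that
an IPv6 delivery attempt failed; it does not by itself identify
which message bounced.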
SO WHAT? WHO CARES?
But who cares? The email goes through to the relay
recipient's IPv4 address instead, one second later, right?
Answer: Yes, the email does go through; but the *other* email
does not go through. What is confusing is that
the IPv6 "Network is unreachable" is reported only for the
email that does indeed go through, whereas the error turns out
to be more relevant with respect to the other email.
This is why this bug was so hard to diagnose. Some obscure
interaction between my nonexistent IPv6 interface and the
authenticator -- or some other interaction of the kind -- was
apparently instituting a timeout or race condition.
THE FIX
Once the bug is diagnosed, the fix is fairly
straightforward: if the relay host has no IPv6 interface, then
in /etc/exim4/exim4.conf.template, add "disable_ipv6 = true" to
the section main/01_exim4-config_listmacrosdefs. The neat way
to do this would probably be to add a suitable new parameter to
/etc/exim4/update-exim4.conf.conf, perhaps with "low" debconf
priority. On my own host, however, to save time, I have just
manually added the line and invoked "dpkg-reconfigure -phigh
exim4-config", bypassing update-exim4.conf.conf.
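Whether a given host is in the affected category can be probed
roughly as follows. This is only a sketch in Python, not part of
Exim or of the Debian packaging: has_ipv6_route is a hypothetical
helper, and the probe address (a Google public DNS IPv6 address)
is an assumption, chosen merely because it is a well-known global
address. Connecting a UDP socket sends no packets; it only asks
the kernel whether it has a route to the address.

```python
import socket


def has_ipv6_route(probe_addr="2001:4860:4860::8888", port=53):
    """Return True if the kernel reports a route to probe_addr over IPv6."""
    if not socket.has_ipv6:
        # Python was built without IPv6 support.
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
            # UDP connect() transmits nothing; it fails with OSError
            # (for example "Network is unreachable") when no route exists.
            s.connect((probe_addr, port))
        return True
    except OSError:
        return False


if __name__ == "__main__":
    if has_ipv6_route():
        print("IPv6 route present; the workaround may not apply")
    else:
        print("no IPv6 route; consider adding: disable_ipv6 = true")
```

A False result suggests, but does not prove, that the host
matches the situation described in this report.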
REMARKS
To be clear: this bug matters only if the relay host lacks
an IPv6 interface and (as I believe) the relay recipient has
an IPv6 interface. If you think about it, though, Exim4 should
probably not be exercising a nonexistent IPv6 interface,
anyway, should it? We never knew that this was an actual
problem, but now it turns out to be an actual problem, at least
for some users, or at any rate for Mr. Ribeiro and me.
Additional information: my relay server is protected by a
plaintext password after STARTTLS on port 587. (The X.509
certificate happens to be a real one, not a snakeoil one, but
this is probably not relevant to you.)
I have not tried the fix on sid, nor indeed have I verified the
bug on sid. Reviewing the post-jessie changelogs, however, I see
no entry that would already have fixed this. Thus, as far as I
know, the bug remains current.
If you have questions (whenever you get around to addressing
this bug, this year, next year, some year), let me know. I'll
be here. Meanwhile, users who discover this bug report and are
affected by the bug can straightforwardly implement the fix for
themselves.