exim4 untainting and mailman: local_parts = dsearch + local_part_data

Tue Oct 28 16:49:59 GMT 2025

[Yeah, I have now found out that Philip retired many years ago, so I am
resending this to the Exim maintainers, who I hope will see it.
Also +pkg-exim4-maintainers at lists.alioth.debian.org since they did
what they could to mitigate the situation I explain below, thank you.
Maybe one of you can forward this to the devs you think could
benefit from the read]

Dear Maintainers, my intent is not to point the finger at you and
complain loudly, but to share what I've learned from the admin world
(more details are in the messages I pointed to yesterday, which I don't
want to repeat now), in the hope of showing how things work on the
receiving end of your software :) If you are willing to read, thank you.
You do not have to answer.
I'm only looking to share a blameless post mortem we can hopefully
learn from, which was standard practice at the company I worked at.

With all the respect and thanks I owe you for your work, please see my
feedback below on how things work in the sysadmin/SRE world, where
most people upgrade around once a year, hundreds of packages all
at once, when their distro does.
Each package that falls over and breaks the system during such an
upgrade is a major cause of stress, and often of downtime when
people have to upgrade systems in production.
In the case of exim, the upgrade breakage, with virtually
zero help in the official package for the now panicked admin with
a system down, was so bad that it just got exim reverted to the
vulnerable version.

As I said yesterday, Debian gets points for trying to ease that upgrade,
but it was still lacking that big taint_upgrade.readme doc with what
I'm describing below (from exim, not Debian).
Andreas Metzler: gold star for doing your best to limit the damage
in Debian. Sadly README.UPDATING only had a few lines about
using local_part_data, which simply never worked for me, and I found
no info at the time on how to make it work in the middle of the night,
so I reverted.

On Wed, Oct 29, 2025 at 01:50:40AM +1300, Jasen Betts wrote:
> 99% of the time one of the lookups can untaint the value for you.
> maybe dsearch if it's meant to match a filename. nis, ldap, or lsearch
> on /etc/passwd if it's a username. dnsdb for a domain-name... 

Right, I get that now, but the biggest issue with this upgrade is that
it assumes admins know or remember complex config directives, at
potentially a very inconvenient time. As a whole, those took me close
to a week to learn and understand when I was a full time exim admin
(more than 20 years ago now, so yeah, I forgot most of it). The upgrade
also did not give admins a clear document on simple ways to recover
from their config not working anymore (never mind that no useful error
was even logged anywhere, making the problem much, much worse).

Someone will point out that good admins have a cronjob to apply
security updates automatically, but a major bind package snafu that
broke likely millions of machines overnight, and stuff like this
upgrade, are the exact reason why I never trust unattended updates
anymore. Ideally I apply each and every one of them out of band, which
would be my job at work, but is less likely on a personal system
maintained on the side in my spare time.

> Why do you want to accept list names that do not exist? how would that
> be safe?

Because I was already not accepting them, with a syntax that has worked
for 25 years, but that exim didn't know/understand I was using.
Really my biggest reason for being upset is that exim refused to give
me an option to say "I know about the taint issue, it is not an issue
for me with my working config, and if I'm wrong, by turning
master_untaint on, I accept all responsibilities that come from it".

> Exim is looking for a strong reason to trust tainted data. It finds
> that by by matching it to something that already exists, and then 
> using the something.

Yeah, I get that now, but basically that implementation is way too
restrictive/over the top for something that people get as an upgrade,
and it totally breaks them in a very non obvious way (like I said,
nothing in panic.log, nothing in any log that gives any clue whatsoever,
except -d+all if you know how to read it and can spot a faint clue
amongst hundreds of lines of output, potentially in the middle of the
night).

I realize I'm late to the party, and now I know it's exactly because
exim broke so badly, with zero clue on what to do or what to fix, that
after wasting several hours searching in the middle of the night I
reverted it to the last known good version, went to bed,
and went on with my life the next day.

There are many things that could have been done better, and I hope they
are taken into account for any such future change, if it ever happens.

1) don't assume the admin knows or understands everything about exim or
how their config file works when they are applying a security update.
That config could be many years old and exim is complex enough that
anyone not working on it routinely cannot be expected to remember how
everything works. The admin needs simple actionable instructions
outlined in a very clear error message in the log (both were missing).

2) this upgrade assumed that no admin can be trusted and that it's
exim's job to save them from their own ineptitude. In real life what it
achieved was to have itself reverted to an old vulnerable version by not
providing easy ways for the admin to be secure and compliant with the
new system (the untaint function I gave with a known good safe regex, or
a local_part_untainted that has this built in).

3) local_part_data, if it had to be the only way, and it's not, should
have come with an upgrade document stating how it works, why it's there,
why it's null by default, and all the ways to populate it, along with
examples for the top 5 or 10 types of config that you can just cut and
paste in the middle of the night when you're down because of this
upgrade.
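
For the mailman case in the subject line, here is a minimal sketch of
what one such cut and paste example could look like. It is not taken
from README.UPDATING: the router and transport names, the
/var/lib/mailman paths, the list user/group and the +local_domains
domain list are assumptions (Debian-style Mailman 2 layout) to adjust
to your install, and suffix handling (-bounces, -request, ...) is left
out to keep it short.

    # Router: dsearch looks the tainted $local_part up as a file name
    # in the lists directory. On a match, $local_part_data is set to
    # the name found on disk, which exim treats as untainted.
    mailman_router:
      driver = accept
      domains = +local_domains
      local_parts = dsearch;/var/lib/mailman/lists
      transport = mailman_transport

    # Transport: use $local_part_data, not $local_part, anywhere the
    # list name ends up in a command or a file path.
    mailman_transport:
      driver = pipe
      command = /var/lib/mailman/mail/mailman post $local_part_data
      current_directory = /var/lib/mailman
      home_directory = /var/lib/mailman
      user = list
      group = list

An address with no matching entry under lists/ fails the local_parts
check, so the router is skipped for it and $local_part_data stays
empty, which is how list names that do not exist keep being refused.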

And I'm not going to try to be cute, but what if I have a config that
creates users on email creation, like some sort of greylisting /
teergrubing honeypot?
From what I understand it's now 100% impossible to run this.

It's not like exim is the first daemon dealing with untrusted data. We
can all look at what PHP did (and PHP was clearly not a model of
security), and at the many ways you can untaint input with known good
regexes, which in the case of exim can be supplied hardcoded by the MTA.

Thanks for reading, to those who did.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.

Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08