[Babel-users] babeld slashes kernel route manipulation performance by 17000%
Daniel Gröber
dxld at darkboxed.org
Thu Apr 14 15:25:51 BST 2022
Hi Toke,
On Thu, Apr 14, 2022 at 12:12:36AM +0200, Toke Høiland-Jørgensen wrote:
> How about submitting this report to netdev and asking for advice there?
> From a quick glance at the kernel fib code, this does not look like it's
> an easy fix (if it can be fixed at all), but we should really get
> someone who is an expert in the kernel routing code (which I'm not,
> sadly) to weigh in. You could add an explicit Cc to David Ahern
> <dsahern at kernel.org> when submitting, and please keep me in Cc as
> well. Or if you'd prefer, I can submit the report on your behalf?
I'll try to get around to that, but no promises :)
Do you know David? I don't like just CCing people I don't know at random.
> As for why you're seeing this in particular when Babel is running, now
> that we know the route dump is the culprit, it's quite obvious: While
> Babel listens for new route notifications from the kernel, it doesn't
> actually use those notifications directly; instead, it just sets a flag
> (see kernel_route_notify() in babeld.c), and does a full dump whenever
> it gets a notification. Which obviously interacts really badly with lots
> of routes being inserted at the same time, as that will basically send
> Babel into a loop of doing nothing but route dumps.
I saw that too. I was poking at the babeld code for a while before
settling on the iproute2 reproducer. I also compared it quite closely with
bird, and I can't say I really see a difference in what they do other than
netlink buffer sizing.
Both will periodically dump the whole table, so if I had two instances of
bird running concurrently I could experience the same problem. It seems to
be the recvmsg call that's blocking forever in the kernel while the table
churn is going on, so it's not even related to babeld doing a quadratic
number of dumps or anything.
What is also interesting is that babeld already seems to filter the
notifications correctly by table id, so all my route churn never actually
sets the kernel_routes_changed flag (see the import_tables check at the
bottom of parse_kernel_route_rta).
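To make the filtering point concrete, here is a toy Python model of the "set a flag, then dump" pattern. It is only a sketch and not babeld's code: the class, its methods, and the table numbers are invented for illustration, and only the names kernel_routes_changed and import_tables come from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass
class FlagAndDumpModel:
    """Toy model: a notification only marks state dirty; poll() re-reads everything."""
    import_tables: frozenset
    kernel_routes_changed: bool = False
    dumps: int = 0
    routes: dict = field(default_factory=dict)

    def on_notification(self, table_id):
        # Churn in tables we do not import is ignored and never sets the flag.
        if table_id in self.import_tables:
            self.kernel_routes_changed = True

    def poll(self, kernel):
        # kernel maps table_id -> {prefix: nexthop}; a dump copies every imported table.
        if not self.kernel_routes_changed:
            return
        self.kernel_routes_changed = False
        self.dumps += 1
        self.routes = {
            (table, prefix): nexthop
            for table in self.import_tables
            for prefix, nexthop in kernel.get(table, {}).items()
        }


# Hypothetical setup: table 254 is imported, table 100 is where the churn happens.
kernel = {254: {"10.0.0.0/24": "192.0.2.1"}, 100: {}}
model = FlagAndDumpModel(import_tables=frozenset({254}))

# Heavy churn in a table that is not imported: no dump is triggered.
for i in range(1000):
    kernel[100][f"198.51.100.{i % 256}/32"] = "192.0.2.2"
    model.on_notification(100)
    model.poll(kernel)
churn_dumps = model.dumps

# One change in an imported table: exactly one full dump follows.
kernel[254]["10.0.1.0/24"] = "192.0.2.1"
model.on_notification(254)
model.poll(kernel)
```

In this model the churn alone never causes a dump, which matches the observation that the slowdown is not explained by notification-triggered dumps.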
> Bird does things a bit differently: it will directly update its internal
> routing table from the netlink notification messages, and only does a
> full dump at intervals (by default once every minute, but it can be
> configured to run entirely without dumps).
Right, but the important part is that it does very much still do the dumps
:)
Also, I wonder how netlink buffer overruns are dealt with when there isn't a
periodic dump? Wouldn't it still have to do a full dump to resync if that
happens?
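For illustration, the resync concern can be sketched in Python with a bounded queue standing in for the netlink socket receive buffer (on Linux an overrun is reported to the reader as ENOBUFS). Everything here is a hypothetical simulation: the class, the sync() helper, and the event tuples are invented for this sketch, and it says nothing about how bird or babeld actually handle the case.

```python
from collections import deque


class BoundedNotificationQueue:
    """Stand-in for a receive buffer with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.overrun = False

    def push(self, event):
        # When the buffer is full the event is lost and only the overrun is remembered.
        if len(self.pending) >= self.capacity:
            self.overrun = True
            return
        self.pending.append(event)

    def drain(self):
        # Return the queued events plus the overrun indicator, then reset both.
        events = list(self.pending)
        self.pending.clear()
        overrun, self.overrun = self.overrun, False
        return events, overrun


def sync(local, queue, kernel):
    """Bring the local table up to date; fall back to a full dump after an overrun."""
    events, overrun = queue.drain()
    if overrun:
        # At least one notification was dropped, so incremental state cannot be trusted.
        local.clear()
        local.update(kernel)
        return "full-dump"
    for op, prefix, nexthop in events:
        if op == "add":
            local[prefix] = nexthop
        else:
            local.pop(prefix, None)
    return "incremental"


kernel, local = {}, {}
queue = BoundedNotificationQueue(capacity=2)

# Three additions arrive before the reader runs; the third overflows the queue.
for i in range(3):
    prefix = f"10.0.{i}.0/24"
    kernel[prefix] = "192.0.2.1"
    queue.push(("add", prefix, "192.0.2.1"))
first = sync(local, queue, kernel)

# A single deletion fits in the queue and is applied incrementally.
del kernel["10.0.0.0/24"]
queue.push(("del", "10.0.0.0/24", None))
second = sync(local, queue, kernel)
```

The point of the sketch is that once a single notification is dropped, the only way back to a consistent table is to re-read all of it, regardless of whether periodic dumps are enabled.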
> AFAICT the babeld code will require quite a bit of surgery to change
> this behaviour; to the point where I think it may be simpler to
> implement the RTT extension in Bird (but I'm obviously biased here)... :)
In order to scale the number of native babel routes further, you're probably
right, but that's not necessary for my use-case anyway. If this kernel bug
goes away, babeld would still work fine IMO.
I'm currently working on babel ECMP support in bird though; maybe I'll have
a stab at RTT after that.
--Daniel