[Babel-users] Looping in EAGAIN

Juliusz Chroboczek jch at pps.univ-paris-diderot.fr
Sat Mar 28 16:30:04 UTC 2015


> Say wlan0 vanishes. All the routes going out that interface are no
> longer valid, but from what I understood of this patch, it will loop for
> a while, then give up.

If wlan0 vanishes, this will be recognised the next time check_interfaces
is run, and all neighbours visible through wlan0 will be flushed.

The issue we're having is a race condition -- if wlan0 goes down and then
back up before we run check_interfaces, and the IP addresses don't change,
then check_interfaces will not notice the transition, and Babel will think
that its routes through wlan0 are still up -- and you end up with a FIB
that is not a subset of the RIB.  Ouch.

So it might be a good idea to run check_interfaces early when we get
EAGAIN, but I'm not sure what consequences it might have -- EAGAIN can
also happen when we're under load, and we'd rather not be repeatedly
scanning our interfaces in that case.
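
One way to limit the cost under load would be to rate-limit the early
rescan.  What follows is only a sketch of that idea, not babeld code: the
function name, the last_rescan parameter and the 5-second minimum
interval are all hypothetical, and the caller would still be responsible
for actually calling check_interfaces when the function returns 1.

```c
#include <errno.h>
#include <time.h>

/* Hypothetical minimum spacing, in seconds, between two rescans
   triggered by EAGAIN.  The value is an assumption, not a tuned one. */
#define EAGAIN_RESCAN_MIN_INTERVAL 5

/*
 * Decide whether a failed send should trigger an early interface rescan.
 * err is the errno value of the failed send, now is the current time,
 * and *last_rescan holds the time of the previous EAGAIN-triggered
 * rescan (0 meaning "never").  Returns 1 if the caller should rescan
 * now, 0 otherwise.  Updates *last_rescan when it returns 1.
 */
int
should_rescan_on_error(int err, time_t now, time_t *last_rescan)
{
    /* Only EAGAIN is treated as a hint that an interface may have
       bounced; other errors are left to the normal code paths. */
    if(err != EAGAIN)
        return 0;

    /* Under load EAGAIN can repeat many times per second, so skip
       the rescan if we already did one recently. */
    if(*last_rescan != 0 &&
       now - *last_rescan < EAGAIN_RESCAN_MIN_INTERVAL)
        return 0;

    *last_rescan = now;
    return 1;
}
```

With this, a burst of EAGAINs costs at most one interface scan per
interval, at the price of possibly noticing a bounced interface a few
seconds late.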

It would be better to get async notifications from the kernel about
interfaces going down.

> Not clear to me if this would happen for 4 hello intervals before the
> interface is recognised as gone?

No, the hellos are used to notice vanishing neighbours, not vanishing
interfaces.  That's a completely different mechanism.

-- Juliusz


