[Babel-users] Looping in EAGAIN

Dave Taht dave.taht at gmail.com
Sat Mar 28 16:43:24 UTC 2015


On Sat, Mar 28, 2015 at 9:30 AM, Juliusz Chroboczek
<jch at pps.univ-paris-diderot.fr> wrote:
>> Say wlan0 vanishes. All the routes going out that interface are no
>> longer valid, but from what I understood of this patch, it will loop for
>> a while, then give up.
>
> If wlan0 vanishes, this will be recognised the next time check_interfaces
> is run, and all neighbours visible through wlan0 will be flushed.
>
> The issue we're having is a race condition -- if wlan0 goes down and then
> back up before we run check_interfaces, and the IP addresses don't change,
> then check_interfaces will not notice the transition, and Babel will think
> that its routes through wlan0 are still up -- and you end up with a FIB
> that is not a subset of the RIB.  Ouch.
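
A toy Python model of that race, for illustration only (this is not babeld
code). It assumes the periodic check compares just the up flag and the set
of addresses, as described above. `IfState`, `flap` and the `transitions`
counter are invented names; the counter stands in for any signal that
records a transition, and nothing here claims the kernel exposes one in
this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IfState:
    """Snapshot of an interface as a periodic poll might see it."""
    up: bool
    addresses: frozenset
    transitions: int = 0  # hypothetical count of up/down changes


def poll_detects_change(before, after):
    # What a snapshot comparison can see: flags and addresses only.
    return (before.up, before.addresses) != (after.up, after.addresses)


def counter_detects_change(before, after):
    # A transition counter changes even if the end state is identical.
    return before.transitions != after.transitions


def flap(state):
    """Interface goes down and comes back up with the same addresses."""
    down = IfState(False, state.addresses, state.transitions + 1)
    return IfState(True, down.addresses, down.transitions + 1)


before = IfState(True, frozenset({"192.0.2.1"}))
after = flap(before)  # both transitions happen between two polls
missed = not poll_detects_change(before, after)
```

Here `missed` is true: the two snapshots are identical, so the poll sees
nothing, although the kernel dropped the routes while the interface was down.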

There is a bunch of work towards unifying IPv4 and IPv6 behavior that might
be worrisome OR helpful. See:

https://www.netdev01.org/docs/prabhu-linux_ipv4_ipv6_inconsistencies_talk_slides.pdf

A bunch of other presentations from that conference were interesting too:

https://www.netdev01.org/downloads

> So it might be a good idea to run check_interfaces early when we get
> EAGAIN, but I'm not sure what consequences it might have -- EAGAIN can
> also happen when we're under load, and we'd rather not be repeatedly
> scanning our interfaces in that case.
>
> It would be better to get async notifications from the kernel about
> interfaces going down.
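
One possible compromise between the two concerns above, sketched in Python
under stated assumptions: run the interface check early on EAGAIN, but no
more often than once per interval, so a loaded router does not rescan its
interfaces on every failed send. `ThrottledInterfaceCheck`, `on_eagain` and
`min_interval` are hypothetical names, and the 5 second default is arbitrary;
this is not how babeld behaves, only a way to picture the trade-off.

```python
class ThrottledInterfaceCheck:
    """Run an interface check on EAGAIN, at most once per min_interval."""

    def __init__(self, check_interfaces, min_interval=5.0):
        self.check_interfaces = check_interfaces  # callback doing the scan
        self.min_interval = min_interval          # seconds between scans
        self.last_run = None                      # time of the last scan

    def on_eagain(self, now):
        """Call when a send fails with EAGAIN; returns True if a scan ran."""
        if self.last_run is not None and now - self.last_run < self.min_interval:
            return False  # scanned recently: treat this EAGAIN as load
        self.last_run = now
        self.check_interfaces()
        return True
```

The caller passes the current time explicitly (for example the value of
`time.monotonic()`), which keeps the throttle easy to test. Asynchronous
notifications from the kernel would still be the better fix, since this
only bounds the cost of polling.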

Amplified to netdev.

>> Not clear to me if this would happen for 4 hello intervals before the
>> interface is recognised as gone?
>
> No, the hellos are used to notice vanishing neighbours, not vanishing
> interfaces.  That's a completely different mechanism.

Got it. Keeping calm and patching on!

>
> -- Juliusz



-- 
Dave Täht
Let's make wifi fast, less jittery and reliable again!

https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb


