[Babel-users] Babel instability in WLAN-SI

Mitar mmitar at gmail.com
Mon Dec 14 23:33:07 UTC 2015


Hi!

On Mon, Dec 14, 2015 at 11:39 AM, Juliusz Chroboczek
<jch at pps.univ-paris-diderot.fr> wrote:
> I'd like more evidence that this is needed.  Estimating packet loss is
> very slow (since we're computing a metric from what is just a discrete,
> one-bit signal), so it slows down convergence.  Hopefully we can get away
> without it.

Hm, isn't the computation on WiFi links exactly the same?

> Does your RTT increase at the same time as packet loss?  If so, we could
> probably do without packet loss.

Not really. At least on the fiber links (which are most of our VPN
links) it does not.

> (Recall that the goal is not to have an accurate model of the real
> world -- the goal is to have traffic flow according to optimal paths.  If
> the traffic is going where you want it to go, there's no need to add more
> complexity to babeld.)

Currently it seems that the routes over VPN are dropped while we would
prefer them to stay up, even if there is a slight packet loss.

>> Why have you disabled packet-loss metric on VPN links?
>
> Because it's an experimental feature, that hasn't had enough real-world
> deployment.  It works beautifully in our tests, it works beautifully in
> Nexedi's network, and if it works as well in your network, I'll enable it
> by default.

Hm, are we talking here about packet-loss or delay-based routing
(RTT)? I understand that the RTT metric is experimental, but I was
asking about the packet-loss metric and why that is not enabled. Or am
I missing something?

> First, it slows down reaction to link failures.  If you're on an Ethernet,
> and you lose two packets in a row, you can be pretty sure the link is
> down

Or we have a very short buffer. ;-)

But yes, maybe our recent instability on VPN links is due more to the
routing problems we have than to real link instability.

But we do have VPN links which go between countries. We have observed
really crazy stuff, for example on links between Croatia and Slovenia.
Sometimes an extra 100 ms appears on the link because of issues at the
Internet exchanges (so the delay is added at the Internet exchange).

I do not think that VPN links should be seen as Ethernet. For Ethernet
I agree that if you lose two packets you probably have issues. But
for VPN you have stuff in between, from bad ISPs to MTU issues which
make some packets get lost (especially while PMTU discovery is in
progress).

> Second, the link quality estimator uses ETX, which is optimised for
> multicast Hellos over WiFi links (it's quadratic in loss rate).
> A different formula should be used for lossy wired links and for unicast
> wireless tunnels.  (But then, perhaps ETX works well enough on tunnels --
> I have no idea.)

We have been using ETX with OLSRv1 on tunnels without visible issues.

What do you use for ETX? ETX = 1 / (d_f x d_r) is for unicast (as
described in the paper "A High-Throughput Path Metric for Multi-Hop
Wireless Routing"). To my knowledge, for multicast you should use
ETX = 1 / d_f (as described in the paper "High-Throughput Multicast
Routing Metrics in Wireless Mesh Networks").

So we know ETX equations for both unicast and multicast. Maybe Babel
should support both?
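
To make the difference concrete, here is a minimal sketch of the two
formulas above. It assumes d_f and d_r are the forward and reverse
delivery ratios in the range [0, 1]; the function names are made up for
illustration and are not taken from babeld or OLSR.

```python
import math


def etx_unicast(d_f: float, d_r: float) -> float:
    """ETX = 1 / (d_f x d_r): data and its ACK must both get through."""
    _check_ratio(d_f)
    _check_ratio(d_r)
    if d_f == 0.0 or d_r == 0.0:
        # Nothing is delivered in at least one direction: unusable link.
        return math.inf
    return 1.0 / (d_f * d_r)


def etx_multicast(d_f: float) -> float:
    """ETX = 1 / d_f: no ACK, so only the forward direction counts."""
    _check_ratio(d_f)
    if d_f == 0.0:
        return math.inf
    return 1.0 / d_f


def _check_ratio(d: float) -> None:
    # Hypothetical helper: delivery ratios are probabilities.
    if not 0.0 <= d <= 1.0:
        raise ValueError(f"delivery ratio out of range: {d}")


if __name__ == "__main__":
    # 10% loss in each direction.
    print(round(etx_unicast(0.9, 0.9), 2))
    print(round(etx_multicast(0.9), 2))
```

With 10% loss in each direction, the unicast formula gives about 1.23
and the multicast one about 1.11, so the unicast form penalises the same
loss more heavily.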


Mitar

-- 
http://mitar.tnode.com/
https://twitter.com/mitar_m


