[Babel-users] Route-dete :wq

Dave Taht dave.taht at gmail.com
Mon Mar 12 00:21:33 UTC 2018


On Fri, Mar 9, 2018 at 3:29 PM, Christof Schulze
<christof.schulze at gmx.net> wrote:
>> I start running into trouble with 1000+ routes using 1Mbit mcast.
>> Sooner if I seriously
>> slam the network with flent or something else that abuses mcast like mdns.
>> YMMV.
>
> So what is the culprit here? What would it take to add an order of
> magnitude?

Most of the meshy networks run wifi mcast at 11Mbit or higher. Ath9k
devices support this; many others don't.

Experiment with the new unicast code, as that will transfer routes at
the underlying rate of the medium (up to 300Mbit on wireless-n).
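
A rough sketch of the arithmetic behind the rate advice above. The 20 bytes
per route is an assumed, illustrative figure (not taken from babeld or from
this thread), and the calculation ignores preambles, contention, and retries,
so real airtime will be worse. `dump_seconds` and `BYTES_PER_ROUTE` are
hypothetical names.

```python
# Back-of-the-envelope: how long one full route dump occupies the air.
# ASSUMPTION: ~20 bytes per route on the wire (illustrative, not measured).
BYTES_PER_ROUTE = 20


def dump_seconds(routes: int, rate_mbit: float,
                 bytes_per_route: int = BYTES_PER_ROUTE) -> float:
    """Seconds needed to send `routes` announcements at `rate_mbit` Mbit/s."""
    bits = routes * bytes_per_route * 8
    return bits / (rate_mbit * 1_000_000)


# Compare the multicast rates mentioned above with the unicast ceiling.
for routes in (1_000, 10_000):
    for rate in (1, 11, 300):
        ms = dump_seconds(routes, rate) * 1000
        print(f"{routes:>6} routes @ {rate:>3} Mbit: {ms:9.3f} ms")
```

Under that assumption, 1000 routes take about 0.16 s at 1Mbit and 10,000
routes about 1.6 s, before any other traffic competes for the medium.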

Don't run odhcpd or network manager either. They tend to get in a
tussle with babeld in the kernel.

Simulate first, deploy second. Don't prematurely optimize. :)

I have a backlog of other optimizations for babel lying about. They range
from trivial stuff, like at least logging when you are low on compute
or overbuffered:

https://github.com/dtaht/rabeld/commit/b74b4a6f9b532717ee93346963efd894e94615b3

to something that tries to be more aware of compute bounds, to
another thing that pushes some work down into the kernel via bpf.



>
>> As for aggregation and filtering: Most of my Aps have at minimum,
>> ethernet, and two channels - usually four, including the meshy links.
>> The meshy links are ptp, so I've generally "wasted" an entire /22 ipv4
>> network to talk to the Aps. ipv6 /62s.
>>
>> The lab component of my network, for example, has two main links to
>> the production net, and the gateways only announce the
>> subnet it is on (172.22.0.0/16). This cuts the churn seen outside the
>> network when I do crazy things like reboot the whole thing.
>>
>> The biggest problem I've run into, is that meshy links, are, meshy -
>> and I've lost track of the number of times where
>> I had a well defined /16 network in the lab suddenly leak all the
>> meshy /32 bits over the worst possible link - because I plugged
>> something in that was adhoc (and poorly) connected to the outside
>> network that I shouldn't have.
>>
>> Lede creates one /48 ULA by default per AP, and then more /60s. I've
>> had a tendency to try to share one /48, but more recently I was trying
>> to go native ipv6 and disabled the ULA generation entirely.
>>
>> I don't bridge anything except sometimes on the last Aps on a link
>> (which don't announce babel on that bridge). Bridging can do weird
>> things to daemons that want also to be measuring the individual links.
>>
>> So in your network design I'd try to identify your backbone links and
>> try upfront to rationally partition the network numbering scheme,
>> and still, periodically try to optimize it. It makes no sense to
>> export all the churn the last hop of a meshy, yet leaf network can
>> have to the whole network. I'd simulate what you plan, and then slam
>> it with traffic from every point with a tool like flent, and deploy
>> cake or htb+fq_codel on the ISP up/downlinks.
>
> This being a Freifunk-Network, there is not going to be much planning of the
> structure beyond mere basics.
> Until now I just hoped we could get away with having 10K+ routes in one
> network which would translate into 3k+ clients when considering many nodes
> and 3 IPv6 addresses for each client which seems to be a reasonable amount
> when taking into account a clat-address per client and IPv6 privacy
> extensions.

As Toke noted, just distribute the /64. Source-specific routing is cool, too:
you should try distributing real ipv6 ranges from two or more gateways.
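
To illustrate why announcing the /64 helps, here is a minimal sketch using
Python's `ipaddress` module. The prefix 2001:db8:1:2::/64 is a documentation
prefix standing in for whatever a node actually hands out, the client
addresses are made up, and `client_addresses` is a hypothetical helper. The
3 addresses per client follow the estimate quoted above.

```python
import ipaddress

# ASSUMPTION: a documentation prefix standing in for one node's client /64.
node_prefix = ipaddress.ip_network("2001:db8:1:2::/64")


def client_addresses(prefix, clients, per_client=3):
    """Return made-up client addresses inside `prefix` (illustration only)."""
    base = int(prefix.network_address)
    return [ipaddress.ip_address(base + 1 + i)
            for i in range(clients * per_client)]


addrs = client_addresses(node_prefix, clients=100)

# One /128 host route per client address: what per-client routing would carry.
host_routes = [ipaddress.ip_network(f"{a}/128") for a in addrs]

# Every host route falls inside the node's /64, so one announcement covers all.
covered = all(r.subnet_of(node_prefix) for r in host_routes)
print(len(host_routes), "host routes, all covered by", node_prefix, "->", covered)
```

With one prefix per node, the route count scales with the number of nodes
rather than with the number of client addresses.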

> There are approaches to reduce the amount of routes per client including
> using nat66 on each node. You certainly are making it sound like some
> thought should be put into reducing the amount of routes. This will be
> the next step after we have more than just a few nodes / clients inside the
> same network.
>
> BTW: the Freifunk networks use an autoupdater. This might just solve your
> tree-climbing-problem in the long run...

Across 6 versions of the OS in 6 years of deployment, and 5 different
generations of hardware, my automation problem is hopeless.

Of all the gear I've had to date, the nanostation 5s, wndr3800s, and
picostations have been the best. I'm having good results
thus far with the ubnt UAP-AC-M-USes with the candelatech firmware -
aside from not having enough flash for a web interface.

>
> Cheers
> Christof
>
>>
>> I'm working these days, on making netem better emulate wifi's
>> behaviors. I'm not satisfied with it yet.
>>
>
>>> Note that babeld currently sends updates as a single burst when the update
>>> interval expires (the same is true of Toke's implementation of Babel, as
>>> far as I'm aware).  For very large networks, it would be good to split
>>> updates into one-packet pieces that are sent throughout the update
>>> interval.  I'd be glad to accept a patch that does that.

I'd rather like to keep the burst but measure how long it takes to transmit.
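
One way the measuring idea could look, as a minimal sketch rather than
anything babeld does today: `timed_burst` is a hypothetical helper, and the
`send` callable is a stand-in for a socket's send routine. Note that timing
the send calls only captures how long it takes to hand packets to the layer
below; if that layer buffers deeply, the real time on the air can be much
longer.

```python
import time


def timed_burst(send, packets):
    """Send every packet back to back and return the elapsed seconds."""
    start = time.monotonic()  # monotonic clock: immune to wall-clock jumps
    for pkt in packets:
        send(pkt)
    return time.monotonic() - start


# Stand-in sender that just records packets; a daemon would pass its own.
sent = []
burst = [b"\x00" * 100] * 50  # 50 dummy 100-byte packets
elapsed = timed_burst(sent.append, burst)
print(f"burst of {len(sent)} packets took {elapsed * 1000:.3f} ms")
```

Comparing the measured duration against the update interval would show when
a burst starts to eat a meaningful share of it.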

>
>>>> * making babel trigger updates on newly appeared routes
>
>
>>> I've gone through different approaches for scheduling updates, and the
>>> current master is more aggressive with scheduling updates.  I'd need to
>>> check to make sure, but I believe it already does what you suggest.  If
>>> you have time, could you please check if current master improves things;
>>> if it doesn't, we need to work together to improve the implementation (no
>>> protocol changes will be needed).
>
>
>>> You could also try Toke's implementation, which is very well written.
>
>
>>>> * communicating the appearance of a route across the network outside
>>>> babel and inserting that at the gateway
>
>
>>> I'm not sure what you mean.
>
>
>>>> What do you think about those approaches?
>
>
>>> Please try current master.  If not, we'll need to think together about
>>> redesigning our approach to sending triggered updates.
>
>
>>> -- Juliusz
>
>
>>> _______________________________________________
>>> Babel-users mailing list
>>> Babel-users at lists.alioth.debian.org
>>> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
>
> --
> ()  ascii ribbon campaign - against html e-mail
> /\  against proprietary attachments
>
>



-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619


