[Babel-users] resuming 1.8.3 deployment

Dave Taht dave.taht at gmail.com
Thu Oct 4 16:40:19 BST 2018


On Thu, Oct 4, 2018 at 12:47 AM Valent Turkovic
<valent at otvorenamreza.org> wrote:
>
> Don't worry Dave, your insights are very valuable.

I'd meant to write up all I'd learned about running this network (and
all I got wrong) years ago, but lost heart when it hit 23 pages. I
didn't spend any time on this campus all summer (I got a 36 ft
sailboat for Christmas), and didn't get to it, again. Certainly the
bridge txcost thing was part of my own "institutional knowledge" that
I needed to preserve. Maybe I'll use the github wiki pages to try
and iterate on pieces of it.

That said, it's still probable I'm getting things wrong in multiple
respects. Only about 2/3 of it is currently running (in its heyday I
had about 80 machines on it, 40 of them in the lab).

As one example, I went nuts on optimizing out routes, exporting a ton
of covering routes. This was based on trying to hold routing
announcements to a bare minimum. One of the issues with a dv protocol
is that you don't get a picture of the whole network, and no doubt I'm
still doing many things wrong. Within the lab today I see ~60 routes
total.

ipv4:

root@gw1:~# ip route
default via 172.23.252.1 dev eth0 proto static # local comcast gw
24.6.136.0/21 via 172.22.0.1 dev br-lan proto babel onlink # exported
#   from the new comcast gw
50.197.142.144/29 via 172.22.0.1 dev br-lan proto babel onlink  # yet
#   another comcast gw
# arguably I don't need to export the addresses of the gateway networks
# I used to have 3 more comcast boxes but...
172.20.2.0/24 via 172.22.0.1 dev br-lan proto babel onlink # main cerowrt gw
172.20.2.5 via 172.22.0.1 dev br-lan proto babel onlink  # router to
#   elsewhere. I export this so that the common switch here can still
#   route if the cerowrt gw goes down. I used to not do this. I have
#   sometimes tried to export another covering route from each box on
#   this network
172.20.2.95 via 172.22.0.1 dev br-lan proto babel onlink # elsewhere
172.20.4.0/23 via 172.22.0.1 dev br-lan proto babel onlink  # an AP
#   with 2.4ghz and 5ghz routed separately
172.20.6.0/24 via 172.22.0.1 dev br-lan proto babel onlink  # another AP
172.20.6.1 via 172.22.0.1 dev br-lan proto babel onlink  # Mistake on
#   the 1.8.3 box
172.20.142.6 via 172.22.0.1 dev br-lan proto babel onlink  # p2p.
#   Originally this network was 172.20/16 and 142 was my p2p links, 143
#   p2p to the aps
# later on in other generations I widened it to 172.20/14 just because
#   I couldn't think clearly in
172.20.142.12 via 172.22.0.1 dev br-lan proto babel onlink # p2p. I
#   keep these globally because I like to see what's up and down,
#   but I could certainly see somehow filtering these out at some point (?)
172.20.142.187 via 172.22.0.1 dev br-lan proto babel onlink  # p2p
172.20.142.242 via 172.22.0.1 dev br-lan proto babel onlink # p2p
172.20.143.6 via 172.22.0.1 dev br-lan proto babel onlink # p2p
172.20.187.0/24 via 172.22.0.1 dev br-lan proto babel onlink # ap
172.20.222.0/24 via 172.22.0.1 dev br-lan proto babel onlink # ap -
#   mistake, should have a /23
172.20.223.0/24 via 172.22.0.1 dev br-lan proto babel onlink #
172.20.240.0/23 via 172.22.0.1 dev br-lan proto babel onlink
172.20.240.1 via 172.22.0.1 dev br-lan proto babel onlink # another
#   mistake, should just be the /23
172.20.241.1 via 172.22.0.1 dev br-lan proto babel onlink
172.20.242.0/23 via 172.22.0.1 dev br-lan proto babel onlink # yea
172.21.0.0/22 via 172.22.0.1 dev br-lan proto babel onlink # the
#   downbelow office network
172.21.2.5 via 172.22.0.1 dev br-lan proto babel onlink # shared
#   switch so if the main router goes down
172.21.2.6 via 172.22.0.1 dev br-lan proto babel onlink # same
172.21.142.186 via 172.22.0.1 dev br-lan proto babel onlink # ptp
172.21.186.0/24 via 172.22.0.1 dev br-lan proto babel onlink # ap on that p2p
172.22.0.0/24 dev br-lan proto kernel scope link src 172.22.0.2 # local lab
172.22.0.0/16 via 172.22.0.1 dev br-lan proto babel onlink # this is
#   all I normally export to the rest of the campus
172.22.0.172 via 172.22.0.172 dev br-lan proto babel onlink #
#   spaceheater - test box running the babeld-xnor branch
172.22.140.0/22 via 172.22.0.1 dev br-lan proto babel onlink # apu AP
#   running the latest fq_codel wifi code
172.22.220.0/22 via 172.22.0.91 dev br-lan proto babel onlink # bird box
172.23.4.0/24 via 172.22.0.1 dev br-lan proto babel onlink # pool ap
172.23.4.1 via 172.22.0.1 dev br-lan proto babel onlink # mistake I
#   think, except that that box is meshy and connected over conventional
#   wifi to another client
172.23.48.0/23 via 172.22.0.1 dev br-lan proto babel onlink # back-40 ap with
172.23.64.0/23 via 172.22.0.1 dev br-lan proto babel onlink # another
#   uap-lite ap
172.23.99.3 via 172.22.0.1 dev br-lan proto babel onlink # I have *no idea*
172.23.128.0/22 via 172.22.0.1 dev br-lan proto babel onlink # an
#   edgerouter x connecting a few networks together and a comcast box
172.23.142.2 via 172.22.0.1 dev br-lan proto babel onlink # p2p - the
#   .23 boxes are all now running 1.8.3, configured in a failover mode
172.23.142.3 via 172.22.0.1 dev br-lan proto babel onlink # p2p
172.23.142.4 via 172.22.0.1 dev br-lan proto babel onlink # p2p
172.23.142.6 via 172.22.0.1 dev br-lan proto babel onlink # p2p
172.23.142.10 via 172.22.0.1 dev br-lan proto babel onlink # p2p
172.23.142.48 via 172.22.0.1 dev br-lan proto babel onlink #
172.23.143.4 via 172.22.0.1 dev br-lan proto babel onlink # Ap p2p
172.23.244.0/24 via 172.22.0.1 dev br-lan proto babel onlink # little village
172.23.252.0/24 dev eth0 proto kernel scope link src 172.23.252.2 # local

(there's a whole segment of the network not visible currently)
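
As a rough illustration of what collapsing into covering routes means for
a few of the prefixes in the listing above, here is a minimal sketch using
Python's standard ipaddress module. The helper name covering_routes is
made up for this example, and the sketch assumes all of the prefixes are
originated by the same box, which the listing alone does not confirm.

```python
import ipaddress

# A few prefixes copied from the listing above. Treating them as if they
# were all exported by one router is an assumption made for illustration.
routes = [
    "172.20.222.0/24",
    "172.20.223.0/24",
    "172.20.240.0/23",
    "172.20.240.1",      # host route already covered by the /23
    "172.20.241.1",      # host route already covered by the /23
    "172.20.242.0/23",
]


def covering_routes(prefixes):
    """Return the smallest set of prefixes covering exactly the same addresses.

    collapse_addresses merges adjacent networks and drops any network
    that is contained in another one.
    """
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


collapsed = covering_routes(routes)
print(collapsed)
```

The two /24s merge into the /23 the comments say should have been used,
and the host routes inside 172.20.240.0/23 disappear. The tool also
merges the two adjacent /23s into a /22, which is only appropriate if
both really do live behind the same router.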

Most of the p2p boxes get somewhere between 12 and 50 Mbit/s.
Everything is HT20. The p2p boxes are mostly nanostations. More modern
hardware has had *much* less range (it's a 110 acre campus), so things
like the uap-lite and uap-mesh-lite, which have dual radios (yea!),
have only seen limited and depressing levels of deployment. I *really*
wanted to deploy a good dual radio box for outdoor use, but never found
one. Certainly the uaps have been very reliable and I use them as local
APs. The -ct firmware has been solid for about a year now.

ipv6:

One of the huge problems I ran into with ipv6, babel, and openwrt's
default configurations was that I would end up exporting something
like 12 ipv6 routes on every box, and because some of the ipv6
allocations were dynamic, not having a good way to collapse those into
a covering route was a blocker.
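
For the case where a box's ipv6 prefixes do come out of one contiguous
block, a single covering route can be computed mechanically. The sketch
below is only an illustration: the helper name common_cover is invented,
and the addresses come from the 2001:db8::/32 documentation range, not
from this network. It does not help when dynamic allocations land in
unrelated blocks, which is the harder case described above.

```python
import ipaddress


def common_cover(prefixes):
    """Return the longest single prefix that contains every given prefix.

    Starts from the first network and widens it one bit at a time until
    all of the inputs fit inside. All inputs must be the same IP version.
    """
    nets = [ipaddress.ip_network(p) for p in prefixes]
    cover = nets[0]
    while not all(n.subnet_of(cover) for n in nets):
        cover = cover.supernet()
    return cover


# Hypothetical per-interface /64s carved out of one delegated block.
per_interface = [
    "2001:db8:0:10::/64",
    "2001:db8:0:11::/64",
    "2001:db8:0:1f::/64",
]
cover = common_cover(per_interface)
print(cover)
```

Note that the covering prefix may include address space the box does not
actually own, so it is only safe to export when the whole block is
delegated to that box.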

Adding potentially (40*80) = 3200 (!) ipv6 routes by default was how I
started getting into trouble. My deployment crept up to about 600
routes as I tried to deploy ipv6, and the network started getting
noticeably flaky. I set a goal of one babel packet per speaker until
those issues got sorted out.
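
The arithmetic behind those numbers can be sketched as follows. Without
aggregation, every speaker ends up carrying every route that any other
speaker exports, so the table grows as speakers times routes per speaker.
Reading the 40*80 figure as routes per box times boxes is an
interpretation, and the function name is made up for this example.

```python
def worst_case_routes(speakers, routes_per_speaker):
    """Upper bound on routes in the table when nothing is aggregated."""
    return speakers * routes_per_speaker


# Assumed reading of the figures above: about 80 boxes, each exporting
# either 40 routes (the worst case) or about 12 (the observed default).
worst = worst_case_routes(80, 40)
typical = worst_case_routes(80, 12)
print(worst, typical)
```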

So I gave up (then) on ipv6. For example, I disable ula generation
entirely on openwrt, and only have a couple of ipv6 and dhcp-pd things
left (20 ipv6 routes total) as legacy. Totally blocking me from
finishing this deployment is that we seem to have a bug on mips and
x86 disabling comcast dhcpv6-pd entirely:

https://bugs.openwrt.org/index.php?do=details&task_id=1763&string=ipv6&search_name=&type%5B0%5D=&sev%5B0%5D=&pri%5B0%5D=&due%5B0%5D=&reported%5B0%5D=&cat%5B0%5D=&status%5B0%5D=open&percent%5B0%5D=&opened=&dev=&closed=&duedatefrom=&duedateto=&changedfrom=&changedto=&openedfrom=&openedto=&closedfrom=&closedto=

and I *do* have a bunch of things configured to use ipv6 on that
segment of the network.

The cerowrt project ended years ago. Funding for hnetd ended years ago.
Some folk on the cerowrt list are actually still trying to use that
stuff...

>
> I'm not familiar who is babeld maintainer so if there is someone more familiar with maintainer - do you know will 1.8.3 be available as an official package in OpenWrt package repos?
>
> On Oct 3 2018, at 10:43 pm, Dave Taht <dave.taht at gmail.com> wrote:
>
>
> there's still ~20 machines left to upgrade, 1/3 of the campus unreachable...
> but...
>
> http://flent-fremont.bufferbloat.net/~d/lupinnet.png
>
> Sorry for all the noise about the default route issue.
>
>
> --
>
> Dave Täht
> CEO, TekLibre, LLC
> http://www.teklibre.com
> Tel: 1-669-226-2619



-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619


