[Babel-users] babels bug with uninitialized data somewhere?

Dave Taht dave.taht at gmail.com
Fri Jun 27 21:10:47 UTC 2014


On Fri, Jun 27, 2014 at 1:40 PM, Juliusz Chroboczek
<jch at pps.univ-paris-diderot.fr> wrote:
>>> There's nothing obviously wrong.  Here's an example of a source-specific
>>> TLV (the one at time 06:20.362230 in the wifi capture):
>
>> So this does imply some sort of memory corruption issue on parsing
>> that "martian" packet?
>
> I don't want to make any guesses.  Matthieu's code has been through some
> churn lately, and we've fixed some minor bugs and typos.  Matthieu, what
> about putting the latest version in the public git?  I'm sure Dave can
> deal with our dirty rebasing habit.
>
>>> (you're probably the only person in the world who thinks /27 is a round
>>> number).
>>
>> 32 is a round number!
>
> Indeed.  It's the sum of two primes.

You have an off-by-two error in that statement, unless you were
discarding the broadcast address and base address.
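
The arithmetic behind the joke can be checked with Python's standard
ipaddress module. This is only a sketch: the 192.168.4.0/27 prefix is an
assumed example (borrowed from the /24 discussed further down), and the
is_prime and prime_pairs helpers are made up for illustration.

```python
import ipaddress

# Assumed example prefix; any IPv4 /27 gives the same counts.
net = ipaddress.ip_network("192.168.4.0/27")

total = net.num_addresses        # every address in the /27
usable = len(list(net.hosts()))  # leaves out the base and broadcast addresses


def is_prime(n):
    """Trial-division primality test, fine for numbers this small."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def prime_pairs(n):
    """All ways to write n as p + q with p <= q and both prime."""
    return [(p, n - p) for p in range(2, n // 2 + 1)
            if is_prime(p) and is_prime(n - p)]


print(total, usable)
print(prime_pairs(total))
print(prime_pairs(usable))
```

So the "off by two" is the difference between all addresses and usable
host addresses, and both counts happen to be sums of two primes.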

>
>> I would certainly like merely to export the /24 (and ipv6 /61) to the
>> universe from each box but have never figured out how.
>
> First, install a blackhole or unreachable route for the whole /24.  It's
> a good thing to do in any case, since it will shoot any packets that are
> destined for an interface that's currently down and might otherwise follow
> the default route:
>
>   ip route add unreachable 192.168.4.0/24 proto static
>
> (I prefer unreachable, since it makes debugging marginally easier, but
> Real Men (and Real Women) use blackhole routes.  I know, I'm a wimp.)

I like chatty interior networks also.

One thing I added recently was bcp38 support, and it helps if that is
chatty. bcp38 is a package I've been meaning to push up from
ceropackages into openwrt...

I pointed out an issue with rogue routers announcing things internally
like 75.75.75.75 which this package somewhat helps with also.

> Then export this route as usual (babeld doesn't care that it's unreachable
> -- as far as it's concerned, it's a perfectly good static route):
>
>   redistribute ip 0.0.0.0/0 le 24 allow
>
> No idea how to do it through UCI, you'll need to ask Gabriel.

Thank you, I'll try.
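
To see which routes the redistribute line quoted above would pick up,
here is a rough model of its selector. It assumes that "ip PREFIX" means
"destination inside PREFIX" and that "le 24" means "prefix length of at
most 24"; it ignores every other selector (protocol, interface, and so
on), and the route list is hypothetical.

```python
import ipaddress


def matches(route, within="0.0.0.0/0", le=24):
    """Approximate 'ip <within> le <le>' for a single destination prefix."""
    dest = ipaddress.ip_network(route)
    inside = dest.subnet_of(ipaddress.ip_network(within))
    return inside and dest.prefixlen <= le


# Hypothetical kernel routes: the unreachable /24, two per-interface /27s,
# and a default route.
routes = ["192.168.4.0/24", "192.168.4.0/27", "192.168.4.32/27", "0.0.0.0/0"]

exported = [r for r in routes if matches(r)]
print(exported)
```

Under this reading the /27s stay local and only the covering /24 is
announced. A default route in the kernel table would match as well, which
may or may not be wanted.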

>> I turn it off (it has a noisy fan) and for sane values of "boom", the
>> whole network switches over to going through the wan or adhoc ports.
>
> You appear to be running with a very high hello interval, so the value of
> boom is on the order of a minute, right?

It usually seems much faster than that... but I'll go measure. I have
a couple links I'd like to fail over faster than they do.

I am using the default hello intervals. Should I tighten that in this case?

Several 802.11ac wifi interfaces in the lab are bridged as well, and I
actually want them preferred, so I don't tell babel they are actually
wifi. (Part of the reason for the edgerouter upgrade was so I could
drive those 802.11ac devices at faster rates, and they are the only
thing here not running babel. Yet.)

Multicast throughout the network is set to 9 Mbit/s. The 172.20.142
devices are all NanoStation M5s with p2p links (which makes me want
unicast route updates one day), the APs are all on PicoStations with a
routed /24 each (143 for p2p), and most channels are unique.

I typically configure everything not to get default routes via DHCP. In
OpenWrt that's option 'defaultroute' '0', and in dhclient.conf on things
like Debian, you just kill the "routers" portion of the setup:

request subnet-mask, broadcast-address, time-offset, routers,
        domain-name, domain-name-servers, domain-search, host-name,
        dhcp6.name-servers, dhcp6.domain-search,
        netbios-name-servers, netbios-scope, interface-mtu,
        rfc3442-classless-static-routes, ntp-servers,
        dhcp6.fqdn, dhcp6.sntp-servers;

default routes are evil.
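
A minimal sketch of that edit, assuming the request statement shown above
is the stock one and that removing the "routers" option from it is what
is meant. The drop_option helper is hypothetical. Whether
rfc3442-classless-static-routes should go too is not stated here; that
option can also carry a default route, so the sketch leaves it alone.

```python
def drop_option(statement, option="routers"):
    """Return a dhclient 'request' statement with one option removed."""
    body = " ".join(statement.split())  # collapse line breaks and indentation
    if not (body.startswith("request ") and body.endswith(";")):
        raise ValueError("not a request statement")
    options = [o.strip() for o in body[len("request "):-1].split(",")]
    kept = [o for o in options if o and o != option]
    return "request " + ", ".join(kept) + ";"


stock = """
request subnet-mask, broadcast-address, time-offset, routers,
        domain-name, domain-name-servers, domain-search, host-name,
        dhcp6.name-servers, dhcp6.domain-search,
        netbios-name-servers, netbios-scope, interface-mtu,
        rfc3442-classless-static-routes, ntp-servers,
        dhcp6.fqdn, dhcp6.sntp-servers;
"""

print(drop_option(stock))
```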

>> Compared to what would have happened if I'd tried vlans or some other
>> bridging solution, this was marvelous.
>
> Isn't your use case exactly what STP was designed for?  Set the STP root
> to the edgerouter, and put an alternate root with slightly lower priority
> on your other router.

I guess I should do a network map, huh? If I were to bridge the entire
network (20+ wireless mesh connections spread over 110 acres, 30+
wired in various locations, mostly in the lab, 100+ users on the APs
on the weekends) bad things would happen.

As for bridging the lab, that is doable, but I'm trying to prototype
the next generation of the deployment and it's just easier to route
everything.

> (Unlike Babel, though, STP won't handle the meshy
> part of your network, and it won't attempt to optimise your traffic --
> everything will follow the STP tree, even when shortcuts are possible.

One of the weird ways I use babel is to be able to test a given device
through a given path, which I typically do by disabling babel on the
interface, downing a given interface, or doing filtering.

I can still shoot myself in the foot, however, as babel tries really
hard to find a path no matter what. More times than I can count I have
ended up testing a different path than what I thought I was testing.

And if I'm saturating the network, or using an artificial bottleneck or
wifi, it does sometimes very briefly (far less than 4 sec) lose a route
through it during an rrul test. I have long been tempted to change the
current delete/add logic for route updates to add/delete, now that I
have some trust in modern kernels.
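
The concern with the ordering can be shown with a toy routing table. This
is not babeld's kernel interface code; the table layout, names and
metrics are all made up, and a lookup here simply picks the lowest-metric
entry for a prefix.

```python
def lookup(table, prefix):
    """Next hop of the lowest-metric route for prefix, or None if there is none."""
    candidates = [r for r in table if r[0] == prefix]
    return min(candidates, key=lambda r: r[2])[1] if candidates else None


def delete_then_add(table, old, new):
    """Remove the old route first; record what a lookup sees after each step."""
    seen = []
    table.remove(old)
    seen.append(lookup(table, old[0]))  # window with no route at all
    table.append(new)
    seen.append(lookup(table, new[0]))
    return seen


def add_then_delete(table, old, new):
    """Install the new route first; a lookup always finds something."""
    seen = []
    table.append(new)
    seen.append(lookup(table, old[0]))  # old route still wins on metric
    table.remove(old)
    seen.append(lookup(table, new[0]))
    return seen


# Hypothetical routes: (prefix, next hop, metric)
old = ("192.168.4.0/24", "via-A", 10)
new = ("192.168.4.0/24", "via-B", 20)

print(delete_then_add([old], old, new))
print(add_then_delete([old], old, new))
```

In the first ordering there is a step where the lookup returns nothing,
and a busy CPU would stretch that step. In the second the prefix is
always covered, provided the kernel accepts both routes at once, which
the toy model simply assumes.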

(I do generally run with the non-juliusz-approved ecn patch, which
doesn't make any difference. My current theory as to why I lose a
route is interference or being out of cpu. Getting unicast updates
would fix that, but then we need a new metric, and running out of cpu
points to the window between delete/update being significant in some
circumstances.)

I do have some packet captures of when this happens now. Can get it to
happen about every 20 minutes or so at the moment on the devices I'm
testing through.

> Unlike TRILL, however, it's a simple, well-designed, clean protocol that
> works beautifully in the particular class of topologies it was designed for.)
>
> -- Juliusz



-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article


