[Babel-users] Babel MAC auth fails due to packet reordering

Daniel Gröber dxld at darkboxed.org
Wed May 4 12:37:06 BST 2022


Hi Toke,

On Wed, May 04, 2022 at 01:02:02PM +0200, Toke Høiland-Jørgensen wrote:
> Daniel, you say you noticed this when you "turned on fq_codel", but also
> that this is only happening over wireless links. Do you mean that you
> explicitly enabled fq_codel on the WiFi links (as opposed to using the
> built-in FQ-CoDel implementation in the WiFi stack), or is there an
> Ethernet hop involved? And how are the interface(s) in question
> configured (station/AP/mesh?) and which WiFi driver is used?

Right, so my router is a separate box, attached to an OpenWrt AP
(ubnt,unifiac-pro) via a switch. So there is an ethernet hop involved. The
AP is in normal infrastructure mode.

What I mean by fq_codel enablement is just setting
`sysctl net.core.default_qdisc=fq_codel` on the router. The AP already had
this set by default.
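One caveat: `net.core.default_qdisc` only applies to qdiscs attached after
it is set, so an interface that was already up may still carry its old
qdisc. A minimal sketch for checking what is actually attached follows. The
helper name `root_qdisc_kind` and the interface name `eth0` are assumptions
for illustration, not anything from my setup.

```shell
#!/bin/sh
# Read `tc qdisc show dev IFACE` output on stdin and print the kind of the
# root qdisc (e.g. fq_codel, mq, noqueue). Hypothetical helper.
root_qdisc_kind() {
    awk '$1 == "qdisc" && / root / { print $2; exit }'
}

# Usage (eth0 is an assumed interface name):
#   sysctl net.core.default_qdisc
#   tc qdisc show dev eth0 | root_qdisc_kind
```

If this prints something other than fq_codel, the sysctl has not taken
effect on that interface yet.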

Given that the problem doesn't happen over ethernet (see below), that's
probably a red herring though.

> Also, you mention the other side is running bird; does the reordering
> only happen with babeld as the sender?

babeld doesn't log auth failures AFAICT but the neighbour cost stays at
infinity there too so I assume it's having the same problem. I'm happy to
debug that too but I think we should take it one problem at a time :)

> I agree with Juliusz that 200ms seems a bit on the high side for such a
> delay, but if the channel is suffering a lot of congestion (not
> necessarily from the same station), I suppose it *could* take that long
> for the scheduling to get around to servicing the multicast queue *and*
> getting an airtime slot (multicast also runs at a very low bit rate, so
> the packet will take up more airtime, which will make it more prone to
> interference and thus retransmissions, causing further delay).

Where are you getting the 200ms number from exactly? I don't think my link
is congested. Certainly nothing was going on in my network at the time, and
none of the neighbours around here are using the 5GHz band, so I'd be
surprised if that was it.

I did just try it over ethernet and that isn't showing the same problem,
hmm. So the wireless link is somehow to blame still.

> Of course the diffserv hypothesis is quite easy to test: just disable
> diffserv and see if the problem goes away. I don't think babeld has a
> configuration knob for this, but you could clear it with an iptables
> rule like:
> 
> ip6tables -t mangle -A OUTPUT -o wlan0 -p udp --dport 6696 -j DSCP --set-dscp 0

I tried that with the output device adjusted to my ethernet device; it
didn't change anything.
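For reference, a sketch of the adjusted rule and of how one might confirm it
is matching. The helper name `dscp_clear_rule` and the default device `eth0`
are assumptions; the rule arguments are the ones quoted above with only the
output device changed.

```shell
#!/bin/sh
# IFACE is an assumption; substitute the real egress device.
IFACE="${IFACE:-eth0}"

# Print the mangle-table rule arguments from the quoted suggestion, with the
# output device given as $1. Hypothetical helper.
dscp_clear_rule() {
    printf '%s' "-t mangle -A OUTPUT -o $1 -p udp --dport 6696 -j DSCP --set-dscp 0"
}

# To apply and check (as root):
#   ip6tables $(dscp_clear_rule "$IFACE")
#   ip6tables -t mangle -L OUTPUT -v -n       # packet counter should grow
#   tcpdump -v -n -i "$IFACE" udp port 6696   # inspect the IPv6 header fields
```

The OUTPUT chain only sees locally generated packets, so the rule has to be
installed on the box running the routing daemon. If the packet counter stays
at zero, the rule is not matching the Babel traffic at all.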

--Daniel


