[Babel-users] Babel MAC auth fails due to packet reordering

Toke Høiland-Jørgensen toke at toke.dk
Wed May 4 12:02:02 BST 2022


Juliusz Chroboczek <jch at irif.fr> writes:

>> I'm attaching a filtered down pcap from the receiving side showing the
>> problem. Wireshark filter: (babel && ipv6.src == fe80::1) || icmpv6. The
>> sending side is running babeld the receiving side bird2 if that matters.
>
> Interesting, thanks.
>
> 1. Are you running with the "unicast" option set in your config file?
>
> 2. Is the link badly congested?  Your dump indicates that a packet got
>    delayed by no less than 200ms (!).
>
> Dave, Toke, please have a look at packets #10 and #11 in Daniel's dump,
> and let me know if you're as puzzled as I am.

Hmm, okay, so according to that dump it's the unicast packet overtaking
the multicast, not the other way around, as your initial message said;
is this accurate?

Daniel, you say you noticed this when you "turned on fq_codel", but also
that this is only happening over wireless links. Do you mean that you
explicitly enabled fq_codel on the WiFi links (as opposed to using the
built-in FQ-CoDel implementation in the WiFi stack), or is there an
Ethernet hop involved? And how are the interface(s) in question
configured (station/AP/mesh?) and which WiFi driver is used?

Also, you mention the other side is running bird; does the reordering
only happen with babeld as the sender?

Working from the assumption that this is happening when babeld is
running directly on a wireless link, it could have something to do with
station vs multicast scheduling. Specifically, there's a separate
per-interface multicast queue (but only one), and each station has four
queues, one for each AC value. Flow queueing is then applied within
each queue, so I doubt FQ is a direct cause of this.

One explanation could simply be the DSCP values: babeld marks packets as
CS6, which should put them in the 802.11 VO queue, giving them both
scheduling priority on the host and tighter back-off parameters that
let them get an airtime slot sooner. However, the Linux stack
doesn't do 802.11 AC prioritisation for multicast at all, so this only
works for unicast packets. This could explain the reordering: if the
unicast packet is sent right after the multicast packet, it could
overtake it in the queues. I agree with Juliusz that 200ms seems a bit
on the high side for such a delay, but if the channel is suffering a lot
of congestion (not necessarily from the same station), I suppose it
*could* take that long for the scheduling to get around to servicing the
multicast queue *and* getting an airtime slot (multicast also runs at a
very low bit rate, so the packet will take up more airtime, which will
make it more prone to interference and thus retransmissions, causing
further delay).
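
To make the CS6-to-VO step concrete, here is a rough sketch of the
classification, assuming the common default where the top three bits of
the DSCP value become the 802.11 user priority, which then maps to an
access category. The function name dscp_to_ac is made up for
illustration, and drivers or newer kernels may map some code points
differently, so treat this as an assumption rather than a description of
the actual kernel code:

```shell
# Hypothetical helper: map a DSCP value (0-63) to an 802.11 access category.
# Assumes user priority = top three bits of the DSCP value, followed by the
# standard user-priority-to-AC table (1,2 -> BK; 0,3 -> BE; 4,5 -> VI; 6,7 -> VO).
dscp_to_ac() {
    up=$(( $1 >> 3 ))
    case "$up" in
        1|2) echo BK ;;
        0|3) echo BE ;;
        4|5) echo VI ;;
        6|7) echo VO ;;
        *)   echo "invalid DSCP: $1" >&2; return 1 ;;
    esac
}

dscp_to_ac 48   # CS6, as set by babeld -> VO
dscp_to_ac 0    # unmarked traffic -> BE
```

Under these assumptions a unicast CS6 packet lands in VO, while the
multicast packet sent just before it gets no such treatment.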

Of course the diffserv hypothesis is quite easy to test: just disable
diffserv and see if the problem goes away. I don't think babeld has a
configuration knob for this, but you could clear the marking with an ip6tables
rule like:

ip6tables -t mangle -A OUTPUT -o wlan0 -p udp --dport 6696 -j DSCP --set-dscp 0
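
To check that the rule is actually matching babeld's traffic before
drawing conclusions, something like the following might help. This is
only a sketch: it assumes the same wlan0 interface as in the rule above,
and it needs root.

```shell
# Show packet/byte counters for the mangle OUTPUT chain; the counters on
# the DSCP rule should increase as babeld sends packets.
ip6tables -t mangle -L OUTPUT -v -n

# Remove the rule again once the test is done (same match as when adding it).
ip6tables -t mangle -D OUTPUT -o wlan0 -p udp --dport 6696 -j DSCP --set-dscp 0
```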

-Toke


