[Babel-users] RTT-aware branch of Babel

Dave Taht dave.taht at gmail.com
Fri Jun 21 16:31:41 UTC 2013


On Fri, Jun 21, 2013 at 6:42 AM, Baptiste Jonglez
<baptiste.jonglez at ens-lyon.fr> wrote:
> On Thu, Jun 20, 2013 at 03:40:00PM -0700, Dave Taht wrote:
>> So it sounds like distinguishing between gigE and 100Mbit won't happen
>> with a floor like this until the 100Mbit link gets loaded. How
>> unstable does it get with a rtt-min of 1 (or less) ? (100Mbit ethernet
>> is common with POE, GigE is almost everywhere else)
>
> This is certainly not a realistic way to distinguish between 100M and
> 1G links.

Well, link speed is easily detected by the local daemon, but what I've
been looking for is a means of mildly preferring higher-speed
ethernet links over slower ones - perhaps reducing the ethernet metric
from the default of 96 by X per link tier? (94 for GigE? 36?)
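
As a crude stopgap, pending anything smarter in the daemon itself, one
could key babeld's per-interface rxcost off of what ethtool reports.
A rough sketch only - the interface name and the cost values below are
illustrative, not tested recommendations:

# sketch: pick a lower rxcost for faster links; costs are made-up examples
SPEED=$(ethtool eth0 | awk '/Speed:/ {print $2}')
case "$SPEED" in
  1000Mb/s) COST=64 ;;    # mild preference for GigE
  100Mb/s)  COST=96 ;;    # babeld's wired default
  *)        COST=128 ;;   # 10Mbit or unknown
esac
echo "interface eth0 rxcost $COST" >> /etc/babeld.conf
# then restart babeld so it rereads its config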

As for distinguishing dynamically by measuring RTT: certainly
scheduling and queuing delay (which depend on load) matter FAR more,
but a baseline latency can be obtained over time...

Here is the baseline latency of a ping-pong netperf UDP_RR test (a
single 1-byte packet and response) on an unloaded link, changing the
link speed via ethtool -s advertise 0x008 (i.e. forcing 100baseT-Full):

(netperf -H enki -t UDP_RR)

1Gig: 6666.00 transactions/sec
100Mbit: 1999 transactions/sec
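
Inverting those (one transaction is one request plus one response),
that works out to roughly:

1 / 6666 transactions/sec ~ 150 us per round trip at GigE
1 / 1999 transactions/sec ~ 500 us per round trip at 100Mbit

so the baseline difference is measurable, just smaller than the ~1 ms
resolution you describe further down.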

ping
1Gig: 64 bytes from 172.20.39.2: icmp_req=2 ttl=64 time=0.135 ms
100Mbit: 64 bytes from 172.20.39.2: icmp_req=2 ttl=62 time=1.40 ms

(note, however, that there are kernel optimizations for ping that
piggyback on other packets being sent to the driver)

> For a start, Babel packets are extremely small, let's say around 100
> bytes.  The transmission time would be 8 μs at 100M, and less than 1
> μs at 1G.  At this scale, the time spent in queues is all but
> negligible,

? The whole point of the bufferbloat effort is that the time spent in
queues under load can be quite large. On 10Mbit ethernet links with the
default PFIFO_FAST limit of 1000 packets, queuing delays can be
measured in seconds.
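
To put a number on it, assuming MTU-sized frames:

1000 packets * 1500 bytes * 8 bits = 12 Mbit of backlog
12 Mbit / 10 Mbit/s  ~ 1.2 seconds to drain
12 Mbit / 100 Mbit/s ~ 120 ms
12 Mbit / 1 Gbit/s   ~ 12 ms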

> and even the physical propagation time can be in the same
> order of magnitude.

Queues really, really, really matter.

http://www.bufferbloat.net/projects/codel/wiki/RRUL_Rogues_Gallery

I am curious whether you were using the latest OpenWrt in your testing,
and whether you tested under heavy loads? The rrul test is good at
generating those. So is BitTorrent.

Barrier Breaker now uses fq_codel by default. The results are
generally pleasing, but might be misleading if you generalize from
that behavior.
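
For anyone who wants to try fq_codel on a stock Linux box rather than
waiting on an OpenWrt build, it's just something like the following
(eth0 being an example device name):

tc qdisc show dev eth0
tc qdisc replace dev eth0 root fq_codel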

I will try to put up some current GigE, 100Mbit and 10Mbit results
from current kernels comparing pfifo_fast, fq_codel, codel, and pie,
over the next month or so. My current work is validating a few GigE
boxes (Atoms) and things like the BeagleBoard and Raspberry Pi in
addition to CeroWrt...

Some results from yesterday on the atoms at GigE speeds:

The only reason why the pfifo_fast result is this good...

http://snapon.lab.bufferbloat.net/~d/Native_GigE_Atoms_NoOffloads-5873/Native_GigE_Atoms_NoOffloads-pfifo_fast-all_scaled-rrul.svg

is that this is a p2p test and "TCP small queues" is in effect. And
even then, the induced delay is enough to begin to go over your
threshold.

The behavior of codel by itself...

http://snapon.lab.bufferbloat.net/~d/Native_GigE_Atoms_NoOffloads-5873/Native_GigE_Atoms_NoOffloads-codel-all_scaled-rrul.svg

While fq_codel stays relatively flat, it still jumps quite a lot over
the baseline.

http://snapon.lab.bufferbloat.net/~d/Native_GigE_Atoms_NoOffloads-5873/Native_GigE_Atoms_NoOffloads-fq_codel_quantum_300-all_scaled-rrul.svg
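
(That last run had the fq_codel quantum reduced from its default of
roughly one MTU-sized packet down to 300 bytes; something along the
lines of "tc qdisc replace dev eth0 root fq_codel quantum 300", with
eth0 again just an example device.)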

I put up some slides on the bad behavior of 802.11e under load on
ath9k hardware at Battlemesh, as well.

> Also, the size of a Babel message can range from about 20 bytes
> (single Hello message) to something like 400 bytes (full route table
> dump), which would add way too much noise.

Not sure what your point is here. I thought it was just the hellos you
timestamped, and noise over time can be reduced...

The latter is a function of the size of the route table... I just did
a quick capture of my lab network, and the biggest babel packet was a
pleasingly sized 1459-byte packet (though it looks like it may actually
be two packets, the second of 169 bytes; my dissector on this box can't
tell the difference).

68 IPv4 routes, and presently 4 IPv6 routes.
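
(The capture itself was nothing fancy; something along the lines of the
following, with eth0 being whichever interface babel runs over:

tcpdump -i eth0 -n udp port 6696

6696 being babel's assigned UDP port.)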

>
>
> While making this distinction would be interesting, our method is
> well-suited for resolutions of about 1 ms (through averaging, the real
> resolution for samples is 10 ms) to hundreds of ms.

See above.

>
> Regards,
> Baptiste
>



-- 
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html


