[Babel-users] [babel] Accuracy of timestamps for babel-rtt: userspace vs kernel timestamps

Dave Taht dave.taht at gmail.com
Thu Jul 25 21:28:44 BST 2019


Thx very much for taking a look at this.

Another interesting test is to put a workload on the box (flent's rrul
test is what I use, but a couple of netperfs or iperfs in both
directions is sufficient) and see what happens with fifo and fq_codel,
now that you got my (crappy) first try working...

The measurements you get are about what I got using different methods
ages ago - that we cannot trust a kernel to userspace transition, on
bare x86 metal, to be accurate to much below 250us. Containers/VMs are
worse, you can do mildly better with an RT kernel, and I expect MIPS
to be abysmal. But I can go try that to see what happens. ARM
(particularly multicore ARM), I have no idea: the context switch
overhead on pre-speculation ARM chips was demonstrably lower than on
x86, and context switch overhead got much worse on everything after
the Spectre CVEs.
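
As a rough illustration of how the kernel-to-userspace gap could be
measured, here is a minimal Python sketch that asks Linux for a kernel
receive timestamp on a UDP socket and compares it with a userspace clock
read taken right after the receive call returns. It is not the babeld
code. The helper names are hypothetical, the numeric fallback 35 for
SO_TIMESTAMPNS is an assumption taken from Linux's asm-generic headers,
and the decoding assumes a native struct timespec made of two C longs
(64-bit Linux).

```python
import socket
import struct
import time

# Assumption: Linux value from asm-generic/socket.h, used when Python does
# not export the name. The control message type equals the option value.
SO_TIMESTAMPNS = getattr(socket, "SO_TIMESTAMPNS", 35)
SCM_TIMESTAMPNS = SO_TIMESTAMPNS

TIMESPEC_FMT = "@ll"  # native struct timespec: tv_sec, tv_nsec (assumed layout)
TIMESPEC_SIZE = struct.calcsize(TIMESPEC_FMT)


def timespec_to_ns(payload: bytes) -> int:
    """Decode a native struct timespec into integer nanoseconds."""
    sec, nsec = struct.unpack(TIMESPEC_FMT, payload[:TIMESPEC_SIZE])
    return sec * 1_000_000_000 + nsec


def enable_kernel_rx_timestamps(sock: socket.socket) -> None:
    """Ask the kernel to attach a receive timestamp to each datagram."""
    sock.setsockopt(socket.SOL_SOCKET, SO_TIMESTAMPNS, 1)


def recv_with_kernel_timestamp(sock: socket.socket, bufsize: int = 2048):
    """Receive one datagram; return (data, kernel_rx_ns, userspace_rx_ns).

    kernel_rx_ns is None if no timestamp control message was delivered.
    """
    data, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_SPACE(TIMESPEC_SIZE)
    )
    # time.time_ns() is wall-clock time, comparable to the kernel's
    # realtime-clock stamp.
    user_ns = time.time_ns()
    kernel_ns = None
    for level, ctype, payload in ancdata:
        if level == socket.SOL_SOCKET and ctype == SCM_TIMESTAMPNS:
            kernel_ns = timespec_to_ns(payload)
    return data, kernel_ns, user_ns
```

On a bound UDP socket, calling enable_kernel_rx_timestamps() once and
then logging user_ns - kernel_ns for each received datagram, with and
without a background load, gives a per-packet estimate of the delivery
delay discussed above.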

There's another kernel setsockopt nowadays that might be useful, to set
a pacing rate. So far as I recall, that got made to work with UDP
around 4.12 in support of QUIC and BBR.
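
The option is not named above; the sketch below assumes it is
SO_MAX_PACING_RATE, which takes a rate in bytes per second. The numeric
fallback 47 is an assumption from Linux's asm-generic headers, the helper
names are hypothetical, and whether pacing is actually enforced for a
given socket depends on the kernel version and the qdisc in use (fq, as
far as I know).

```python
import socket

# Assumption: SO_MAX_PACING_RATE is the option meant; 47 is its value in
# Linux's asm-generic/socket.h, used when Python does not export the name.
SO_MAX_PACING_RATE = getattr(socket, "SO_MAX_PACING_RATE", 47)


def bits_to_bytes_per_sec(rate_bps: int) -> int:
    """Convert a link rate in bit/s to the bytes/s unit the option expects."""
    return rate_bps // 8


def set_pacing_rate(sock: socket.socket, rate_bps: int) -> None:
    """Request a maximum pacing rate for this socket.

    Python passes the value as a C int, so this sketch only handles
    rates below 2**31 bytes per second.
    """
    sock.setsockopt(
        socket.SOL_SOCKET, SO_MAX_PACING_RATE, bits_to_bytes_per_sec(rate_bps)
    )
```

For example, set_pacing_rate(sock, 8_000_000) would ask for at most
1,000,000 bytes per second on that socket.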

Regardless, I don't think nanosecond resolution is needed, but I still
think the usec resolution could be useful on short-RTT metrics,
partially as a measurement of congestion or CPU overload.

A full-size packet takes about 13ms to transit the link at 1mbit and
13us at a gbit. Wifi takes 700us to grab the media, and we typically
have two txops of up to 5.3ms in size stacked up. So some
differentiation as to quality here is possible...
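
The arithmetic behind those figures can be sketched as follows. The
1538-byte on-the-wire Ethernet frame size is an assumption (1500-byte MTU
plus header, FCS, preamble and inter-frame gap), which gives about 12.3ms
at 1Mbit/s and 12.3us at 1Gbit/s, in line with the rough numbers above.
Treating the wifi case as media access time plus two full txops is one
interpretation of the figures quoted, not a measured bound; the function
names are hypothetical.

```python
# Assumption: 1500 MTU + 14 header + 4 FCS + 8 preamble + 12 inter-frame gap.
ETHERNET_WIRE_BYTES = 1538


def serialization_delay_us(frame_bytes: int, link_bps: float) -> float:
    """Time in microseconds to clock frame_bytes onto a link of link_bps."""
    return frame_bytes * 8 / link_bps * 1e6


def wifi_wait_bound_us(access_us: int = 700, txop_us: int = 5300,
                       queued_txops: int = 2) -> int:
    """Rough wait: media access time plus the txops already queued ahead."""
    return access_us + queued_txops * txop_us


if __name__ == "__main__":
    for label, bps in (("1 Mbit/s", 1e6), ("1 Gbit/s", 1e9)):
        delay = serialization_delay_us(ETHERNET_WIRE_BYTES, bps)
        print(f"{label}: {delay:.1f} us")
    print(f"wifi wait bound: {wifi_wait_bound_us()} us")
```

The spread between roughly 12us on gigabit Ethernet and several
milliseconds on wifi is well above the 1us timestamp resolution, which is
why link quality could in principle be told apart from short RTTs.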

On Thu, Jul 25, 2019 at 9:52 AM Baptiste Jonglez
<baptiste.jonglez at imag.fr> wrote:
>
> Hello,
>
> A recent discussion with Dave convinced me to start looking at whether
> very short RTTs make any sense in Babel, and whether they could be used to
> infer link speed.  If only to settle these questions for good.  Since the
> related subject of nanosecond-resolution timestamps was brought up by Toke
> at the IETF session yesterday, I made a quick test with kernel timestamps
> today.
>
> I'll talk about the implementation and shortcomings of using kernel
> timestamps below, but here are some rough timing results.  I just used a
> veth pair on my laptop (4.17 kernel) with a babeld on each side of the
> pair, and didn't do any serious statistics.
>
> - regular babeld: average measured RTT ~320 µs (quite variable)
>
> - babeld with kernel RX timestamps: average measured RTT ~120 µs (quite variable)
>
> - ping through the same veth pair (link-local IPv6, 1000 packets): average 105 µs, minimum 16 µs
>
> The observant reader will notice that the current resolution (1 µs) is
> more than enough in that case.  Also, using kernel timestamps improves
> accuracy by about 100 µs on each host, a somewhat significant improvement.
>
>
> Now, regarding the implementation of kernel timestamps in babeld, my test code is here:
>
>   https://github.com/jonglezb/babeld/commit/56756a8cbe9a0b8a168c78873dd77e48e5770278
>
> Thank you Dave for your first draft of this code, I borrowed a bit from it ;)
>
> As explained in the commit message, it suffers from a number of issues
> that would need some serious work before it's really usable:
>
> - kernel timestamps use the realtime clock, and it's not configurable.
>   This is really annoying because babeld uses the monotonic clock (for
>   good reasons).  Using kernel timestamps forces us to fall back to the
>   realtime clock elsewhere in babeld, and it would require some work to do
>   it cleanly.
>
> - kernel timestamps are only used for received packets.  The sending side
>   (timestamp in Hello) still uses userspace timestamps, with the ensuing
>   accuracy issue.  It does not seem possible to tell the kernel to embed a
>   timestamp at a specified location in a packet just before sending it,
>   unless maybe playing with eBPF.
>
>
> Takeaway: at least on Linux, I don't see a use-case for nanosecond
> resolution timestamps.  If somebody ever writes an implementation of Babel
> for specialized hardware and runs a datacenter with ultra-low-latency
> network equipment, it could possibly still make sense.
>
>
> --
> Baptiste Jonglez
> PhD student
> Univ. Grenoble Alpes <https://www.univ-grenoble-alpes.fr/>
> LIG lab <https://www.liglab.fr/>
> Drakkar team <http://drakkar.imag.fr/>  |  Polaris team at INRIA <https://team.inria.fr/polaris/>
> _______________________________________________
> babel mailing list
> babel at ietf.org
> https://www.ietf.org/mailman/listinfo/babel



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740


