[Babel-users] us vs ns resolution in babel rtt branch

Dave Taht dave.taht at gmail.com
Thu Mar 19 03:33:04 UTC 2015


Short note... I have a longer one, but it's late, and I figure Jesper's
email to babel-users bounced...

On Wed, Mar 18, 2015 at 8:12 PM, Jesper Dangaard Brouer
<jbrouer at redhat.com> wrote:
>
> On Wed, 18 Mar 2015 13:46:34 -0700 Dave Taht <dave.taht at gmail.com> wrote:
>
>> There have been many astounding improvements in latency in the linux stack
>> of late, jesper (cc'd) has been doing tons of work in this area,
>> another is in FIB tables, and there are others
>> like low latency udp short circuits that I am not tracking well.
>>
>> Jesper, what sort of numbers would you get along the lines of
>> Baptiste's benchmark at 10GigE these days?
>
> I do a lot of testing where I try to isolate different parts of the kernel
> network stack, in an attempt to reduce latency in different areas.
> These tests don't represent what can be achieved in full-stack use-cases.
>
> For a more realistic workload, where more of the stack is exercised,
> I did some single-core IP forwarding tests (before the FIB improvements),
> which came in at around 1Mpps (million packets per sec).
>
> 1Mpps translates into 1000ns latency
>  * (1/1000000)*10^9 = 1000 ns
>
> When using bridging (avoiding the FIB lookup) and tuning the stack as much
> as possible, e.g. compiling out the netfilter bridge hooks, I could
> reach approx 2Mpps (bridge forwarding, single CPU core).
>
> 2Mpps translates into 500ns latency
>  * (1/2000000)*10^9 = 500 ns
>
>
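
For what it's worth, here is a tiny C sketch of that pps-to-latency
conversion (the rates below are just the figures Jesper quotes, not new
measurements):

/* pps_to_latency.c -- average per-packet processing time implied by a
 * packets-per-second forwarding rate, as in the figures quoted above. */
#include <stdio.h>

static double pps_to_ns(double pps)
{
    /* one second divided by the packet rate, in nanoseconds */
    return 1e9 / pps;
}

int main(void)
{
    printf("1 Mpps -> %.0f ns/packet\n", pps_to_ns(1e6)); /* 1000 ns */
    printf("2 Mpps -> %.0f ns/packet\n", pps_to_ns(2e6)); /*  500 ns */
    return 0;
}
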
>> On Wed, Mar 18, 2015 at 6:28 AM, Baptiste Jonglez <
>> baptiste at bitsofnetworks.org> wrote:
>>
>> > On Sun, Mar 15, 2015 at 03:00:12PM -0700, Dave Taht wrote:
>> >
>> > > I still kind of like getting down to ns resolution here. A single
>> > > 64 byte packet at 10gigE takes like 64ns. I have discussed
>> > > elsewhere my hopes for using babel messages to determine the
>> > > capacity and utilization of a given network interface before,
>> > > depending on fq_codel's behavior as a substrate....
>> >
>> > I still believe other factors dwarf these 64 ns...
>
> It is 67.2ns for the smallest packet size at 10G (64 bytes -> 84 bytes on
> the wire due to Ethernet overhead).
>
> Calc: ((64+20)*8)/((10000*10^6))*10^9 = 67.2ns
> See:
>   [1]http://netoptimizer.blogspot.co.nz/2014/05/the-calculations-10gbits-wirespeed.html
> and
>   [2] http://netoptimizer.blogspot.co.nz/2014/09/packet-per-sec-measurements-for.html
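
As a sketch, here is the same wire-time calculation in C, with the link
rate as a parameter (the extra 20 bytes per frame are the preamble, SFD
and inter-frame gap described in Jesper's posts above):

/* wire_time.c -- serialization time of one Ethernet frame on the wire,
 * adding the 20 bytes of preamble + SFD + inter-frame gap per frame. */
#include <stdio.h>

static double wire_time_ns(double frame_bytes, double link_bps)
{
    return (frame_bytes + 20.0) * 8.0 / link_bps * 1e9;
}

int main(void)
{
    /* 64 and 1518 bytes are the min and max frames incl. header + FCS */
    printf("64B   @ 10G: %.1f ns\n", wire_time_ns(64, 10e9));   /*   67.2 ns */
    printf("1518B @ 10G: %.1f ns\n", wire_time_ns(1518, 10e9)); /* 1230.4 ns */
    return 0;
}
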
>
>
>> > I had done some measurements with the µs-precision code, on a direct
>> > gigabit link between two hosts (no switch):
>> >
>> >
>> > http://files.polyno.me/babel/evalperf/figures/32bits-rtt-ethernet-thinkbad-gilead.svg
>> >
>
> Interesting!
>
>> > As you notice, ping6 reports 400 µs, babeld reports 800 µs, while
>> > the theoretical latency is 512 ns for a 64-byte packet and 12 µs
>> > for a full 1500-byte packet.  I am neglecting propagation delays
>> > (that would amount to about 50 ns for a 10-meter cable).
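
Those theoretical numbers do check out; a quick sketch (the ~2/3 c signal
speed in copper is my assumption, just to reproduce the ~50 ns propagation
figure):

/* theory_check.c -- serialization and propagation figures quoted above. */
#include <stdio.h>

int main(void)
{
    double gig = 1e9;  /* 1 Gbit/s link rate            */
    double v   = 2e8;  /* ~2/3 c signal speed in copper */

    printf("64B serialization:   %.0f ns\n", 64.0 * 8 / gig * 1e9);   /* 512 ns */
    printf("1500B serialization: %.0f us\n", 1500.0 * 8 / gig * 1e6); /*  12 us */
    printf("10m propagation:     %.0f ns\n", 10.0 / v * 1e9);         /*  50 ns */
    return 0;
}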

Well, it seems likely that any of several things could be happening.

1) In many NICs, ping (and other sorts of packets) is de-optimized via
interrupt coalescing (the ethtool -C options) and/or via NAPI; without
load it is slowed all the more.

2) When not under load, I imagine userspace->kernel->device-driver
transfers are potentially slower (due to cache misses).

3) Your tests predate the xmit_more bulk-transmit patches.

4) Your rx timestamp is acquired at the userspace level, not via one of
the SO_TIMESTAMP options at the device dequeue level (see the sketch
after this list).

5) The IPv6 multicast and/or unicast path is unoptimized.

6) Your hardware is weak.
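
On point 4, here is a minimal sketch of what kernel-side receive
timestamping looks like on Linux with SO_TIMESTAMPNS (purely
illustrative; this is not babeld's code, and binding to babel's port
6696 is just for the example):

/* rx_timestamp.c -- receive a UDP datagram and read the kernel's
 * receive timestamp from ancillary data, instead of calling
 * clock_gettime() in userspace after recv() returns.
 * Illustrative only; error handling mostly omitted. */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);
    int on = 1;

    /* ask the kernel to timestamp incoming packets with a timespec */
    setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof(on));

    struct sockaddr_in6 addr = { .sin6_family = AF_INET6,
                                 .sin6_port   = htons(6696) };
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    char buf[2048];
    char ctrl[CMSG_SPACE(sizeof(struct timespec))];
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                          .msg_control = ctrl,
                          .msg_controllen = sizeof(ctrl) };

    if (recvmsg(fd, &msg, 0) < 0)
        return 1;

    /* the timestamp arrives as ancillary (control) data */
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMPNS) {
            struct timespec ts;
            memcpy(&ts, CMSG_DATA(c), sizeof(ts));
            printf("rx at %ld.%09ld\n", (long)ts.tv_sec, ts.tv_nsec);
        }
    }
    return 0;
}

This still only stamps the packet where the driver hands it to the
stack; timestamps taken in the NIC hardware itself need the fuller
SO_TIMESTAMPING interface.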


>> > So, the actual latency (measured either with ping or babel) is 3
>> > orders of magnitude higher than the theoretical latency.  Do note
>> > that my tests were done with low-end hardware, so it might be
>> > better with high-performance NICs.
>> >
>> > Do you have latency measurements on 10G links?  If so, what tool do
>> > you use?  I believe both ping and babel are not very accurate at
>> > these timescales.
>
> Wrote some blog posts about the measurements I'm doing, [2] and [3].  I'm
> basically deriving the latency from the PPS measurements (which gives
> basically the average latency, and it also requires single-core
> isolation).
>
> Network setup for accurate nanosec measurements:
>  [3] http://netoptimizer.blogspot.dk/2014/09/network-setup-for-accurate-nanosec.html
>
> Thanks for the pitch Dave ;-)

I applaud from across the water, always.

> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Sr. Network Kernel Developer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer



-- 
Dave Täht
Let's make wifi fast, less jittery and reliable again!

https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb


