[Babel-users] Optimised xroute updates

Dave Taht dave.taht at gmail.com
Fri Oct 26 19:13:52 BST 2018


This was by no means a test of the xroute import and export scoping I
was doing before, nor of "correctness"...

But CPU-wise it looks very good on a 5-minute test with 16,000 rtod
routes inserted, on Ubuntu 18 with gcc-7. Basically the test was:
rtod -r 16000; babeld -D; sleep 300; killall babeld

gprof:

* This is the before:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 53.27     33.44    33.44  1435659     0.02     0.02  find_xroute
 45.72     62.14    28.70                             check_xroutes
  0.19     62.26     0.12  1738669     0.00     0.00  find_route_slot
  0.11     62.33     0.07  1090297     0.00     0.00  normalize_prefix
  0.11     62.40     0.07      776     0.09     0.12  netlink_read
  0.10     62.47     0.07                             compare_buffered_updates
  0.06     62.51     0.04  1090297     0.00     0.00  network_prefix
  0.05     62.54     0.03  2636135     0.00     0.00  do_filter
  0.05     62.57     0.03                             parse_packet
  0.03     62.59     0.02  1735845     0.00     0.00  roughly
  0.03     62.61     0.02  1702862     0.00     0.00  timeval_minus_msec
  0.03     62.63     0.02   848988     0.00     0.01  buffer_update
  0.03     62.65     0.02   842762     0.00     0.00  really_buffer_update
  0.03     62.67     0.02   752705     0.00     0.00  filter_route
  0.03     62.69     0.02       97     0.21   204.72  flushupdates
  0.02     62.70     0.02      636     0.02     0.02  accumulate_int
  0.02     62.71     0.01  1868952     0.00     0.00  martian_prefix
  0.02     62.72     0.01   851847     0.00     0.00  start_message
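
For context: find_xroute soaking up half the samples across ~1.4M
calls against a 16,000-entry table is the classic linear-scan
signature. Roughly this shape (a sketch only, with illustrative field
names, not the actual babeld source):

#include <string.h>

struct xroute {
    unsigned char prefix[16];
    unsigned char plen;
    /* metric, ifindex, etc. omitted */
};

static struct xroute xroutes[16384];
static int numxroutes;

/* O(n) scan: ~1.4M calls, each touching up to 16k entries, is on
   the order of 10^10 comparisons in 5 minutes, which is why this
   one function accounts for 53% of the profile. */
struct xroute *
find_xroute(const unsigned char *prefix, unsigned char plen)
{
    int i;
    for(i = 0; i < numxroutes; i++) {
        if(xroutes[i].plen == plen &&
           memcmp(xroutes[i].prefix, prefix, 16) == 0)
            return &xroutes[i];
    }
    return NULL;
}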

* Here is the "after":

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 13.00      0.13     0.13                             compare_buffered_updates
 12.00      0.25     0.12   937310     0.00     0.00  find_xroute_slot
 11.00      0.36     0.11  1656430     0.00     0.00  normalize_prefix
 10.00      0.46     0.10                             parse_packet
  7.00      0.53     0.07     1528     0.05     0.06  netlink_read
  6.00      0.59     0.06 13080298     0.00     0.00  xroute_compare
  5.00      0.64     0.05   903495     0.00     0.00  really_buffer_update
  4.00      0.68     0.04  1863732     0.00     0.00  roughly
  3.00      0.71     0.03    67448     0.00     0.00  find_source
  3.00      0.74     0.03  1920065     0.00     0.00  find_route_slot
  3.00      0.77     0.03  1852599     0.00     0.00  timeval_minus_msec
  3.00      0.80     0.03  1656430     0.00     0.00  network_prefix
  2.00      0.82     0.02  1089020     0.00     0.00  filter_route
  2.00      0.84     0.02   909654     0.00     0.00  buffer_update
  2.00      0.86     0.02      169     0.12     2.06  flushupdates
  2.00      0.88     0.02                             kernel_route_compare
  1.00      0.89     0.01  3597870     0.00     0.00  v4mapped
  1.00      0.90     0.01  1904274     0.00     0.00  do_filter
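
The interesting change: find_xroute is gone, and find_xroute_slot plus
a cheap comparator (xroute_compare, 13M calls for almost no time) do
the work instead, which looks like the xroute table became a sorted
array probed by binary search. My guess at the shape of it, a sketch
under that assumption rather than the actual patch:

#include <string.h>

struct xroute {
    unsigned char prefix[16];
    unsigned char plen;
};

static struct xroute xroutes[16384];
static int numxroutes;

/* Total order over xroutes; named after the symbol in the profile,
   the body is conjecture. */
static int
xroute_compare(const unsigned char *prefix, unsigned char plen,
               const struct xroute *xr)
{
    if(plen != xr->plen)
        return plen < xr->plen ? -1 : 1;
    return memcmp(prefix, xr->prefix, 16);
}

/* Binary search: returns the matching slot, or -1; on a miss it can
   report the insertion point so callers keep the array sorted. */
static int
find_xroute_slot(const unsigned char *prefix, unsigned char plen,
                 int *new_return)
{
    int lo = 0, hi = numxroutes - 1;
    while(lo <= hi) {
        int mid = (lo + hi) / 2;
        int cmp = xroute_compare(prefix, plen, &xroutes[mid]);
        if(cmp == 0)
            return mid;
        else if(cmp < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    if(new_return)
        *new_return = lo;
    return -1;
}

At 16k routes that is about 14 comparisons per lookup, which matches
the numbers above: the xroute machinery drops from over 60 seconds of
self time to a fraction of a second, leaving update buffering and
packet parsing as the remaining hot spots.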


I do note that 16,000 routes completely "does in" all the other
routers on the network (I didn't apply my "late hello" patch, but
late hellos are usually the first sign). The "before" code *also*
essentially makes the injecting router drop off the net because it's
too busy dealing with the load; I imagine this causes retractions and
other fun side effects.

the "after "one, also falls off the net to some extent... packet loss?
a problem with the patch? just flat out running out of cpu again?

A view from another gateway:

root@gw1:~# ip -6 route | grep unreach | wc -l
5524
root@gw1:~# ip -6 route | wc -l
16025

Not having the injecting router drop off the net seems to make the
other routers eat less CPU and stay "more alive", but that's a very
subjective observation; I have some tools coming up to monitor this
behavior (long flent tests). Like I said, with the new version my
route tables subjectively stayed pretty stable, the IPv6 routes were
distributed mostly successfully, and nothing crashed. (Mostly running
my soon-to-be-obsolete xnor branch, though.)

THANK you. I'm very tempted now to shoot for 64k routes, late at
night, when nobody will notice. :P

* Lastly, for giggles, I compiled the new code with -O3 -msse4.2 (gcc 7):

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 20.69      0.12     0.12                             parse_packet
 18.97      0.23     0.11   845183     0.00     0.00  find_xroute
  8.62      0.28     0.05   838830     0.00     0.00  really_buffer_update
  6.90      0.32     0.04                             compare_buffered_updates
  5.17      0.35     0.03  1877721     0.00     0.00  normalize_prefix
  5.17      0.38     0.03  1708480     0.00     0.00  find_installed_route
  5.17      0.41     0.03      250     0.12     0.94  flushupdates
  4.31      0.44     0.03  1762446     0.00     0.00  update_route
  3.45      0.46     0.02  1734377     0.00     0.00  roughly
  3.45      0.48     0.02     1178     0.02     0.02  netlink_read.constprop.4
  3.45      0.50     0.02       47     0.43     1.54  send_self_update
  2.59      0.51     0.02                             kernel_route_compare
  1.72      0.52     0.01  1720798     0.00     0.00  timeval_minus_msec
  1.72      0.53     0.01   848113     0.00     0.00  buffer_update
  1.72      0.54     0.01   111293     0.00     0.00  send_update
  1.72      0.55     0.01    32162     0.00     0.00  find_source
  1.72      0.56     0.01     4463     0.00     0.01  handle_request
  1.72      0.57     0.01                             babel_recv
  0.86      0.58     0.01  1169095     0.00     0.00  filter_route
  0.86      0.58     0.01      196     0.03     0.03  flush_interface_routes


