[Babel-users] How to speedup converge and how to set preferred source

Lifan Su thssld at gmail.com
Mon Dec 24 19:43:41 GMT 2018


Hello,

I am trying to use tinc + babel to build a mesh network. So far I have built a
simple loop between 3 nodes and tried disconnecting one of the links. I have run
into two problems and am looking for help.

System configuration:

3 virtual machines on the same network (192.168.174.96,100,104/24, not shown),
connected by tunnels created with tinc. Each VM contains an lxc host, which
communicates with it through a linux bridge and veth.


Topology:

172.31.224.10/24 (lxc host)                       172.31.225.10/24 (lxc host)
    |                                                                  |
  linux-bridge                                                    linux-bridge
    |                                                                  |
172.31.224.1/24                                              172.31.225.1/24
  test_1 192.168.227.9/30 <----- test_1_2 -----> 192.168.227.10/30 test_2
192.168.227.18/30                (tunnel)                 192.168.227.13/30
    |                                                                  |
 test_3_1 ---> 192.168.227.17/30  test_3  192.168.227.14/30 <--- test_2_3
 (tunnel)                   172.31.226.1/24                       (tunnel)
                                   |
                             linux bridge
                                   |
                            172.31.226.10/24 (lxc host)


Babeld version: 1.8.2
Configuration used for router test_1 (similar for the other nodes):

debug 1
interface test_1_2 type tunnel link-quality true
interface test_3_1 type tunnel link-quality true
redistribute ip 172.31.224.0/24 metric 64
redistribute local deny

Problem 1:
Once the routes have converged, the lxc hosts can ping each other and can also
ping the routers (test_{1,2,3}). But when I disconnect a link, e.g. test_3_1,
connectivity between test_1 and test_3 is interrupted for ~250 sec.
The log shows that test_1 is stuck in the following state for most of that period:

My id 44:05:c1:70:07:7f:1f:69 seqno 328
Neighbour fe80::6c5a:b6ff:fe73:9458 dev test_1_2 reach ffff ureach
0000 rxcost 96 txcost 96 rtt 0.521 rttcost 0 chan -2.
172.31.224.0/24 metric 64 (exported)
172.31.225.0/24 from 0.0.0.0/0 metric 160 (165) refmetric 64 id
dc:c7:bb:33:bc:db:aa:25 seqno 41970 age 0 via test_1_2 neigh
fe80::6c5a:b6ff:fe73:9458 nexthop 192.168.227.10 (installed)
172.31.226.0/24 from 0.0.0.0/0 metric 256 (261) refmetric 160 id
28:12:e7:16:18:1e:09:f5 seqno 28186 age 0 via test_1_2 neigh
fe80::6c5a:b6ff:fe73:9458 nexthop 192.168.227.10

Once the route for 172.31.226.0/24 becomes feasible, it is quickly installed and
connectivity is restored.

So are there any options to make this process faster? I have tried lowering
hello-interval, but that only shortens the outage to around 200 sec.
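
For reference, the feasibility condition mentioned above can be sketched as
follows, based on my reading of the Babel specification. The function names
are made up for illustration, and the stored feasibility distance
(seqno 28186, metric 64) is an assumption: it is what test_1 would hold if it
had previously accepted the direct route from test_3 at metric 64. Only the
update values (seqno 28186, refmetric 160) come from the log.

```python
from typing import Optional, Tuple

INFINITY = 0xFFFF  # Babel's "infinite" metric, carried by retractions


def seqno_newer(a: int, b: int) -> bool:
    """Return True if seqno a is strictly newer than b, modulo 2**16."""
    return a != b and ((a - b) & 0xFFFF) < 0x8000


def is_feasible(update: Tuple[int, int],
                fd: Optional[Tuple[int, int]]) -> bool:
    """Check an update (seqno, refmetric) against a stored feasibility
    distance (seqno, metric) for the same prefix and router-id."""
    seqno, refmetric = update
    if refmetric >= INFINITY:
        return True   # retractions are always feasible
    if fd is None:
        return True   # nothing recorded yet for this source
    fd_seqno, fd_metric = fd
    return seqno_newer(seqno, fd_seqno) or (
        seqno == fd_seqno and refmetric < fd_metric
    )


if __name__ == "__main__":
    assumed_fd = (28186, 64)  # assumption, not taken from the log
    print(is_feasible((28186, 160), assumed_fd))  # same seqno, worse metric
    print(is_feasible((28187, 160), assumed_fd))  # newer seqno
```

Under that assumption, the route via test_1_2 stays unfeasible until an update
with a newer seqno arrives, which would match the behaviour seen in the log.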

Problem 2:
With the current configuration, the routers (test_{1,2,3}) cannot ping nodes
that are not directly connected. tcpdump shows that the ping packets have a
source address in 192.168.227.*, which is not announced through babel.

I want to use only 172.31.* to simplify management, so I would rather not
announce 192.168.227.* if possible.

I noticed that the kernel routes installed by babel do not contain a
'src <ip>' part. When I manually create routes that include it, all the
routers and nodes can ping each other.
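
As an illustration, manual routes of this kind on test_1 could be generated
roughly as below. The addresses and device names are taken from the topology
above; the helper name and the choice of `ip route replace` are only
illustrative, and the script prints the commands instead of running them.

```python
def route_with_src(prefix: str, nexthop: str, dev: str, src: str) -> str:
    """Build an iproute2 command that sets a preferred source address."""
    return f"ip route replace {prefix} via {nexthop} dev {dev} src {src}"


# Remote subnets as seen from test_1: (prefix, next hop, tunnel device).
ROUTES_ON_TEST_1 = [
    ("172.31.225.0/24", "192.168.227.10", "test_1_2"),
    ("172.31.226.0/24", "192.168.227.17", "test_3_1"),
]

# test_1's own address on its 172.31.224.0/24 bridge.
PREFERRED_SRC = "172.31.224.1"

if __name__ == "__main__":
    for prefix, nexthop, dev in ROUTES_ON_TEST_1:
        print(route_with_src(prefix, nexthop, dev, PREFERRED_SRC))
```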

So is there any option to pass the preferred source address to the kernel route
table? I noticed that the change posted at
https://alioth-lists.debian.net/pipermail/babel-users/2012-September/001086.html
has not been merged into the current code. Is there any alternative? A
config-file-based one is preferred.

I am neither a native English speaker nor a network specialist, so I may not
have fully understood the manual. If I missed anything there, I would
appreciate it if you could explain it further.



Lifan Su


