[Babel-users] How to speedup converge and how to set preferred source
Lifan Su
thssld at gmail.com
Mon Dec 24 19:43:41 GMT 2018
Hello,
I am trying to use tinc + babel to build a mesh network. So far I have built a
simple loop of 3 nodes and tried disconnecting one of the links. I have run into
two problems and am looking for help.
System configuration:
3 virtual machines on the same network (192.168.174.96, .100, .104/24, not shown),
connected by tunnels created with tinc. Each VM contains an lxc host, which uses a
linux bridge and veth to communicate.
Topology:
172.31.224.10/24 (lxc host)                          172.31.225.10/24 (lxc host)
         |                                                    |
    linux-bridge                                         linux-bridge
         |                                                    |
  172.31.224.1/24                                      172.31.225.1/24
test_1 192.168.227.9/30 <----- test_1_2 -----> 192.168.227.10/30 test_2
       192.168.227.18/30        (tunnel)       192.168.227.13/30
         |                                                    |
test_3_1 ---> 192.168.227.17/30   test_3   192.168.227.14/30 <--- test_2_3
(tunnel)                     172.31.226.1/24                 (tunnel)
                                   |
                              linux bridge
                                   |
                       172.31.226.10/24 (lxc host)
Babeld version: 1.8.2
Configuration used for router test_1 (the other nodes are similar):
debug 1
interface test_1_2 type tunnel link-quality true
interface test_3_1 type tunnel link-quality true
redistribute ip 172.31.224.0/24 metric 64
redistribute local deny
Problem 1:
Once the routes have converged, the lxc hosts can ping each other and the
routers (test_{1,2,3}). But when I disconnect a link, e.g. test_3_1, the
connection between test_1 and test_3 is interrupted for ~250 sec.
The log shows that test_1 is stuck in the following state for most of that period:
My id 44:05:c1:70:07:7f:1f:69 seqno 328
Neighbour fe80::6c5a:b6ff:fe73:9458 dev test_1_2 reach ffff ureach
0000 rxcost 96 txcost 96 rtt 0.521 rttcost 0 chan -2.
172.31.224.0/24 metric 64 (exported)
172.31.225.0/24 from 0.0.0.0/0 metric 160 (165) refmetric 64 id
dc:c7:bb:33:bc:db:aa:25 seqno 41970 age 0 via test_1_2 neigh
fe80::6c5a:b6ff:fe73:9458 nexthop 192.168.227.10 (installed)
172.31.226.0/24 from 0.0.0.0/0 metric 256 (261) refmetric 160 id
28:12:e7:16:18:1e:09:f5 seqno 28186 age 0 via test_1_2 neigh
fe80::6c5a:b6ff:fe73:9458 nexthop 192.168.227.10
Once the route for 172.31.226.0/24 becomes feasible, it is quickly installed and
connectivity is restored.
So is there any option to make this process faster? I have tried
lowering hello-interval, but that only shortens it to around 200 secs.
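For readers unfamiliar with the term, here is a minimal Python sketch of the
feasibility condition as the Babel specification (RFC 6126) describes it. It is
not babeld's code. The function names are hypothetical, and the feasibility
distance of (seqno 28186, metric 160) used in the example is an assumption made
for illustration; the log above does not show it.

```python
def seqno_newer(a: int, b: int) -> bool:
    """True if 16-bit sequence number a is newer than b (arithmetic modulo 2**16)."""
    return a != b and ((a - b) & 0xFFFF) < 0x8000


def is_feasible(seqno: int, refmetric: int, fd_seqno: int, fd_metric: int) -> bool:
    """Check an update (seqno, refmetric) against a feasibility distance."""
    if refmetric == 0xFFFF:
        # Retractions (infinite metric) are always feasible.
        return True
    # Feasible if the seqno is newer, or the seqno is equal and the
    # advertised metric is strictly smaller.
    return seqno_newer(seqno, fd_seqno) or (
        seqno == fd_seqno and refmetric < fd_metric
    )


# Route to 172.31.226.0/24 via test_1_2 as logged: seqno 28186, refmetric 160.
# ASSUMPTION: feasibility distance is (28186, 160); not taken from the log.
print(is_feasible(28186, 160, 28186, 160))  # same seqno, metric not smaller
print(is_feasible(28187, 160, 28186, 160))  # newer seqno
```

Under that assumed feasibility distance, the first call returns False and the
second returns True, which would match a route staying uninstalled until an
update with a newer seqno arrives.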
Problem 2:
With the current configuration, I noticed that the routers (test_{1,2,3}) cannot
ping nodes that are not directly connected. tcpdump shows that the ping packets
have a source address of 192.168.227.*, which is not redistributed into babel.
I want to use 172.31.* to simplify management, so I would rather not redistribute
192.168.227.* if possible.
I noticed that the kernel routes installed by babel do not contain a
'src <ip>' part. When I manually create routes with that part, all the
routers and nodes can ping each other.
So is there any option to pass the preferred source address to the kernel routing table?
I noticed that https://alioth-lists.debian.net/pipermail/babel-users/2012-September/001086.html
has not been merged into the current code. Is there any alternative? A config-file-based
one is preferred.
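To make the 'src <ip>' part concrete, here is a minimal Python sketch that
builds an iproute2 command for such a manual route on test_1. The helper name
build_route_cmd is hypothetical, and the addresses are taken from the topology
and log above; the exact commands actually used are not shown in this message,
so treat this only as an illustration of the route shape.

```python
def build_route_cmd(prefix: str, nexthop: str, dev: str, src: str) -> list[str]:
    """Build an 'ip route replace' argv that sets a preferred source address."""
    return ["ip", "route", "replace", prefix,
            "via", nexthop, "dev", dev, "src", src]


# Route from test_1 to the 172.31.225.0/24 network behind test_2,
# preferring test_1's own 172.31.224.1 as the source address.
cmd = build_route_cmd("172.31.225.0/24", "192.168.227.10", "test_1_2", "172.31.224.1")
print(" ".join(cmd))
```

The printed command could be run as root, or the list passed to
subprocess.run(cmd, check=True). Whether babeld later replaces such a route is
not covered by this sketch.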
I am neither a native English speaker nor a network specialist, so I may not have
understood the manual well. If I missed anything in the manual, I would appreciate
it if you could explain it further.
Lifan Su