[Babel-users] Socket to control babeld, including a command to prioritize neighbours
Julien Muchembled
jm at jmuchemb.eu
Thu Oct 9 17:20:50 UTC 2014
Hello,
re6stnet creates a resilient IPv6 network, using OpenVPN to link nodes that are not in the same LAN, and babeld for routing:
http://git.erp5.org/gitweb/re6stnet.git
re6stnet limits the number of OpenVPN tunnels, so we don't have a full mesh, and from time to time it deletes one to create a new one to a random node. However, since the beginning (mid-2012), we have had an issue with the destruction of tunnels because we couldn't make sure there was no route through the tunnel being destroyed.
I didn't find any reliable way to fix this without modifying babeld, so here are 2 patches for review:
http://git.erp5.org/gitweb/babeld.git/shortlog/refs/heads/ctl
commits d37e373 & bd1cf65
The first one (d37e373) implements a new way to communicate with babeld.
- It is done via a unix socket for security, but uses network byte ordering so it can easily be transformed into TCP with socat.
- It is fully asynchronous, i.e. the processing of a packet does not do any IO. No hang or data loss if either side is slow.
- Binary protocol for reliability and simplicity. I didn't want to format and parse strings with regex.
The protocol uses simple TLV packets, like those exchanged between babeld nodes. L is 4 bytes, though, so that a packet can contain large amounts of data, which happens easily when dumping the routing table.
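To illustrate the framing described above, here is a minimal sketch in Python. The exact header layout is an assumption (a 1-byte type followed by a 4-byte length in network byte order); the real layout is whatever commit d37e373 defines. The names pack_tlv and unpack_tlvs are hypothetical. The parser only consumes complete packets and hands back the remainder, which is what a non-blocking reader needs.

```python
import struct

# ASSUMPTION: 1-byte type + 4-byte big-endian length, then the payload.
# The actual header layout is defined by the patch, not by this sketch.
HEADER = struct.Struct("!BI")


def pack_tlv(ptype, payload=b""):
    """Build one packet: header followed by the raw payload."""
    return HEADER.pack(ptype, len(payload)) + payload


def unpack_tlvs(buf):
    """Extract every complete packet from buf.

    Returns (packets, rest) where packets is a list of (type, payload)
    and rest holds the bytes of a trailing, still incomplete packet.
    """
    packets = []
    offset = 0
    while len(buf) - offset >= HEADER.size:
        ptype, length = HEADER.unpack_from(buf, offset)
        end = offset + HEADER.size + length
        if end > len(buf):
            break  # incomplete packet: keep it for the next read
        packets.append((ptype, bytes(buf[offset + HEADER.size:end])))
        offset = end
    return packets, bytes(buf[offset:])
```

A caller would append whatever it reads from the socket to the returned remainder and call unpack_tlvs again, so a large dump arriving in several chunks never blocks the reader.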
This first commit implements 1 command, to get the same information as dump_tables()
The request packet contains a few parameters to specify which information is wanted.
In fact, I even think this new interface should replace local.c
The second commit (bd1cf65) implements a second packet to alter the result of neighbour_cost: its returned value is multiplied by a new 'cost_multiplier' field in struct neighbour. cost_multiplier=0 forces neighbour_cost to return INFINITY
For the moment, we only use it to avoid routing via a specified neighbour (the other side of the tunnel to destroy). Algorithm is:
1. Node C (as client) decides to destroy an OpenVPN client tunnel that is connected to node S
2. C sends a packet to its babeld to increase cost to S
3. C requests dumps until no route goes via S
4. C sends a second packet to its babeld to set cost_multiplier=0 for S, to make sure no route comes back. Note that the processing of such a packet is atomic: the packet is ignored if there are still installed routes via S.
5. If there's still no route via S and if cost_multiplier could be set to 0, C tells S to do the same thing on its side.
6. Same as steps 2,3,4 for S
7. If there's still no route via C and if cost_multiplier could be set to 0, S replies to C that the tunnel can be deleted.
8. C deletes the tunnel
At any point the whole process is aborted if a step fails or takes too long. There's no point insisting: we can try deleting another tunnel.
Whether the tunnel could be destroyed or not, the process is ended as follows:
1. Wait some time.
2. Restore original cost_multiplier.
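The procedure above can be sketched as follows in Python. This is not the re6stnet code: the ctl object (with set_cost_multiplier and routes_via methods), the ask_peer and delete_tunnel callables, the raised multiplier value of 4096 and the timeouts are all hypothetical names and values chosen for illustration. set_cost_multiplier is assumed to return a false value when babeld ignores the packet.

```python
import time


def drain_neighbour(ctl, neigh, raised=4096, timeout=60.0, poll=1.0,
                    sleep=time.sleep):
    """Steps 2-4 on one side; True if cost_multiplier could be set to 0."""
    deadline = time.monotonic() + timeout
    ctl.set_cost_multiplier(neigh, raised)     # step 2: increase the cost
    while ctl.routes_via(neigh):               # step 3: poll the dumps
        if time.monotonic() >= deadline:
            return False                       # took too long: abort
        sleep(poll)
    # Step 4: babeld ignores the packet if routes via neigh came back.
    return bool(ctl.set_cost_multiplier(neigh, 0))


def destroy_tunnel(ctl, neigh, ask_peer, delete_tunnel, original=256,
                   timeout=60.0, grace=10.0, sleep=time.sleep):
    """Run the whole handshake from C's side; True if the tunnel is gone."""
    try:
        if not drain_neighbour(ctl, neigh, timeout=timeout, sleep=sleep):
            return False
        if not ask_peer():                     # steps 5-7, done by S
            return False
        delete_tunnel()                        # step 8
        return True
    finally:
        # Ending sequence, run whether or not the tunnel was destroyed.
        sleep(grace)
        ctl.set_cost_multiplier(neigh, original)
```

The try/finally mirrors the rule that the original cost_multiplier is restored after a delay on success and on abort alike.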
In the future, we may use this new packet to define different classes of nodes. RTT-based metric is great but not always enough. A node with low latency may be unreliable (for example crashing all the time). So we consider having a set of core trustworthy nodes.
In any case, neighbour costs could then vary a lot and I was a little annoyed by their small precision (2 bytes). Because most of the time values start at 96 or 256, the result of neighbour_cost is also divided by 256 (and the default value for cost_multiplier is 256). In other words, the cost_multiplier field codes a value between 1/256 and ~256 (+ inf).
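The fixed-point arithmetic can be sketched like this in Python. It is only an illustration of the description above, not the code of commit bd1cf65: the function name scaled_cost is hypothetical, and clamping the result to INFINITY (0xFFFF, the 16-bit infinite metric) is an assumption about how overflow would be handled.

```python
INFINITY = 0xFFFF         # infinite metric for a 2-byte cost
DEFAULT_MULTIPLIER = 256  # fixed-point representation of 1.0


def scaled_cost(cost, cost_multiplier=DEFAULT_MULTIPLIER):
    """Apply cost_multiplier (in 1/256 units) to a neighbour cost."""
    if cost_multiplier == 0 or cost >= INFINITY:
        return INFINITY   # multiplier 0 means "never route via it"
    # ASSUMPTION: results that no longer fit in 16 bits become INFINITY.
    return min(cost * cost_multiplier // 256, INFINITY)
```

With the default multiplier a cost of 96 stays 96, a multiplier of 512 doubles it to 192, and a multiplier of 128 halves a cost of 256 to 128.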
re6stnet includes a demo using network namespaces. It simulates 9 nodes in a somewhat accelerated mode. Hello interval is 4 seconds. Each node creates at most 2 client tunnels and tries to delete a tunnel every 100 seconds.
We finally have code that does not lose a single packet after several days.
However, there's still a limitation. We are not always able to identify a neighbour. This happens when the direct route to a neighbour is not the best route (yes, we found such cases in China, or between China and Japan), and without keep-unfeasible, there's not always a route to the neighbour with refmetric=0 from which we can take the neighbour's address/ifindex.
In re6st, a node is only identified by the prefix it exports. The only place we use link-local IPv6 is for the SET_COST_MULTIPLIER packet and we get the information from babeld dumps. It would be quite heavy to get the address/ifindex by other means.
On the other hand, it looks trivial and efficient to solve this in babeld, by adding an 'id' field in struct neighbour and using the id instead of address/ifindex in the SET_COST_MULTIPLIER packet.
Regards,
Julien