[Babel-users] Sroamd: Protocol Documentation?

Thu Jul 22 00:55:47 BST 2021

> Okay, I found this design description that is very helpful. :)
> https://www.mail-archive.com/babel-users@alioth-lists.debian.net/msg00662.html

Some more details.  Sroamd consists of five pieces:

1. a distributed (replicated) database (flood.c);
2. a table that maps IP addresses to MAC addresses;
3. a table that maps MAC addresses to associated routers;
4. a way to get packets from the network to the right router;
5. a way to get packets from the mobile node to the right router.

1. A distributed database

This is implemented in flood.c.  The database is not reliable (that would
be impossible to do in a distributed manner without a global clock), but
it reliably detects conflicts: if the network gets partitioned, then after
the network is merged again, nodes will realise that there is a conflict,
which the client code will then need to resolve.
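
As a rough illustration of "unreliable, but detects conflicts", here is
a small Python sketch.  It is not the code in flood.c: the Datum and Store
names and the per-key seqno field are assumptions made for the example.
Two partitions write different values for the same key; when the data meet
again, the store records a conflict and leaves the resolution to the
client code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    key: str
    value: str
    seqno: int  # hypothetical per-key version counter, not from flood.c


class Store:
    """One node's copy of the flooded database (illustrative only)."""

    def __init__(self):
        self.data = {}       # key -> Datum
        self.conflicts = []  # (ours, theirs) pairs for the client to resolve

    def receive(self, datum):
        """Merge a flooded datum; return True if it is news worth re-flooding."""
        old = self.data.get(datum.key)
        if old is None or datum.seqno > old.seqno:
            self.data[datum.key] = datum
            return True
        if datum.seqno == old.seqno and datum.value != old.value:
            # Same version, different contents: the two sides diverged.
            self.conflicts.append((old, datum))
        return False


# Two partitions each record a different router for the same MAC.
a, b = Store(), Store()
a.receive(Datum("02:00:00:00:00:01", "router-A", 1))
b.receive(Datum("02:00:00:00:00:01", "router-B", 1))

# The partitions merge: a hears b's datum and notices the conflict.
a.receive(b.data["02:00:00:00:00:01"])
```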

Note that using a distributed database is a deliberate choice: the code
could easily be replaced with an ordinary central database node, for
example a RADIUS or DIAMETER server.

2. A table IP -> MAC

This is implemented in lease.c, dhcpv4.c and ra.c.  Sroamd listens to
DHCPv4 and RD requests, and maintains a map

  IPv4 address    -> MAC
  IPv6 /64 prefix -> MAC

For IPv4, we just implement a basic DHCPv4 server (dhcpv4.c).  For IPv6,
we allocate a different /64 to each mobile node by sending an RA over
unicast; it's wasteful, but it's the only way to get IPv6 to work with
Android.  (We could in principle implement stateful DHCPv6, but since it
doesn't work with Android, it seems somewhat pointless to me; let me know
if you disagree.)

There can be only one MAC for a given IPv4 address, but there can be
multiple IPv4 addresses for a single MAC, for example if the network gets
partitioned and the node gets different IPv4 addresses from different
routers.
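
The constraint can be pictured as a map keyed by address.  The following
Python sketch is only an illustration; LeaseTable and its methods are
hypothetical names, not the data structures in lease.c.

```python
class LeaseTable:
    """IPv4 address -> MAC; each address has at most one owner."""

    def __init__(self):
        self.by_ip = {}

    def assign(self, ip, mac):
        """Record ip -> mac; refuse if another MAC already holds ip."""
        owner = self.by_ip.get(ip)
        if owner is not None and owner != mac:
            return False
        self.by_ip[ip] = mac
        return True

    def addresses_of(self, mac):
        """A single MAC may hold several addresses (e.g. after a partition)."""
        return sorted(ip for ip, m in self.by_ip.items() if m == mac)


table = LeaseTable()
table.assign("192.0.2.10", "02:00:00:00:00:01")
table.assign("192.0.2.20", "02:00:00:00:00:01")  # same MAC, second address
refused = not table.assign("192.0.2.10", "02:00:00:00:00:02")
```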

We try to avoid conflicts by flooding a 10s lease before we assign
a permanent one, so that conflicts will typically last just 10s.  A conflict
is still possible in theory if the network gets partitioned; in that case
we retract the IPv6 address.  We don't do anything clever with IPv4 yet: in
this unlikely case, the user will need to restart their DHCPv4 client (by
disabling and re-enabling WiFi, in the case of Android).
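
The "short lease first" idea might be sketched like this in Python.  Only
the 10s figure comes from the description above; the PendingLease class,
the one-hour permanent lifetime and the way a competing claim is reported
are assumptions made for the example.

```python
TENTATIVE_SECONDS = 10    # from the description above
PERMANENT_SECONDS = 3600  # hypothetical value, chosen for the example


class PendingLease:
    """A lease that is flooded with a short lifetime before being confirmed."""

    def __init__(self, ip, mac, now):
        self.ip, self.mac = ip, mac
        self.expires = now + TENTATIVE_SECONDS
        self.permanent = False
        self.conflict = False

    def saw_claim(self, mac):
        """Called when a flooded lease for the same address arrives."""
        if mac != self.mac:
            self.conflict = True

    def confirm(self, now):
        """Promote to a permanent lease unless a competing claim was seen."""
        if self.conflict:
            return False  # the tentative lease simply lapses after 10s
        self.permanent = True
        self.expires = now + PERMANENT_SECONDS
        return True


quiet = PendingLease("192.0.2.10", "02:00:00:00:00:01", now=0)
contested = PendingLease("192.0.2.11", "02:00:00:00:00:01", now=0)
contested.saw_claim("02:00:00:00:00:02")
```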

3. A table MAC -> associated router

This is implemented in netlink.c and client.c.

We listen to netlink messages, and whenever we notice that a node has
associated to us, we flood a new entry in the distributed database.
Again, conflicts are unavoidable (if a node moves A->B->C in quick
succession, the flooding algorithm might flood C before B, so some nodes
think the node is at B and some at C).  When we detect a conflict, we ping
the node from both routers -- unless the node is associated to both, only
one of the pings should yield a positive result.
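
The resolution step could look roughly like the Python sketch below.  The
resolve function and the ping callback are hypothetical; how the probe is
actually sent, and how the result is flooded back, is not shown.

```python
def resolve(mac, claimants, ping):
    """Return the routers from which `mac` actually answers a ping.

    `ping(router, mac)` is a stand-in for "router probes the node on its
    local link"; it returns True if the node replied.
    """
    return [router for router in claimants if ping(router, mac)]


# Example: the node moved A -> B -> C, and B and C both claim it.
actually_at = {"02:00:00:00:00:01": "C"}


def fake_ping(router, mac):
    return actually_at.get(mac) == router


winners = resolve("02:00:00:00:00:01", ["B", "C"], fake_ping)
```

With a single winner, the other claims can be retracted; with two winners,
the node really is associated to both routers.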

4. A way to get packets from the network to the right router

This is implemented in client.c.

This one is easy: the router that believes that the node is associated to
it announces a Babel route to the mobile node.  There might be multiple
routes temporarily after a node has moved, but all but one will disappear
as soon as the conflict detection has converged (see point 3 above).

5. A way to get packets from the mobile node to the right router

This is the tricky bit.  The solution is to assign the same IP address to
all edge routers.  When a node associates to a router, the router sends
a gratuitous ARP/ND to the mobile node, which causes the neighbour cache
entry for that address to be reset to the MAC address of the new router.
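
For the IPv4 case, the frame in question might be built as in the Python
sketch below.  This is an illustration of the standard ARP layout, not
a copy of what sroamd sends: the gratuitous_arp helper is hypothetical,
and the choice of an ARP request (rather than a reply) with a zero target
hardware address is an assumption.  The IPv6 equivalent, an unsolicited
neighbour advertisement, is not shown.

```python
import socket
import struct


def mac_bytes(mac):
    """Convert "aa:bb:cc:dd:ee:ff" to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def gratuitous_arp(router_mac, shared_ip, mobile_mac):
    """Build an Ethernet frame announcing shared_ip at router_mac."""
    sha = mac_bytes(router_mac)        # sender hardware address
    spa = socket.inet_aton(shared_ip)  # sender protocol address
    # Ethernet header: unicast to the mobile node, EtherType 0x0806 (ARP).
    eth = mac_bytes(mobile_mac) + sha + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), lengths 6 and 4, request (1).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Gratuitous: the target protocol address equals the sender's own.
    arp += sha + spa + b"\x00" * 6 + spa
    return eth + arp


frame = gratuitous_arp("02:00:00:00:00:aa", "192.0.2.1", "02:00:00:00:00:01")
```

Sending the frame needs a raw socket on the wireless interface, which is
left out here.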

-- Juliusz