[Freedombox-discuss] Freedombox Mesh Network Simulator

Wed Jun 20 18:13:55 UTC 2012

> My wife and I are walking in mall with at least one other person every 
> 40 feet or so.  We decide to separate and go shopping.  I walk this way, 
> she walks that way.  Before long we're out of direct WiFi communication 
> range to each other.  Now, if enough people there in the mall had 
> WiFi-enabled devices running some mesh software, I'd like to be able to 
> stay in contact with my wife as she moves about in her random way, and 
> as they all move about in their random ways.

The One Laptop per Child project also wanted to satisfy this goal.
They wanted kids all over a village to be able to reach each other and
to reach the Internet via a gateway at their school.  They had the
advantage of designing and building both the hardware and software.
But they failed, partly due to system integration issues.

They were using a buggy implementation of 802.11 meshing that preceded
802.11s.  But they also used higher level user interface software,
which used multicast packets to find and communicate with other nearby
laptops.  The mesh software worked poorly with multicast.  Not only
did it send multicasts at the slowest rate (1 megabit/second), which
took up a lot of airtime, but the nodes would also repeat the
multicasts to make sure that every node had heard them.  This limited
how large the network could grow.  They did not discover this
unfortunate interaction until very late in the
hardware/software/firmware integration process (when the
multicast-based application sharing software started working).
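
To make the airtime cost concrete, here is a rough Python sketch of
the arithmetic.  The 200-byte packet, the 54 megabit/second comparison
rate, and the count of 10 repeating nodes are assumptions chosen for
illustration, not OLPC measurements, and the model ignores preambles,
headers, and contention.

```python
def airtime_ms(payload_bytes, rate_mbps):
    """Milliseconds on air for the payload alone (no preamble, headers, or backoff)."""
    bits = payload_bytes * 8
    return bits / (rate_mbps * 1_000_000) * 1000


def repeated_airtime_ms(payload_bytes, rate_mbps, repeating_nodes):
    """Total airtime when the sender transmits once and each of
    `repeating_nodes` other nodes retransmits the same packet once."""
    transmissions = 1 + repeating_nodes
    return transmissions * airtime_ms(payload_bytes, rate_mbps)


if __name__ == "__main__":
    size = 200  # bytes; assumed multicast payload size
    slow = airtime_ms(size, 1)   # the slowest rate mentioned in the text
    fast = airtime_ms(size, 54)  # assumed faster rate, for comparison only
    print(f"1 Mbit/s: {slow:.3f} ms   54 Mbit/s: {fast:.3f} ms")
    print(f"repeated by 10 nodes at 1 Mbit/s: "
          f"{repeated_airtime_ms(size, 1, 10):.1f} ms")
```

Under these assumptions the same packet occupies the channel 54 times
longer at the slowest rate, and every repeat multiplies that again.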

Another major problem was that a mesh network is very hard to
reproduce.  If it does something unexpected or suboptimal, the
developers can't just teleport themselves to the part of the world
where that particular physical configuration of radio nodes, physical
antennas, software versions, and firmware versions exists.  In many
cases they can't even reach into the nodes of that network over the
Internet while the problem is happening, to debug it.  Many, many OLPC
mesh problems occurred in the field which could not be replicated in
the lab, which made them 10x or 100x harder to fix.  This meant that
the buggy mesh firmware and software improved more slowly than the
rest of their software.

The result was that despite a lot of work addressing bugs and
performance in the mesh firmware, they never got their automatic mesh
network working with more than a handful of XO laptops.  If you put 30
laptops in a classroom, they would burn up 100% of the radio bandwidth
(and chew up their batteries) merely with overhead packets ("Hi, I'm
here."  "Hi you, I'm me; have you heard about Joe and Alice over
there?" "In case you want to send a message to Joe, send it via me to
Alice; I can hear Alice just fine.").  There was no bandwidth left for
the users to actually communicate; connections would time out, nodes
would appear and disappear from the mesh, etc.  So OLPC stopped using
the mesh and recommended that each classroom install one or more
802.11 access points, which has worked OK.  They also switched to
supporting ad-hoc 802.11 without meshing, for automatically networking
"a few students sitting around under a tree", which also works OK.
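
The way this kind of chatter grows can be sketched with a toy model in
Python.  It assumes every node periodically broadcasts an announcement
listing every other node it knows about, all on one shared channel.
The packet sizes, the once-per-second interval, and the
1 megabit/second channel are made-up illustrative values; this is not
the actual OLPC mesh protocol or its measured traffic, and it leaves
out repeats and retransmissions, which would make things worse.

```python
def announcement_bytes(nodes, base_bytes=50, per_entry_bytes=20):
    """Size of one announcement: a fixed header plus one entry per other node.
    Both byte counts are hypothetical."""
    return base_bytes + per_entry_bytes * (nodes - 1)


def overhead_fraction(nodes, announcements_per_sec=1.0, channel_bps=1_000_000):
    """Fraction of a shared channel used if every node sends announcements
    at the given rate.  Values of 1.0 or more mean the channel is saturated."""
    bits_per_sec = nodes * announcements_per_sec * announcement_bytes(nodes) * 8
    return bits_per_sec / channel_bps


def nodes_at_saturation(max_nodes=10_000):
    """Smallest node count at which overhead alone fills the channel,
    or None if that does not happen up to max_nodes."""
    for n in range(1, max_nodes + 1):
        if overhead_fraction(n) >= 1.0:
            return n
    return None


if __name__ == "__main__":
    for n in (5, 30, 60):
        print(f"{n:3d} nodes: {overhead_fraction(n):.1%} of the channel")
    print("overhead alone fills the channel at", nodes_at_saturation(), "nodes")
```

The point is the shape, not the numbers: doubling the node count
roughly quadruples the overhead in this model, so a fixed amount of
radio bandwidth fills up quickly.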

There ARE some mesh networks that I hear are working on a larger
scale, such as those running B.A.T.M.A.N.  I suspect that the
large-scale meshes are mostly static networks that are tuned by
humans to work well (just
as the broader Internet's routing system is tuned by humans to work;
it's not automatic).  I do not know if other meshes support multicast
(or other portable ways for high level software to find what nodes are
on the network), nor whether they work in a network of mobile nodes
with limited battery life.  All I can report on is the one project I
was involved in (OLPC), in which their mesh implementation failed to
accomplish its goals, and was dropped from the next generation
hardware and software.

This is part of why I recommend using wired connections wherever
possible.  For FreedomBox to succeed, it needs to succeed at scale.  A
FreedomBox network that can't route packets for more than 500 nodes
worldwide wouldn't be worth building.  (Clue: this is why the Internet
exists today: it scaled up and kept working, while the proprietary
networks that preceded it didn't scale up worldwide.)  In a
substantial network, your mesh and dynamic routing protocol could
require a few megabits per second of traffic at all times on each
node, just keeping track of everything.  Over 100-megabit Ethernet
that's just 2 or 3% of the bandwidth.  But over 802.11, it burns up
most of the
available bandwidth.  Every connection you move off wireless onto a
wire makes more radio bandwidth available for the folks who truly
can't run a wire.
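
The percentages above are simple division, sketched here in Python.
The 2.5 megabit/second overhead figure stands in for "a few megabits",
and the 5 megabit/second usable wireless throughput is an assumed
illustrative value; real 802.11 throughput varies widely with the
standard in use, signal quality, and how many nodes share the channel.

```python
def overhead_share(overhead_mbps, usable_link_mbps):
    """Fraction of a link's usable capacity consumed by routing overhead."""
    return overhead_mbps / usable_link_mbps


def left_for_users_mbps(overhead_mbps, usable_link_mbps):
    """Capacity remaining for real traffic; never negative."""
    return max(0.0, usable_link_mbps - overhead_mbps)


OVERHEAD_MBPS = 2.5      # assumed stand-in for "a few megabits" per second
ETHERNET_MBPS = 100.0    # 100-megabit Ethernet, as in the text
WIFI_USABLE_MBPS = 5.0   # hypothetical usable 802.11 throughput

if __name__ == "__main__":
    for name, capacity in (("Ethernet", ETHERNET_MBPS),
                           ("802.11", WIFI_USABLE_MBPS)):
        share = overhead_share(OVERHEAD_MBPS, capacity)
        left = left_for_users_mbps(OVERHEAD_MBPS, capacity)
        print(f"{name}: overhead uses {share:.1%}, {left:.1f} Mbit/s left")
```

Note too that a radio channel is shared by every node in range, so the
wireless figure is a best case for a single node.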

I have some ideas about how FreedomBox nodes could provide
censorship-resistant networking by running an overlay IPv6 network
that uses geographically based addressing to simplify the routing
problem.  I'll
dig those up and repost them in the next few days.  It isn't an
automatic mesh, though; it has explicit connections made by humans
among friendly nodes.  The part that's automatic is how the packets
flow after the humans make those connections.

	John

PS:  If you think a mesh protocol shouldn't use even a megabit/second
of continuous overhead, please design and build one that doesn't,
and that scales up and keeps working.  It's harder than it looks.