[Babel-users] Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios

Sat Dec 20 07:56:18 GMT 2025

Hi,

If your network consists of mostly static nodes (fixed routers on
roofs), you can tune your settings so that routing is updated less
frequently.

Adding or removing a node can take a while (30 minutes); it does not need
to be instant.
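
To make the trade-off concrete, here is a rough sketch of how the hello
interval affects the control traffic one node receives. The function name
and the example numbers are made up for illustration and are not babeld
settings; the model is just neighbours / interval and ignores updates,
IHUs and jitter.

```python
def hello_rate(neighbours: int, hello_interval_s: float) -> float:
    """Approximate hello packets received per second by one node.

    Assumes every neighbour sends exactly one hello per interval;
    real control traffic also includes updates and IHUs, ignored here.
    """
    if hello_interval_s <= 0:
        raise ValueError("hello interval must be positive")
    return neighbours / hello_interval_s


if __name__ == "__main__":
    # Hypothetical example: a hub with 240 neighbours at three intervals.
    for interval in (4, 20, 60):
        rate = hello_rate(240, interval)
        print(f"{interval:>3} s interval: {rate:.1f} hellos/s")
```

Longer intervals cut the load proportionally, at the cost of slower
detection of new or lost neighbours.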

Best,

--
Benjamin Henrion (zoobab)
Email: zoobab at gmail.com
Mobile: +32-484-566109
Web: http://www.zoobab.com
FFII.org Brussels
"In July 2005, after several failed attempts to legalise software patents
in Europe, the patent establishment changed its strategy. Instead of
explicitly seeking to sanction the patentability of software, they are now
seeking to create a central European patent court, which would establish
and enforce patentability rules in their favor, without any possibility of
correction by competing courts or democratically elected legislators."

On Fri, 19 Dec 2025, 18:18, Valent at MeshPoint <valent at meshpointone.com>
wrote:

> Hi everyone,
> I'm working on a fair, reproducible benchmark methodology for comparing
> mesh routing protocols (Babel, BATMAN-adv, Yggdrasil, and others). Before
> running the full benchmark, I'd like to get feedback from the Babel
> community on the methodology.
> BACKGROUND
> ----------
> We're using meshnet-lab (https://github.com/mwarning/meshnet-lab) for
> testing, which creates virtual mesh networks using Linux network
> namespaces on a single host. This approach has limitations that we've
> documented, and I'd appreciate input on whether our methodology properly
> accounts for them.
> TEST ENVIRONMENT
> ----------------
>    Hardware: ThinkPad T14 laptop (12 cores, 16GB RAM)
>    Software: meshnet-lab with network namespaces
>    Protocols: babeld 1.13.x, batctl/batman-adv, yggdrasil 0.5.x
> INFRASTRUCTURE LIMITATIONS DISCOVERED
> -------------------------------------
> During development, we found significant limitations when testing larger
> networks:
> 1. Supernode/Hub Bottleneck
> When testing real Freifunk topologies (e.g., Bielefeld with 246 nodes),
> we discovered that star topologies cause test infrastructure failures,
> not protocol failures.
> The issue: If a topology has a supernode (hub) connected to 200+ other
> nodes, the meshnet-lab bridge for that hub receives ~60 hello
> packets/second from all neighbors. This causes:
>    - UDP packet loss at the bridge level
>    - Apparent "connectivity failures" that are actually infrastructure
>      artifacts
>    - False negatives that make protocols look broken when they're not
> Our solution: Cap maximum node degree at 20 and avoid pure star
> topologies.
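
A minimal sketch of one way to apply such a degree cap to an edge list.
This is an assumption about how it could be done, not the method used in
the benchmark; `cap_degree` is a hypothetical helper, and the greedy rule
makes no attempt to keep the graph connected.

```python
from collections import defaultdict


def cap_degree(edges, max_degree=20):
    """Return a subset of edges in which no node exceeds max_degree.

    Greedy: edges are visited in input order and kept only while both
    endpoints still have spare capacity. Connectivity is NOT guaranteed.
    Assumes the input contains no duplicate edges.
    """
    degree = defaultdict(int)
    kept = []
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        if degree[a] < max_degree and degree[b] < max_degree:
            kept.append((a, b))
            degree[a] += 1
            degree[b] += 1
    return kept


if __name__ == "__main__":
    # A pure star: hub 0 with 200 leaves.
    star = [(0, leaf) for leaf in range(1, 201)]
    print(len(cap_degree(star)), "edges kept out of", len(star))
```

Capping a pure star this way leaves most leaves with no link at all, so
the dropped nodes would have to be removed or reattached elsewhere.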
> 2. Scale Limitations
> We've validated that 100 nodes is a safe limit where:
>    - CPU stays under 80%
>    - Memory is not a bottleneck
>    - Results are reproducible (variance < 10%)
> For networks larger than ~250 nodes, single-host simulation becomes
> unreliable regardless of available RAM. The bottleneck is CPU context
> switching between namespaces and multicast flooding overhead.
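
The "variance < 10%" criterion above could be checked as sketched below.
The sketch assumes it means the coefficient of variation (sample standard
deviation divided by the mean) across repeated runs; the message does not
say which statistic is used, and `is_reproducible` is a hypothetical name.

```python
import statistics


def is_reproducible(samples, max_cv=0.10):
    """True if the coefficient of variation (stdev / mean) is below max_cv."""
    if len(samples) < 2:
        raise ValueError("need at least two runs")
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation undefined")
    cv = statistics.stdev(samples) / abs(mean)
    return cv < max_cv


if __name__ == "__main__":
    # Hypothetical convergence times (seconds) from three repeated runs.
    print(is_reproducible([14.0, 13.5, 14.5]))
```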
> 3. 1000+ Node Networks
> We cannot reliably test 1000+ node networks with this methodology.
> Any attempt would produce infrastructure artifacts, not protocol
> measurements. For such scales, distributed testing across multiple
> physical hosts would be needed.
> PROPOSED TEST SUITE
> -------------------
> We've documented a methodology with:
> 6 Topologies:
>    T1: Grid 10x10 (100 nodes, max degree 4)
>    T2: Random mesh (100 nodes, max degree ~10)
>    T3: Clustered/federated (100 nodes, 4 clusters)
>    T4: Linear chain (50 nodes, diameter 49)
>    T5: Small-world Watts-Strogatz (100 nodes)
>    T6: Sampled real Freifunk (80 nodes, degree capped)
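
As a sanity check on the stated properties of T1 and T4, the sketch below
builds a grid and a chain as plain edge lists and measures maximum degree
and diameter. All function names are hypothetical, and the edge-list
representation is an assumption, not meshnet-lab's topology format.

```python
from collections import deque


def grid_edges(rows, cols):
    """Edges of a rows x cols grid; nodes are numbered row by row."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                edges.append((node, node + 1))     # right neighbour
            if r + 1 < rows:
                edges.append((node, node + cols))  # neighbour below
    return edges


def chain_edges(n):
    """Edges of a linear chain of n nodes."""
    return [(i, i + 1) for i in range(n - 1)]


def max_degree(edges):
    """Largest number of links attached to any single node."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return max(degree.values())


def diameter(edges):
    """Longest shortest path in hops, by BFS from every node.

    Only meaningful for connected graphs.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    best = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        best = max(best, max(dist.values()))
    return best


if __name__ == "__main__":
    print("grid max degree:", max_degree(grid_edges(10, 10)))
    print("chain diameter:", diameter(chain_edges(50)))
```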
> 5 Validation Tests (before benchmarks):
>    V1: 3-node sanity check
>    V2: Scaling ladder (find breaking point)
>    V3: Consistency check (reproducibility)
>    V4: Resource monitoring
>    V5: Bridge port audit
> 8 Benchmark Scenarios:
>    S1: Steady-state convergence
>    S2: Node failure recovery
>    S3: Lossy link handling (tc netem)
>    S4: Mobility/roaming simulation
>    S5: Network partition and merge
>    S6: High churn (10% nodes cycling)
>    S7: Traffic under load (iperf3)
>    S8: Administrative complexity (subjective)
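
For S6, one way to keep churn runs comparable across protocols is to fix
the set of cycled nodes with a seeded random generator. This is only a
sketch: `churn_schedule`, the number of rounds and the seed are
assumptions, not part of the methodology described here.

```python
import random


def churn_schedule(node_ids, fraction=0.10, rounds=5, seed=0):
    """Return, for each round, the sorted list of nodes to cycle.

    The same seed always gives the same schedule, so every protocol
    under test sees identical churn.
    """
    nodes = list(node_ids)
    rng = random.Random(seed)
    count = max(1, round(len(nodes) * fraction))
    return [sorted(rng.sample(nodes, count)) for _ in range(rounds)]


if __name__ == "__main__":
    for i, batch in enumerate(churn_schedule(range(100)), start=1):
        print(f"round {i}: cycle nodes {batch}")
```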
> QUESTIONS FOR THE COMMUNITY
> ---------------------------
> 1. Missing tests?
>     Are there scenarios important for Babel that we should add?
> 2. Unrealistic tests?
>     Should we skip any tests that don't make sense for real-world
>     evaluation?
> 3. Babel-specific considerations?
>     Any configuration parameters or behaviors we should specifically
>     measure?
> 4. Large-scale alternatives?
>     Does anyone have experience with distributed mesh testing across
>     multiple hosts? How do you handle the coordination and measurement?
> 5. Known limitations?
>     Are there known Babel behaviors at scale that we should document
>     upfront?
> INITIAL RESULTS
> ---------------
> Our initial tests with babeld show:
>    Grid 100 nodes:       100% connectivity, ~14s convergence
>    Chain 50 nodes:       100% connectivity, ~5s convergence
>    Small-world 100 nodes: 100% connectivity, ~12s convergence
> These results validate that the test infrastructure works correctly
> for Babel at this scale.
> FULL METHODOLOGY DOCUMENT
> -------------------------
> The complete methodology document is attached.
> I'd appreciate any feedback, suggestions, or concerns before we proceed
> with the full benchmark.
> Thanks,
> Valent.
>
>
> ------ Original Message ------
> From "Juliusz Chroboczek" <jch at irif.fr>
> To "Linus Lüssing" <linus.luessing at c0d3.blue>
> Cc "Valent Turkovic" <valent at meshpointone.com>;
> babel-users at alioth-lists.debian.net
> Date 19.12.2025. 12:45:16
> Subject Re: [Babel-users] Restarting MeshPoint – seeking advice on
> routing for crisis/disaster scenarios
>
> >>  There's also l3roamd, predating sroamd:
> >>
> >>  https://github.com/freifunk-gluon/l3roamd
> >
> >That's right, I should have mentioned it.  I'll be sure to give proper
> >credit if I ever come back to sroamd.
> >
> >For the record, sroamd is based on a combination of the ideas in l3roamd
> >and in the PMIPv6 protocol, plus a fair dose of IS-IS.
> >
> >-- Juliusz