[Babel-users] About the new hysteresis algorithm

Juliusz Chroboczek jch at pps.univ-paris-diderot.fr
Fri May 3 18:20:46 UTC 2013


Here's an explanation of the hysteresis algorithm used in 1.4.

Every route comes equipped with a metric m.  In Babel, this is
a 16-bit unsigned integer, with the value 0xFFFF reserved to mean
infinity (unreachable route, or explicit route retraction, depending
on your point of view).

In 1.4, we maintain two values for every route: the metric m, which is
just the same as in regular Babel, and the smoothed metric sm, which
tracks m with an exponential decay.  So,

  - if the route is stable, sm is very close to m;
  - if the route has recently suffered instability, sm and m are
    different.

There are now two orderings on routes.  The strong ordering, defined
by

  m <= m' AND sm <= sm'

which is a partial (pre-)order, and the weak ordering, defined by

  sm <= sm'

which is a linearisation of the strong ordering.  (Any reasonably
linearisation of the strong ordering would do.)  Intuitively, the weak
ordering says that a route is better than another, while the strong
ordering says that a route is definitely better than another, and that
we should switch right now.

When we lose a route, we choose the best route according to the weak
ordering -- we don't have any route, we're going to switch anyway, so
we might as well chose the best one.  When we already have a workable
route, on the other hand, we don't want to switch unless the new route
is definitely better than the old one -- so we only switch to a route
that's strongly better than the old one.  The fact that the strong
ordering is partial is what leads to hysteresis.

My tests indicate that this algorithm works beautifully when there are
enough routes to chose from -- delaying switching doesn't matter much
then, and the hysteresis is pretty good at swallowing the oscillations
due to the negative feedback loop in ETX.  On the other hand, when
there aren't enough routes, we often lose the current route before ETX
has reconverged -- the hysteresis adds a few seconds to the
reconvergence time.

-- Juliusz



More information about the Babel-users mailing list