[Aptitude-devel] Design notes on producing better (or at least more predictable) resolver results.

Sat Feb 14 16:35:54 UTC 2009

  Hi, all,

  I've been reminded recently of a lingering issue with the aptitude
resolver.  When I designed it I didn't worry too much about the order in
which solutions were produced, with the idea being that users would just
skip over ones they didn't like, using approve/reject to push the
resolver towards ones they did like.  Since the first answer will never
always be ideal, I figured it would be better to give people tools they
could use to solve the cases when it isn't.

  In practice, what seems to happen is that people see a solution that
removes packages, they reflexively tell aptitude to remove the packages,
panic, and write the nearest mailing list / forum crying for help and
talking about how aptitude sucks.  Now, I don't really care how many
people think my code sucks, but I do care about it being useful, and it
seems to me there may be room for improvement here.

  The thing that people find most surprising is that aptitude sometimes
suggests removing packages when installing a few packages would also fix
the problem.  The resolver is designed around a "best-first" search
strategy, where problems are assigned a numerical score and the most
promising solution is pursued.  Because I was stuck in this mental
model, I couldn't see a good way to provide a more reliable ordering of
solutions -- after all, installing packages is "bad" and so is removing
packages, so all we're debating is how many installs are as "bad" as one
removal.  Also, penalizing removals too heavily seems dangerous, since
it can lead to a situation where the resolver thrashes around forever
trying to avoid a necessary removal.

  But in fact, I have trouble thinking of any situation where it is
reasonable to prefer removals to installs, and I can't think of *any*
situation where it is *un*reasonable to prefer installs to removals
(even if it's not what the user wants, at least it doesn't make them go
"WTF!?!").  And I've finally thought of a clean, simple, easy mechanism
that will provide the desired predictability, while also allowing some
elegant optimizations that should defuse the worries I had about the
performance impacts of going down this road.  I'm still not sure whether
the semantics are right, which is part of why I'm writing this list.

-------- Intermezzo in breadth search ----------

  To explain this, it's useful to consider two other subsystems of
aptitude first: the dependency chain search and the safe resolver.

  The dependency chain search uses a simple breadth-first algorithm to
find chains of dependencies between two packages.  Because it's
breadth-first, it always finds the shortest chains first.  But not all
chains are considered equal: for instance, chains that include only
installed packages are generally more interesting than chains that
include some installed and some not-installed packages.  At least, this
is certainly true for the purposes that the dependency chain search is
meant for (e.g., finding out why a package is pinned on the system).

  In order to ensure that the most "interesting" chains are shown first,
regardless of their length, the dependency chain finder actually
performs a series of searches using various parameters.  I'll call these
"search tiers".  The first search tier only looks at installed packages,
only looks at Depends relationships, and doesn't follow dependencies
that have more than one alternative.  The next tier is the same, but it
will follow dependencies that have more than one alternative ... and so
on, down to a final search that will find any chain no matter what its
characteristics.  These searches are combined in a way that removes
duplicates, so no chain will ever be shown more than once.

  The safe dependency resolver restricts the normal resolver to only
install the candidate version of a package or hold it at its current
version. It does this by selectively rejecting all the other versions of
each package; any search node that touches one of these versions will be
placed on the internal "deferred" queue.  If the dependency resolver ran
interactively, the user could actually un-reject some of these versions
and start seeing solutions that had been deferred because they contained
a rejected version.

  Comparing this to how we search for dependency chains, it is clear
that this is a sort of "proto-tier" of dependency results.  My idea is
to formalize a system of tiers for dependency results in a consistent
way that slots nicely into the existing logic.

-------------- Tier-ey Eyed Search --------------

  Tiers are, in effect, a redefinition of the "score" of a search node.
Instead of being a simple number, each score will be a pair (TIER,
SCORE) where both TIER and SCORE are integers.  Scores will be compared
by tier first, then by score, with lower tiers comparing HIGHER than
higher tiers: in other words, lower tiers are examined and produced
"before" higher tiers of the output.  So a node with a score of (3, 100)
will be shown before a node with a score of (4, 500) and after a node
with a score of (1, -20000).  (I might reverse the comparison, so that
lower tiers are also lower quality, but I thought of this first using
tiers to order the results, and it still makes more sense to me)

  Scores are calculated by summing up the scores attached to the
versions in a search node (actually, the difference between those scores
and the scores attached to the "current" versions that they replace),
and adding some extra "control" weights.  In contrast, while tiers are
*defined* on a version-by-version basis, the tier of a search node is
*calculated* by taking the *maximum* tier of any version that the node
contains.  This has the important effect that tiers always increase
monotonically as a search node is worked on, so if a search node
contains a version at tier 3, no solution that is a successor of this
search node will be produced before all solutions at tier are
discovered.

  As an extension, I might implement support for "joint tiers": saying
that the presence of a certain collection of versions causes a search
node to have tier N or higher, regardless of the tiers of the individual
versions.  I use this below for the do-nothing tier.

  The default tiers might require some work, but here's what I'm
currently thinking:

  Tier 1,000 (a.k.a. "safe-resolver"): solutions that only install the
                                       candidate versions of packages
                                       and/or keep packages at their
                                       current versions.

  Tier 2,000 (a.k.a. "do-nothing"):    the unique solution that cancels
                                       all the user actions; implemented
                                       via a joint tiering.

  Tier 3,000 (a.k.a. "spring-cleaning"):
                                       solutions that remove packages.

  Tier 4,000 (a.k.a. "living-dangerously"):
                                       solutions that install non-default
                                       versions of packages.

  I decided to show non-default versions after options that remove
packages because otherwise anyone who has more than one release in their
sources.list will be presented with any number of possible solutions
involving installing packages from other releases, even if the "obvious"
solution is to remove some packages.  People who are working with other
releases can either set the default release appropriately or pin
packages that they care about, or just skip past the initial suggestions
to remove stuff.  It might also be useful to consider merging these two
tiers and just let the current balancing mechanism decide in which order
removals and non-default versions should be shown.

  I'm not sure whether the do-nothing tier should be where it is;
arguably it should be after living-dangerously so that removing a
library or installing a package from experimental behave as expected.

  The reason for starting at 1,000, and for leaving gaps of 1,000
between tiers, is that I intend to extend the resolver hints mechanism
with a facility that allows a package to be placed at a particular tier.
(I'm not quite sure how conflicts between hints will be resolved: taking
the upper or lower bound seems wrong, as does adding the tiers together.
Maybe pick an arbitrary one and issue an APT error?)  Leaving big gaps
between the tiers ensures that users have room to fit their custom tiers
in between.  And while it's possible to have negative tiers, starting at
1,000 means that users don't have to realize that it's possible in order
to insert a tier before the first one.

  This gives users a nice and highly predictable way to control the
problem resolver, based on expressing their preferences about the
relative "goodness" of various package versions.

------------ Conflicts and Tiers -------------

  The second important thing about tiers is performance-related.  While
tiers could be implemented right now, by simply tweaking the way search
nodes are scored (and this might be the first implementation), the fact
that they increase monotonically allows a key optimization to be
applied.

  To explain that, let me back up and explain something called
"conflicts".  These are not package conflicts expressed in dependency
fields; inside the resolver, a "conflict" is simply the knowledge that
versions A and B are mutually incompatible.  For instance, version A
might require a package that conflicts with B.  As the resolver runs, it
builds up more and more complex conflicts: for instance, version C
requires D or A, both of which conflict with B, and hence C conflicts
with A.  These are critically important in optimizing some common
situations; they effectively allow entire branches of the search space
to be thrown out unexamined.

  Conflicts are created when a search node has no legal successors.
This happens when aptitude picks a dependency that's broken under the
node and attempts to resolve it, but all the resolutions are forbidden
by choices the resolver has already made.  For instance, aptitude might
find the dependency

    vim 7.2 -> libncurses5 (>= 5.6)

  which is broken because vim 7.2 is installed by the current search
node, but the same node also installs libncurses5 version 5.5.  Because
aptitude never allows a node's successors to alter packages that are
already modified by the node itself, there are no legal successors to
this search node.  The important thing is what aptitude does next: it
figures out the reason that each of the possible resolutions was
disallowed, and stores the set of these reasons as a conflict.  For
instance, in this example, the reasons are that we installed vim 7.2
(preventing us from, e.g., removing vim or installing another version)
and that we installed libncurses version 5.5.  So, we can conclude that
vim 7.2 is not compatible with libncurses version 5.5.  By repeatedly
applying this rule, aptitude can quickly detect when a search node will
not yield a solution, and hence can be discarded immediately.

  Here's the key thing about tiers in this context: once a node "enters"
a particular tier, it and its successors will never again be in a lower
tier.  That means that if we can detect that all the successors of a
node have tier N, we might as well treat the node itself as having tier
N -- and if there is a set of decisions that "forced" the node to tier
N, those decisions can be accumulated and stored as "tier promotions".
Each tier promotion is a set of versions combined with a tier number T;
any search node containing those versions is automatically elevated to
tier T.  As with conflicts, these should backpropagate (so if all the
successors of a node contain promotions to tier T or higher, that node
should be promoted to tier T, and a new tier promotion might be
created).

  So to take a simple example, suppose that vim 7.2 is tier 1, but each
version of libncurses5 which is >= 5.6 is tier 2.  When we encounter the
above dependency in a search node that installs vim 7.2, we should
recognize that the presence of vim 7.2 *forces* us to install a version
of libncurses at tier 2, and hence vim 7.2 should be promoted to tier 2.

  It's interesting to note that the mechanisms for conflicts and tiers
look extremely similar.  In fact, from an implementation point of view,
it would be useful to add two extra tiers:

"INFINITY":      search nodes that are unresolvable and will be
                 discarded.
"INFINITY" - 1:  search nodes that violate a constraint that was
                 manually added by the user; they will be set aside
                 until the constraints are modified (then reset to their
                 default tiers and dumped back on the open queue).

  For "INFINITY", substitute INT_MAX. :-)  This allows conflicts and
tiers to be unified with a minimum of special casing: we'll need to
special-case the code to set aside / discard these search nodes, but at
least the code to detect and insert the conflicts will remain more or
less the same.

  This also introduces another new feature: the ability to do early
detection of nodes that violate user constraints.  The reason this
matters is, among other things, that I've actually seen cases in the
past where *user constraints* caused the resolver to blow up,
specifically because aptitude doesn't accumulate the same high-level
information that it does for dead-ends in the search, and so it would
keep trying all possible search paths that led up to a constraint
horizon, rather than recognizing that it was really doing the same thing
over and over again.  (this is why, IIRC, aptitude currently prioritizes
dependencies that "touch" a user constraint; that ensures that it finds
rejected versions as soon as possible, rather than after it's already
branched several times)

  I estimate this at between one and two full days' worth of hacking --
it's mostly a matter of getting back in touch with the details of the
resolver engine, then tweaking the conflict mechanism to use tiers
instead.  So given the accuracy of my estimates in the past, it should
be not more than a 40-hour week of work. :-)

  Daniel