[Aptitude-devel] external solver support?

Thu Sep 11 14:58:12 UTC 2008

On Thu, Sep 11, 2008 at 10:21:58AM +0200, Stefano Zacchiroli <zack at debian.org> was heard to say:
> On Tue, Sep 09, 2008 at 07:10:14PM -0700, Daniel Burrows wrote:
> > I'm not sure what the motivation is for creating a plug-in
> > architecture rather than just working on apt directly.  It seems like
> > a way to complicate the problem and introduce a lot of fragility in a
> > fairly central piece of system code.
> 
> The original motivation was specific of the Mancoosi project [1]. As we
> are going to test several different solving algorithms, we would like to
> have a way to test them without affecting directly the package manager
> code base. Of course we don't want to forcibly push our changes to the
> package managers for that reason, but since when I discussed with
> Michael he was excited by the idea, I thought about pushing it forward
> to other package managers as well.
> 
>   [1] http://www.mancoosi.org

  Looks like an interesting project.  I do wonder how much work we
really need on dependency solvers: aptitude's is quite naive, yet it
solves virtually everything I throw at it in seconds and typically
produces reasonable results.  The main deficiency I would like to fix
(but I'm not sure how) is that its answers are not always very
predictable: from a practical point of view, we would like package
maintainers to be able to specify things like the default resolution
of a dependency, and aptitude doesn't have much support for that.  A
second deficiency is that I would like to be able to partition the
dependency problem into "independent sub-problems" -- not just
algorithmically, but also interactively, so that the user can explore
the individual parts of the problem rather than having to deal with the
whole thing at once.  That will require a radical change to how the
resolver is defined and used; again, not really something that you can
accomplish with a plug-in architecture.  A third issue is that the
resolver should know about which packages are going to be unused and
take that into account (that would make it less eager to, e.g., remove
all of Gnome to solve a dependency).

  So I guess the first and third points are fixable by dropping new
solvers in.  The third is a pretty simple matter of finding a
reasonably efficient way to add a list of unused packages to each
search node.  Do you have ideas about how to fix the first one without
compromising speed or quality of results too much?

  In point 4, you mention a tradeoff between efficiency and the search
for a better solution.  Another tradeoff that you should keep in mind is
the tradeoff between algorithmic sophistication and comprehensibility.
That's comprehensibility for external developers (who need a reasonable
model of how the thing works and what their adjustments of its
parameters will do), for users (who need to understand why it produced
the result that it did), and even for internal developers (someday
someone who doesn't understand what you did will have to maintain your
code, and it might be you. :-) ).  Along with comprehensibility (but not
identical to it) is simplicity/robustness -- dragging in a bunch of
external code-bases (SAT-solvers, theorem provers, etc) seems problematic
to me.  I also heard a rumor at one point that you were considering
writing solvers in O'Caml, which doesn't seem to me like a viable choice
for a core system tool.

  I guess what I'm saying is: I'm happy to pay costs on the simplicity/
comprehensibility side of the ledger if there's a corresponding benefit.
I'm somewhat skeptical that there's enough room to get a big benefit from
purely algorithmic changes, and I'm worried about making apt more
complicated, particularly given that we already can't find enough people
to maintain it.  Adding even more complexity invites the Gnucash Syndrome.

  Er, I hope that doesn't sound too negative.  If I take my apt/aptitude
maintainer hat off, I'd love to just sit around and play with constraint
solving algorithms, but as a project maintainer I also have to think
about what's best for the long-term viability of the codebase.  Maybe
when you have some concrete solvers that are amazingly great and deal
with every upgrade we could ever see with no user interaction and in ten
lines of simple code I'll be less skeptical. :-)

  I'm curious how you intend to run a competition for upgrade tools.
In contrast to SAT-solvers and theorem provers, with dependency
resolvers you have to pick a *good* solution, not just any solution,
and the really hard thing is that there's no well-defined way to know
whether one solution is really better than another!  It seems to me
that this would present a bit of a problem when grading algorithms.

  The biggest deficiency in existing algorithms is that they sometimes
produce results that aren't "sensible", but I don't see any good way to
define what a "sensible" result is, except maybe by collecting a large
number of situations that can occur and defining sensibility by hand in
each case.  But one of the problems in dependency resolution is that
you run into situations that are formally identical (i.e., the
dependencies look the same to the resolver), but where the desired
outcomes on the user's end are totally different; it all depends on what
the actual packages are that are being manipulated, which the computer
doesn't know -- unless you have some formidable AI.  (incidentally, that
was one of the main motivations behind making the aptitude resolver
interactive: no matter what you do, sometimes your resolver will
produce a stupid answer (where a stupid answer is one that makes the
user say, "my, what a stupid answer that was!"), and letting the user
override those answers is a big advantage)

> Most of your objections seems to come from the assumption of "one true
> solver" and I understand that, but at least in the interim we will play
> with several solvers (to find the "good" one, if exists) and in that
> scenario a plugin architecture is more appropriate.

  Not so much "one true solver"; it's more that I doubt we can get a
lot of benefit from purely algorithmic changes (which is all you can do
behind a plugin interface).  But is some room for improvement.

> > On the other hand, apt is a fairly scary codebase to work with,
> > which is one of the reasons aptitude has grown a lot of stuff that
> > really ought to be in apt (and I suspect the same is true for pbuilder
> > et al).
> 
> Yep, this is probably the reason why a lot of more advanced solving
> logics have been implemented during years elsewhere than in apt. Still,
> about your claim that the solver of aptitude should be in apt I'm
> missing something: what about the interactive part? Would you be OK with
> an API to interact with apt that let you interactively pause/restart the
> solver when needed? It is a bit obscure how can you get rid of that
> solver logics from aptitude otherwise ...

  That's a good point: the resolver_manager might not be appropriate
for apt, with all its threading code and hooks into aptitude.  But the
core resolver algorithm (problemresolver.h) just exports a function that
says "get me the next solution", and some functions to twiddle its
parameters.  That's entirely appropriate for a generic library.  (oh, I
think there is one synchronized variable, to check whether someone wants
to break out of the current get_next_solution call)

> >   I also don't think a plugin architecture will help with what I
> > consider the main difficulty with resolver algorithms, namely improving
> > their UI.  In order to improve the UI by introducing new algorithmic
> > capabilities, you have to change the interface between the resolver and
> > the main program.  For instance, the aptitude algorithm lets the user
> > see solutions beyond the first one the algorithm produces, and ask for
> > only solutions that don't remove a particular package -- two new UI
> > capabilities that you can't get with the vanilla apt resolver.  In
> 
> Yes, UI is a problem, and of course the problem is whether or not there
> is an API which will make happy everybody. My idea is to start drafting
> one that will satisfy aptitude needs and see how flexible we can get it.

  OK.  I think the first question is, how are you going to actually
implement a plugin system?  Will you run the resolver in-process as a
shared object, or via some sort of pipe interface like a download method?

  Once you have that decided, you'll need to figure out how to share the
model of the package database between the resolver and the main program.
If it's in-process, you can probably just share data structures;
otherwise I suggest a two-way protocol where the resolver asks for
information as it needs it (that's on the grounds that it's probably too
expensive to pass all the state over a pipe and parse it at the other
end).  For instance: "tell me what's broken", "tell me what I can
install to resolve this dependency", etc.

  Do you want the resolvers to work on the raw apt cache or on a
simplified model of it?  In aptitude, the code to resolve packages [0]
is factored out of the code that interprets the apt cache [1].  The
code that handles the apt cache directly is much smaller -- but it's
where most of the bugs have been (because, again, apt's model of the
world is quite hairy).

  [0] src/generic/problemresolver/problemresolver.h
  [1] src/generic/apt/aptitude_resolver_universe.{cc,h}

  Once I know the general shape of what you're trying to do, I could
help design an API/ABI that aptitude could use.

  I would like to reiterate my strong opinion that package dependency
resolution is not fundamentally an algorithmic problem: it's a UI
problem with algorithmic elements.  Designing better and better
algorithms is a worthy goal, but if you take that as your main target,
you could risk of chasing diminishing algorithmic returns at the expense
of the overall usefulness of your program.

  Daniel