[Aptitude-devel] external solver support?

Fri Sep 19 14:19:15 UTC 2008

On Fri, Sep 12, 2008 at 12:22:14PM +0200, Stefano Zacchiroli <zack at debian.org> was heard to say:
> On Thu, Sep 11, 2008 at 07:58:12AM -0700, Daniel Burrows wrote:

[snip]

> Yes, the *interactive* UI problem are out of the game for now, but I
> can't stop thinking that most of the interactiveness part can be
> addressed by a more expressive request language to be fed to the package
> manager (think about APT pinning). Having that one can later on hide the
> language using an interactive UI, still retaining a simple
> non-interactive solver. OK, this can induce duplicated work for the
> solver, but one can than cache clauses or stuff like that ...
> 
> [1] we have found even simple examples in which all solvers we have in
>     Debian fail to find a solution where on exists and can be easily
>     found by any SAT solvers

  Really?  The only cases I'm aware of that occur in the wild are things
like the bug that was recently reported against apt, which involve a full
system upgrade with lots of complicated dependencies.

> >   In point 4, you mention a tradeoff between efficiency and the search
> > for a better solution.  Another tradeoff that you should keep in mind is
> > the tradeoff between algorithmic sophistication and comprehensibility.
> > That's comprehensibility for external developers (who need a reasonable
> > model of how the thing works and what their adjustments of its
> > parameters will do), for users (who need to understand why it produced
> > the result that it did), and even for internal developers (someday
> > someone who doesn't understand what you did will have to maintain your
> > code, and it might be you. :-) ).  Along with comprehensibility (but not
> > identical to it) is simplicity/robustness -- dragging in a bunch of
> > external code-bases (SAT-solvers, theorem provers, etc) seems problematic
> > to me.
> 
> Well, you are assuming that the solver code is something which requires
> active maintenance and I'm not convinced about that. I still hope we can
> find *the* good solver and be done with that, but of course I understand
> your concern.

  Every piece of code needs to be maintained.  And even if it doesn't,
users and people coding against it need to understand how it works.
The most complicated algorithm I can think of that works as a complete
black box is sorting (and even there, if your lists are large enough,
you need to open it up).

  The only nontrivial piece of code that springs to mind that is
basically zero maintenance is TeX, and it only got that way by dint of
being totally frozen in capabilities and debugged for years.  (note that
*LaTeX* is not maintenance-free)

> > I also heard a rumor at one point that you were considering
> > writing solvers in O'Caml, which doesn't seem to me like a viable choice
> > for a core system tool.
> 
> That's not a big deal, and that's why I would like to have the ability
> to invoke _external_ solvers which can be written in whatever language.
> If people like to write them in OCaml they can do it, if the algorithm
> works nothing inhibit to then implement it in whatever other languages.
> Again I've a political point here: if the people working on algorithms
> like to program in OCaml I don't want to stop them, I would like to be
> able to tell them something like: «go implement this binary API, then we
> can try your solver with aptitude».

  Mhm.  I don't see a problem with enabling this, but someone should
rewrite the solver in C++ if you want it included in apt (at least,
that's my opinion).  This isn't so much a technical objection (although
the dependency / build-dependency situation around O'caml scares me):
every new language you introduce decreases the number of people who can
work on the project.  If it's off in a box by itself, at least it won't
affect the rest of the program, but it does mean that it'll gather dust.

  The premiere example of this is Gnucash, where you need to speak (IIRC)
C, Scheme, and Perl (plus the GTK+ dialect of C) in order to do any
meaningful work on it.

> >   I guess what I'm saying is: I'm happy to pay costs on the simplicity/
> > comprehensibility side of the ledger if there's a corresponding benefit.
> > I'm somewhat skeptical that there's enough room to get a big benefit from
> > purely algorithmic changes, and I'm worried about making apt more
> > complicated, particularly given that we already can't find enough people
> > to maintain it.  Adding even more complexity invites the Gnucash Syndrome.
> 
> The only additional layer I want to add to apt/aptitude is a clear
> interface between the package manager and the solver, possibly the
> *same* interface for the two. How do you consider this? Would it be an
> excessive cost according to your point of view? (OK, it is hard to tell
> without having yet the interface, but if yours is a "no-way" it would be
> pointless to start thinking about one :-))

  I don't see a problem with having this sort of an interface present.
I just want to make sure you're aware of the problems before you run up
against them. :-)

> >   I'm curious how you intend to run a competition for upgrade tools.
> > In contrast to SAT-solvers and theorem provers, with dependency
> > resolvers you have to pick a *good* solution, not just any solution,
> > and the really hard thing is that there's no well-defined way to know
> > whether one solution is really better than another!  It seems to me
> > that this would present a bit of a problem when grading algorithms.
> 
> I feel we are getting off-topic here :), but anyhow ...
> 
> It is not true that we are already at the point where we only care for
> the *good* solution. With our current package managers there are cases
> in which solutions exist, but they can't be found by our tools. This is
> a first deficiency to be addressed, but is entirely implementative and
> not algorithmic.

  Ok.

> Then of course we will move to finding the *best* solution, and that
> need to be done according to criteria. There are some "reasonable"
> criteria one can use by default, like SMART's policies (minimize
> download size, consider priority in alternatives, minimize installed
> size, ...). In the beginning we will just set some of them.
> 
> Then, we would like to move to user-specified optimization criteria. The
> idea is to find a small language (like APT pinning, but with a clear
> formalization, and with access to package properties) the user can write
> down in a configuration file and optimize wrt that. Algorithmically, at
> this point you will be leaving plain SAT solving and enter the harder
> area of multi-criteria optimization ...

  But dependency resolution is *inherently* about multi-criteria
optimization.  Throwing a SAT solver at the problem, if they aren't
rigged for this sort of thing (and I don't know if they are as I haven't
used them myself) is probably not going to produce terribly good results.
Of course, you're welcome to try; maybe I'm wrong. :-)

> > > Yes, UI is a problem, and of course the problem is whether or not there
> > > is an API which will make happy everybody. My idea is to start drafting
> > > one that will satisfy aptitude needs and see how flexible we can get it.
> > 
> >   OK.  I think the first question is, how are you going to actually
> > implement a plugin system?  Will you run the resolver in-process as a
> > shared object, or via some sort of pipe interface like a download method?
> 
> "plugin system" is quite a mouthful :), my goal would be the ability in
> apt/aptitude to invoke an *external process* as a solver. The
> requirement of the process to be external is on one hand to leverage
> implementation of solvers in whatever language, on the other to enable
> shipping solvers in Debian packages other than apt/aptitude.

  You can probably base it on the protocol used to communicate with
download helpers.

> > otherwise I suggest a two-way protocol where the resolver asks for
> > information as it needs it (that's on the grounds that it's probably too
> > expensive to pass all the state over a pipe and parse it at the other
> > end).  For instance: "tell me what's broken", "tell me what I can
> > install to resolve this dependency", etc.
> 
> Well, at least initially I think the whole state should be passed to the
> solver. I've a rough idea that sending that all in a binary format would
> be something like 400Kb, it shouldn't be that slow through a pipe (but
> more benchmarks are needed of course).

  Hm, I suppose.  The binary format is going to be harder to debug, but
I guess it should be pretty stable.

> >   Do you want the resolvers to work on the raw apt cache or on a
> > simplified model of it?  In aptitude, the code to resolve packages [0]
> 
> If possible, I would prefer to work on a simplified model. Ideally, the
> interface should not be apt-specific.

  Have I pointed you at my little write-up of aptitude's model yet?

    http://people.debian.org/~dburrows/model.pdf

  Nothing very exciting, but it might give you some ideas.

  Daniel