[Aptitude-devel] r3015 - in branches/aptitude-0.3/aptitude: . src/generic/problemresolver
Daniel Burrows
dburrows@costa.debian.org
Wed, 20 Apr 2005 17:19:38 +0000
Author: dburrows
Date: Wed Apr 20 17:19:35 2005
New Revision: 3015
Added:
branches/aptitude-0.3/aptitude/src/generic/problemresolver/model.tex
Modified:
branches/aptitude-0.3/aptitude/ChangeLog
Log:
Add a writeup of the theory behind the problem resolver.
Modified: branches/aptitude-0.3/aptitude/ChangeLog
==============================================================================
--- branches/aptitude-0.3/aptitude/ChangeLog (original)
+++ branches/aptitude-0.3/aptitude/ChangeLog Wed Apr 20 17:19:35 2005
@@ -1,3 +1,10 @@
+2005-04-20 Daniel Burrows <dburrows@debian.org>
+
+ * src/generic/problemresolver/model.tex:
+
+ Add some discussion of the theoretical basis for the new problem
+ resolver.
+
2005-04-10 Daniel Burrows <dburrows@debian.org>
* src/cmdline/cmdline_do_action.cc, src/generic/aptitude_resolver.cc, src/generic/aptitude_resolver.h:
Added: branches/aptitude-0.3/aptitude/src/generic/problemresolver/model.tex
==============================================================================
--- (empty file)
+++ branches/aptitude-0.3/aptitude/src/generic/problemresolver/model.tex Wed Apr 20 17:19:35 2005
@@ -0,0 +1,569 @@
+\documentclass[letterpaper]{article}
+
+\usepackage{fancyhdr}
+\usepackage{url}
+\usepackage{pgf}
+\usepackage{amsmath}
+\usepackage{amsthm}
+\usepackage{semantic}
+
+\pagestyle{fancy}
+
+\newtheorem{theorem}{Theorem}
+\newtheorem{lemma}[theorem]{Lemma}
+
+\theoremstyle{definition}
+\newtheorem{definition}[theorem]{Definition}
+
+\theoremstyle{remark}
+\newtheorem*{note}{Note}
+\newtheorem*{remark}{Remark}
+
+\lhead{Modelling and Resolving Software Dependencies}
+\rhead{Daniel Burrows}
+
+\author{Daniel Burrows \url{<dburrows@debian.org>}}
+\title{Modelling and Resolving Software Dependencies}
+
+\newcommand{\pkg}[1]{\text{\url{#1}}}
+\newcommand{\apt}{\pkg{apt}}
+
+\renewcommand{\P}{\mathcal{P}}
+\newcommand{\V}{\mathcal{V}}
+\newcommand{\I}{\mathcal{I}}
+\newcommand{\D}{\mathcal{D}}
+\newcommand{\U}{\mathcal{U}}
+\newcommand{\pkgof}[1]{PkgOf(#1)}
+
+\newcommand{\st}{\ |\ }
+
+\newcommand{\idist}[2]{\langle#1,#2\rangle}
+\newcommand{\len}[1]{\lvert#1\rvert}
+
+\newcommand{\nsol}[1]{\overset{#1}{\Rightarrow}}
+\newcommand{\nsolmany}[1]{\overset{#1}{\underset{*}{\Rightarrow}}}
+
+\mathlig{->}{\rightarrow}
+\mathlig{=>*}{\overset{*}{=>}}
+
+\begin{document}
+\maketitle
+
+\begin{abstract}
+ Many Linux distributions and other modern operating systems feature
+ the explicit declaration of (often complex) dependency relationships
+ between the pieces of software that may be installed on a system.
+ Resolving incompatibilities between different pieces of software is
+ an NP-complete problem, and existing solutions require the user to
+ manually resolve many ``simple'' dependency problems.
+
+ I present a simplified, abstract model of dependency relationships,
+ and a restartable technique based on best-first-search to calculate
+ resolutions.
+\end{abstract}
+
+\section{Introduction}
+
+It is common nowadays for hundreds or thousands of software packages
+to be installed on a single computer system, and for many of these
+software packages to interact with one another. Because some
+combinations of software packages will not function properly -- for
+instance, an application program might require a particular version of
+a graphics library -- installing software manually while avoiding
+unexpected breakage is an increasingly unpleasant chore.
+
+To address this problem, programs known as \emph{package systems} were
+developed. A package system typically manages \emph{packages} that
+consist of the files of a program or library, along with metadata such
+as the name and version of the package, a brief description of what it
+contains, and (most importantly for our purposes) a list of which
+other packages it requires or is incompatible with. The package
+installation software warns the user upon any attempt to install or
+remove software that would violate these constraints.
+
+Unfortunately, the early versions of these tools replaced the chore of
+manual software installation with the chore of dependency resolution:
+for instance, installing a package of the popular game \pkg{wesnoth}
+might produce an error indicating that the user should find, download,
+and install a new version of the \pkg{SDL} graphics library. A new
+version of the \pkg{kmail} mail client might require the user to
+upgrade his or her entire operating system, generally indicating this
+fact by a slew of cascading error messages.
+
+As a result of this ``dependency hell'', a new generation of automated
+tools, such as \apt and \pkg{up2date}, was developed. These
+tools maintain a database of software installed on the user's system,
+along with software available from any number of remote sites. To
+install or remove a piece of software, the user issues a request to,
+for instance, ``install wesnoth'' or ``upgrade kmail''. The
+installation tool will proceed to find a set of package installations
+or removals which leads to a consistent result. Typically, it then
+presents this list of actions to the user and prompts for
+confirmation; the user can either accept the proposed solution, or
+reject it and proceed to fix the problem in a fully manual way.
+
+This approach has two major drawbacks:
+
+\begin{enumerate}
+\item The user interface for resolving dependencies is a ``take it or
+ leave it'' proposition: there is no way for the user to ask the
+ algorithm to find another solution. This means that if the
+ algorithm makes a poor or undesired choice (which, as I will argue
+ below, will inevitably occur sometimes) the user is forced to fall
+ back to fully manual operation.
+
+\item In at least some cases (particularly \apt), the algorithm used
+ in resolving dependency conflicts deals poorly -- which is a
+ euphemism for ``not at all'' -- when there are more than two
+ versions of a package to choose from\footnote{More precisely, if
+ more than one version other than the currently installed version
+ (if any) exists.}. For instance, if versions 1, 2, and 3 of
+ package \pkg{A} are available, with 2 being the default version of
+ the package, and if package B requires version 3 of package A, when
+ the user tries to install package B, he or she will receive an error
+ message indicating that the dependency on A cannot be fulfilled.
+\end{enumerate}
+
+Another general difficulty in solving dependencies in these systems is
+that the package systems contain many features which, although they
+are arguably ``syntactic sugar'', tend to cause algorithms that
+operate on packages to become strewn with complex iteration constructs
+and unpleasant corner cases. Although some attempts have been made to
+find general models of package dependencies (for instance, the
+internal structures of \apt can represent either Debian or Red Hat
+packages), the models with which I am familiar work by taking a
+``greatest upper bound'' of the systems that they cover, leading to an
+even more convoluted architecture.
+
+\begin{note}
+ I have not yet performed an extensive survey of package systems, and
+ it may be that there already exist systems that fix one, two, or all
+ of the drawbacks listed above.
+\end{note}
+
+\section{Example: the Debian Package System}
+
+In the Debian package system, each package may have one or more
+versions, but at most one version of each package may be installed at
+any given time. The basic relationships between packages are
+\emph{dependencies} and \emph{conflicts}. For instance, version
+6.14.00-1 of the \pkg{tcsh} command shell depends on version
+2.3.2.ds-4 or greater of the \pkg{libc6} package, and version 5.4-1 or
+greater of the \pkg{libncurses5} package: it may not be installed
+unless an appropriate version of each of these package is installed.
+On the other hand, the same package conflicts with all versions of the
+\pkg{tcsh-kanji} and \pkg{tcsh-i18n} packages: \pkg{tcsh} may not be
+installed at the same time as either of these packages.
+
+A single dependency may name several packages, combined with and OR
+operator (indicated by a vertical bar). For instance, version
+\pkg{1.4.48} of the \pkg{debconf} package depends upon
+\verb!debconf-i18n | debconf-english!; in order to install debconf,
+you must also install one of these two packages.
+
+Last but not least, dpkg supports what are known as ``virtual''
+packages. Each version of each package may \emph{provide} one or more
+package names; each named package will become a virtual package.
+Virtual packages can have the same name as a normal package, in which
+case they are known as \emph{mixed} virtual, or they can exist only
+through being provided by a normal package, in which case they are
+known as \emph{pure} virtual. An unversioned dependency on a virtual
+package will be satisfied if any provider of the name is installed,
+while an unversioned conflicts will require that all providers of the
+name are removed -- but as a special case, direct or implicit
+self-conflicts are ignored. Versioned dependencies and conflicts do
+not, as of this writing, follow provided package names.
+
+For instance, the virtual package \pkg{mail-transport-agent} is used
+to identify packages containing mail transport agents. Every such
+package both provides and conflicts with the virtual package, and
+packages that require a mail transport agent depend on
+it.\footnote{Due to a quirk in how \pkg{apt} resolves dependencies,
+ dependencies on a virtual package are required to include a real
+ package as an alternative: for instance, \pkg{bugzilla} depends on
+ \url{sendmail | mail-transport-agent}.}
+
+\section{An Abstract Model of Dependency Relationships}
+
+As can be seen in the previous section, real dependency systems are
+complex. This tends to complicate the business of reasoning about how
+to find solutions to dependency problems, and to cause algorithms that
+manipulate dependencies to become horribly messy. In situations like
+this, it is often a good idea to find a simpler, more mathematical
+model of the problem to be analyzed. Of course, it is well-known that
+package dependencies can be reduced to the satisfaction of Boolean
+equations, but such a reduction is arguably \emph{too} extreme: it
+certainly results in a mathematical model, but the model it produces
+hides the structure of the original problem. I will instead propose a
+model which is sufficient to describe any dependency problem (at least
+in Debian) and retains the structure of a package system.
+
+\subsection{Basic Concepts}
+
+In this simplified model, the only objects in the world are packages,
+versions of packages, and dependencies. Packages will typically be
+denoted by $p_1, \dots$; versions will typically be denoted by $v_1,
+\dots$; and dependencies are of the form $v -> \{v_1', \dots\}$,
+indicating that the version $v$ requires the versions $\{v_1',
+\dots\}$. The package associated with a version $v$ is denoted by
+$\pkgof{v}$.
+
+To represent the state of the entire system, the following sets are
+available:
+
+\begin{itemize}
+\item $\P$ is the set of all packages.
+\item $\V$ is the set of all package versions.
+\item $\D$ is the set of all dependencies.\footnote{Not the set of all
+ \emph{potential} dependencies, but the set of all dependencies
+ asserted in the current package system.}
+\end{itemize}
+
+\subsection{Reduction of Debian Dependencies to the Model}
+
+As claimed above, it is possible to reduce a Debian dependency system
+to this abstracted model. The reduction proceeds in approximately the
+following way:
+
+\begin{itemize}
+\item $\P$ is the set of Debian packages.
+\item $\V$ is the set of versions of those packages, plus one
+ additional version for each package. This version represents the
+ state of the package when it is not installed. Versions
+ corresponding to versions of the Debian package are indicated by $p
+ : n$ where $n$ is the version number, while the ``uninstalled''
+ version is indicated by $p_u$.
+\item For each dependency of the version $v$ of a package on
+ $\pkg{A}_1\ |\ \dots$, accumulate a set $S$ containing all the
+ matching versions of each named package, combined with every package
+ version that provides a named package (if the dependency is
+ unversioned).
+
+ For instance, if $v$ declares a dependency on \verb!A (>= 3.2) | B!,
+ versions $3.1$, $3.2$, and $3.3$ of package \pkg{A} are available,
+ versions $1$, $2$, and $3$ of package \pkg{B} are available, and
+ package \pkg{C} version $3.14$ provides \pkg{B}, then
+ $S=\{\pkg{A}:3.2,\pkg{A}:3.3,\pkg{B}:1,\pkg{B}:2,B:3,\pkg{C}:3.14\}$.
+
+ $\D$ contains $v -> S$ for every such dependency.
+\item For each conflict declared by a version $v$ on the package $p$,
+ accumulate a set $S$ containing all the \emph{non}-matching versions
+ of $p$ and insert $v -> S$ into $\D$. Furthermore, if the conflict
+ is not versioned, then for each package $p'$ and version $v'$ of
+ $p'$ such that $v'$ provides $p$, let $S=\{v'' \st \pkgof{v''}=p' \land
+ \text{$v''$ does not provide $p$}\}$ and insert $v -> S$ into $\D$.
+
+ For instance, if $v$ conflicts with \pkg{A}, of which versions 3.2
+ and 3.3 are available, and versions 2 and 3 of \pkg{B} provide
+ \pkg{A}, then $S=\{\pkg{A}:3.2,\pkg{A}:3.3,\pkg{B}:2,\pkg{B}:3\}$.
+
+ \begin{note}
+ In reality, extra care must be taken to screen out self-conflicts
+ in this process, but the description above is complicated enough
+ as it stands!
+ \end{note}
+\end{itemize}
+
+\begin{remark}
+ Although the above reduction is complicated to describe, its major
+ steps must be performed whenever any program is analyzing
+ dependencies: for instance, when listing all the versions that can
+ fulfill a dependency, it is necessary to iterate over all members of
+ each OR and to search their providing packages as necessary. Thus,
+ an ``on-the-fly'' reduction in an algorithm written for the generic
+ model is conceivably almost as efficient as an algorithm that works
+ with the Debian package structure directly.
+\end{remark}
+
+\subsection{Installations}
+
+An \emph{installation} represents a choice about which package
+versions to install; it is a function that maps each package to a
+version of that package. An installation $I$ is \emph{consistent} if
+it satisfies each dependency in the world: that is, for all $v ->
+\{v_1',\dots\}$ in $\D$, either $I(\pkgof{v}) \neq v$ or
+$I(\pkgof{v_i'})=v_i'$ for some $i$.
+
+If $I$ is an installation, then $I;p \mapsto v$ is a new installation
+which installs the version $v$ of the package $p$ and leaves all other
+packages unchanged:
+
+\begin{equation}
+ (I;p \mapsto v)(p') = \begin{cases}
+ I(p'), & p' \neq p \\
+ v, & p' = p
+ \end{cases}
+\end{equation}
+
+\section{The Dependency Resolution Problem}
+
+\subsection{Problem Definition}
+
+Let $I$ be an inconsistent installation. We would like to find a
+consistent installation that is ``similar to'' $I$. This is the
+dependency resolution problem. In a real package system, it
+corresponds to a situation in which the user's requests have resulted
+in an invalid state (with unsatisfied dependencies or conflicts); the
+goal of a package manager in this situation is to find a ``small'' and
+``good'' set of changes that result in a valid state.
+
+\begin{note}
+ This problem is poorly defined: ``small'' and ``good'' are not
+ precise terms. The goal, from a UI point of view, is to not change
+ too many packages, but to make reasonable decisions: for instance,
+ if the user has requested that some packages be installed and these
+ installations cause dependency clashes, ``solving'' the problem by
+ cancelling the installations is probably not the desired result.
+ However, while it might have obviously wrong solutions, \emph{this
+ problem has no principled correct solution}, because it is
+ possible that if several different users view a single dependency
+ problem, each prefers a different solution from the others. In
+ other words, some of the information necessary to find the ``best''
+ solution is inside the user's head.
+
+ Thus, the best we can do is to define some criteria for ``goodness''
+ and allow the user to see alternatives to an undesired solution.
+\end{note}
+
+\subsection{Dependency Resolution is NP-complete}
+
+In order to find a ``good'' solution, we must first find \emph{any}
+solution to the existing set of dependencies. Unfortunately, as shown
+below, this is an NP-complete problem.
+
+\begin{theorem}
+ Dependency resolution is NP-complete.
+\end{theorem}
+
+\begin{proof}
+ Proof is by reduction from CNF-SAT to the problem ``does a
+ consistent installation $I$ exist?''
+
+ Create one package for each variable and for each clause in the SAT
+ problem. For each variable $x$, let the versions of the
+ corresponding package be $x:0$ and $x:1$; for each clause, create
+ exactly one version. For each SAT clause let $v_c$ be the package
+ corresponding to the clause, and insert $v_c -> S$ into $\D$, where
+ for each literal of a variable $x$ appearing in the clause, $S$
+ contains $x:0$ if $x$ is a negative literal and $x:1$ if $x$ is a
+ positive literal. This reduction is clearly polynomial-time; I
+ claim that a solution to this set of dependencies exists if and only
+ if a solution to the corresponding SAT problem exists.
+
+ Suppose that there is an assignment that solves the SAT problem.
+ Define an installation $I$ as follows: if $p$ corresponds to a
+ clause, $I(p)$ is the single version of $p$; if $p$ corresponds to a
+ variable $x$, $I(p)=x:0$ if $x$ is FALSE in the SAT solution and
+ $x:1$ if $x$ is TRUE in the SAT solution. Now, consider any
+ dependency $v -> S$. From the construction above, $S$ corresponds
+ to a clause of the SAT instance. At least one literal in this
+ clause must be assigned the value TRUE (otherwise the clause is not
+ satisfied); let $x$ be the corresponding variable. If the literal
+ is positive, then (by construction) $S$ contains $x:1$ and (since
+ $x$ must be assigned the value TRUE) $I(p)=x:1$. On the other hand,
+ if the literal is negative, then $S$ contains $x:0$ and $I(p)=x:0$.
+ Thus, $I$ is a consistent installation.
+
+ On the other hand, suppose that there is a consistent installation
+ $I$. For all variables $x$, let $p$ be the corresponding package;
+ if $I(p)=x:0$, assign FALSE to $x$, and if $I(p)=x:1$, assign TRUE
+ to $x$. Now consider any clause in the SAT problem: from the
+ construction above, $\D$ contains a dependency $v_c -> S$ where
+ $v_c$ is the single version of the package corresponding to the
+ clause. Since $I$ must map the clause's package to $v_c$ and $I$ is
+ consistent, there must be a version $v \in S$ such that
+ $I(\pkgof{v})=v$; i.e., $I$ installs $v$. But from the
+ construction, $v^p$ corresponds to some variable $x$, and $v$ is
+ either $x:1$, where $x$ appears as a positive literal in the clause
+ or $x:0$, where $x$ appears as a negative literal in the clause.
+ Thus, the clause is satisfied, and so the assignment described above
+ satisfies all clauses.
+
+ Therefore, dependency resolution is NP-complete.
+\end{proof}
+
+\subsection{Don't Panic}
+
+Although the problem at hand is NP-complete in general, there is good
+reason to think that the instances that arise in practice are
+tractable. It is well-known that many NP-complete problems have
+``easy'' and ``hard'' instances: some instances of the problem can be
+solved quickly by relatively naive algorithms, while others are
+intractable even using the most sophisticated techniques available.
+
+In the particular case of package dependencies, the traditions that
+have grown up around package tools seem to encourage the creation of
+easy instances of the dependency problem; furthermore, the user's
+desired installation is typically consistent or ``almost consistent''
+(meaning that few dependencies are violated). It is usually
+straightforward, when solving problems in an ad hoc way, to isolate a
+small part of the dependency graph in which the problem occurs; for
+instance, by informally applying a constraint such as ``don't solve
+dependencies by removing core library packages''. Once this is done,
+the problem can be declared either solvable or unsolvable on the basis
+of a quick analysis of that region of the graph.
+
+In fact, when even relatively basic search techniques are applied to a
+typical dependency problem, the difficulties that arise are related
+not to a paucity of solutions, but rather to an excess of them. That
+is, finding \emph{a} solution is easy, but finding the \emph{right}
+solution is more problematic. Indeed, in the Debian framework there
+is always at least one solution: removing every package on the system
+will satisfy all the dependencies. However, for obvious reasons, this
+is not a solution that we want to produce!
+
+\subsection{Solving Dependencies Through Best-First Search}
+
+I propose the use of a relatively simple algorithm -- best-first
+search -- to resolve dependencies. To briefly review, best-first
+search works by keeping a priority queue, known as the ``open'' queue,
+of potential (or partial) solutions. The priority queue is sorted
+according to some heuristic function that quantifies the ``goodness''
+of each node (often in terms of nearness to a full solution). In each
+iteration of the algorithm, the ``best'' partial solution is removed
+from the queue. If it is a full solution, the algorithm terminates;
+otherwise, each ``successor'' node is generated and placed in the
+queue.
+
+There are two main issues to resolve:
+
+\begin{itemize}
+\item How should successors be generated?
+\item What heuristic should be used?
+\end{itemize}
+
+To generate successors, we could simply enqueue all possible changes
+to a single package. However, this would result in a gigantic
+branching factor (over 1500 branches at each step in the current
+Debian archive), and it would cause the algorithm to consider
+adjusting packages that are utterly irrelevant to the problem at hand,
+as well as changing a package multiple times (which can lead to
+choices being made for reasons that are obscure to the user). A more
+focussed approach is needed.
+
+Similarly, we could simply use the number of currently unsatisfied
+dependencies as our heuristic, but this does not provide any guidance
+as to how dependencies should be resolved. If \pkg{A} depends on
+\pkg{B}, \pkg{A} is installed, and \pkg{B} is not installed, it is
+usually better to install \pkg{B} than to remove \pkg{A}; however, a
+straight count of broken dependencies might prefer either outcome.
+
+\subsubsection{Generating Successors to a Partial Solution}
+
+An obvious way of generating the successors of a given solution is to
+do it on the basis of unsatisfied dependencies. If the installation
+$I$ does not satisfy the dependency $v -> S$, we know that $v$ is
+installed but no member of $S$ is. To resolve this dependency, we can
+either install a different version of $\pkgof{v}$ or install any
+element of $S$. Applying this rule to each ``broken'' dependency in
+turn will produce a set of successors that each solve at least one
+dependency (although they may break others in the process).
+
+However, this approach still has the potential to ``run in circles''
+by installing one version of a package, encountering broken
+dependencies, and then moving to a different version (possibly after
+resolving some dependencies of the intermediate version). The problem
+resolver of \pkg{apt}, for instance, sometimes confuses users by
+exhibiting this behavior. To fix this, I propose a simple rule: a
+solution should never modify a package twice. Formally, if the
+original installation was $I_0$, then for any $I$, the installation
+$I'=I;p \mapsto v$ is only a successor of $I$ if $v \neq
+I_0(\pkgof{v})$ and $I(\pkgof{v})=I_0(\pkgof{v})$.
+
+One might wonder whether this approach risks overlooking solutions:
+for instance, maybe it is really necessary to ``go in circles'' in
+order to find a particular solution. As shown below, if a solution
+cannot be generated by repeatedly using this limited set of
+successors, it is only because there is a ``simpler'' version of the
+solution (one that modifies the states of fewer packages) that is
+generated. To prove this, I first will introduce some definitions and
+notations:
+
+\begin{definition}
+ Let $I_1$, $I_2$ be installations.
+
+ \begin{equation}
+ \idist{I_1}{I_2}=\len{\{p \st I_1(p) \neq I_2(p)\}}
+ \end{equation}
+\end{definition}
+
+\begin{definition}
+ Let $I_1$, $I_2$ be installations. An installation $I'$ is a
+ \emph{hybrid} of $I_1$ and $I_2$ if for all $p$, either
+ $I'(p)=I_1(p)$ or $I'(p)=I_2(p)$.
+\end{definition}
+
+\begin{definition}
+ If $I'$ is a successor of $I$ with respect to $I_0$, then $I
+ \nsol{I_0} I'$. If there exist $I_1, \dots, I_n$ such that $I_1
+ \nsol{I_0} I_2 \nsol{I_0} \dots \nsol{I_0} I_n$, then $I_1
+ \nsolmany{I_0} I_n$.
+\end{definition}
+
+\begin{lemma}
+ Let $I_c$ be any consistent installation (if one exists) and $I_0$
+ be any installation. For all hybrids $I$ of $I_0$ and $I_c$, either
+ $I$ is consistent or there exists a successor $I'$ of $I$ such that
+ $I'$ is a hybrid of $I_0$ and $I_c$, and $\idist{I'}{I_c} <
+ \idist{I}{I_c}$.
+\end{lemma}
+
+\begin{proof}
+ Consider any hybrid $I$ of $I_0$ and $I_c$ such that $I$ is not a
+ solution. Since $I$ is inconsistent, there is a dependency $v->S$
+ such that $I(\pkgof{v})=v$ and for all $v' \in S$, $I(\pkgof{v'})
+ \neq v'$. However, since $I_c$ is consistent, either
+ $I_c(\pkgof{v}) \neq v$ or there exists a $v' \in S$ such that
+ $I_c(\pkgof{v'})=v'$.
+
+ Suppose that $I_c(\pkgof{v}) \neq v$. Since $I$ is a hybrid of
+ $I_0$ and $I_c$, $I_0(\pkgof{v})=v$. Thus, $I \nsol{I_0} I'$, where
+
+ \begin{equation}
+ I'=I;\pkgof{v} \mapsto I_c(\pkgof{v})
+ \end{equation}
+
+ On the other hand, if $I_c(\pkgof{v'})=v'$ for some $v' \in S$, note
+ that as before, $I_0(\pkgof{v'}) \neq v'$. Therefore, $I \nsol{I_0}
+ I'$, where
+
+ \begin{equation}
+ I'=I;\pkgof{v'} \mapsto v'
+ \end{equation}
+
+ In either case, clearly $I'$ is a hybrid of $I_0$ and $I_c$, and
+ $\idist{I'}{I_c}<\idist{I}{I_c}$, proving the lemma.
+\end{proof}
+
+\begin{theorem}
+ For any consistent installation $I_c$ and any inconsistent
+ installation $I_0$, there exists a consistent installation $I_c'$
+ such that $I_c'$ is a hybrid of $I_0$ and $I_c$, and $I_0
+ \nsolmany{I_0} I_c'$.
+\end{theorem}
+
+\begin{proof}
+ Proof is by repeated application of the previous lemma. For each
+ inconsistent hybrid $I$ of $I_0$ and $I_c$, let $I^{+}$ be the
+ hybrid shown to exist in the previous lemma, and define a sequence
+ $I_1, \dots$ as follows:
+
+ \begin{equation}
+ I_k = \begin{cases}
+ I_{k-1} & \text{if $I_k$ is consistent} \\
+ I_k^{+} & \text{otherwise}
+ \end{cases}
+ \end{equation}
+
+ I claim that this sequence converges; i.e., that for some finite $n$
+ and all $m>n$, $I_n=I_m$. Proof: let $D_k=\idist{I_k}{I_c}$ and
+ $n=\idist{I_0}{I_c}$. By the previous lemma, $D_k \geq D_{k+1}$ for
+ all $k$, and $D_k=D_{k+1}$ if and only if $I_k$ is a solution.
+ Thus, if $I_k$ is not a solution, we have $D_k \leq n-k$. But by
+ definition, $D_k \geq 0$ for all $k$, so clearly $I_{n+1}$ is a
+ solution (else we have $0 \leq D_{n+1} \leq -1$).
+
+ Therefore, the theorem holds with $I_c'=D_{n+1}$.
+\end{proof}
+
+\end{document}