[Soc-coordination] Bootstrappable Debian - Report 3 + N + 1

P. J. McDermott pjm at nac.net
Tue Jul 31 07:23:41 UTC 2012


This is a report for the "Bootstrappable Debian" project [1][2] mentored
by Wookey and co-mentored by Jonathan Austin.

There seems to be some confusion about the ordinal number of this week's
report; the schedule numbers it fifth, but the previous report was the
third.  So I've numbered mine 3 + N + 1, where N is either 0 or 1
according to reader preference. :)

Copies of this report are sent to the debian-bootstrap [3] and
debian-embedded lists.

My first three reports were organized primarily into "Work Done" and
"Next Steps".  I've organized this one by area of work instead, covering
both what I've done and what I still plan to do.

[1]: http://wiki.debian.org/SummerOfCode2012/Projects#Bootstrappable_Debian
[2]: http://wiki.debian.org/SummerOfCode2012/StudentApplications/PJMcDermott
[3]: http://lists.mister-muffin.de/cgi-bin/mailman/listinfo/debian-bootstrap


Tools
=====

A number of package build tools (especially dpkg and sbuild, for our
testing purposes) need to be modified to support reduced dependency
information in those source packages that will be buildable in "staged"
forms.

dpkg
----

Back in May, Guillem Jover suggested [4] an alternative to
"Build-Depends-Stage1", etc. fields.  He referred to a document [5] he'd
written that proposes methods of conditionally reducing build
dependencies.  Two such methods are "build profiles" (with fields like
"Build-Depends: huge (>= 1.0) [i386 arm] <!embedded !bootstrap>, tiny")
and "purpose overrides" (with fields like "Build-Depends[embedded
bootstrap]: tiny").

On 2012-07-10, Wookey wrote [6] that purpose overrides "seems a lot
nicer than adding lots of Build-Depends-StageN fields".  Purpose
overrides would offer the same functionality that the previously
considered new fields would but also offer a more generic syntax.

Guillem then noted [7] that he meant to propose build profiles rather
than purpose overrides.  The latter proposal, like the
Build-Depends-StageN fields, involves duplicating build dependency
lists.

Out of curiosity, I implemented [8][9] build profiles in dpkg and found
profile specification parsing quite simple to add (much simpler than
pattern-based fields were to add – I had dpkg-checkbuilddeps handling
profiles in only about half an hour).

The patch makes dpkg-checkbuilddeps and Dpkg::Deps parse and reduce
build profiles (e.g. "<!stage1 !nodocs>") and makes dpkg-gencontrol add
a new "Build-Profile" field to packages built with a profile.
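As an illustration only (this is not the Perl code from the patch), the
following Python sketch shows what "reducing" a profile specification
could look like.  It assumes that a profile list behaves like an
architecture list: a negated list such as "<!embedded !bootstrap>" drops
the dependency when the active profile is one of those named, and a
positive list keeps it only for the named profiles.  The function names
are hypothetical, and alternatives ("|") are not handled.

```python
import re

# A trailing "<...>" group is treated as the profile list.  Version
# operators such as "<<" or ">=" live inside parentheses and never end
# the item with ">", so they are not matched by this pattern.
PROFILE_RE = re.compile(r"<([^<>]*)>\s*$")


def split_profiles(dep):
    """Split one dependency into (dependency without profiles, profiles)."""
    match = PROFILE_RE.search(dep)
    if not match:
        return dep.strip(), []
    return dep[:match.start()].strip(), match.group(1).split()


def applies(profiles, active):
    """Assumed semantics, modelled on architecture lists."""
    if not profiles:
        return True
    negated = [p[1:] for p in profiles if p.startswith("!")]
    if negated:
        return active not in negated
    return active in profiles


def reduce_build_depends(field, active=None):
    """Return the field value with profile lists evaluated and removed."""
    kept = []
    for dep in field.split(","):
        bare, profiles = split_profiles(dep)
        if bare and applies(profiles, active):
            kept.append(bare)
    return ", ".join(kept)


field = "huge (>= 1.0) [i386 arm] <!embedded !bootstrap>, tiny"
print(reduce_build_depends(field))               # no profile active
print(reduce_build_depends(field, "bootstrap"))  # "bootstrap" profile
```

With no profile active, both dependencies remain; with the "bootstrap"
profile active, only "tiny" is left.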

The interfaces for specifying a build profile are modelled after those
for specifying a host architecture.  They include a "-P<profile>" option
to dpkg-buildpackage and dpkg-checkbuilddeps and a "DEB_BUILD_PROFILE"
environment variable to be used by debian/rules, dpkg-checkbuilddeps,
and dpkg-gencontrol.
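A minimal Python sketch of how a tool might pick the active profile from
these two interfaces follows.  The precedence shown (the "-P" option
wins over the environment variable) is an assumption made for the
example, not something the patch is known to guarantee, and the function
name is hypothetical.

```python
import os


def active_profile(argv, environ=None):
    """Return the profile named by -P<profile>, else DEB_BUILD_PROFILE."""
    environ = os.environ if environ is None else environ
    for arg in argv:
        # Assumed precedence: a command-line option overrides the
        # environment.
        if arg.startswith("-P") and len(arg) > 2:
            return arg[2:]
    # An unset or empty variable means that no profile is active.
    return environ.get("DEB_BUILD_PROFILE") or None


print(active_profile(["-Pstage1"], {"DEB_BUILD_PROFILE": "nodocs"}))
print(active_profile([], {"DEB_BUILD_PROFILE": "nodocs"}))
```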

Wookey, Johannes Schauer, and I agree that build profiles appear to be
the best solution, but we're not yet sure if all of the staged
dependency information can be easily expressed in the profiles syntax.
So I wrote a Perl script called dpkg-prof2purp.pl [10] to convert build
dependency fields in a debian/control file from the profiles syntax to
the purpose overrides one, should we eventually need to convert my work.
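The idea behind such a conversion can be sketched in Python as follows.
This is not dpkg-prof2purp.pl itself; it is a simplified illustration
that assumes profile lists behave like architecture lists, ignores
alternatives ("|"), and uses hypothetical function names.  For each
profile mentioned in a field, it computes the reduced dependency list
and emits one override field per distinct result.

```python
import re

PROFILE_RE = re.compile(r"<([^<>]*)>\s*$")


def parse(field):
    """Return a list of (dependency, profile list) pairs."""
    deps = []
    for item in field.split(","):
        match = PROFILE_RE.search(item)
        if match:
            bare = item[:match.start()].strip()
            deps.append((bare, match.group(1).split()))
        else:
            deps.append((item.strip(), []))
    return deps


def reduced(deps, active):
    """Dependency list that remains when `active` is the build profile."""
    kept = []
    for dep, profiles in deps:
        negated = [p[1:] for p in profiles if p.startswith("!")]
        if not profiles or (
            active not in negated if negated else active in profiles
        ):
            kept.append(dep)
    return ", ".join(kept)


def to_purpose_overrides(field, name="Build-Depends"):
    """Rewrite one profile-annotated field as purpose override fields."""
    deps = parse(field)
    names = []
    for _, profiles in deps:
        for profile in profiles:
            if profile.lstrip("!") not in names:
                names.append(profile.lstrip("!"))
    default = reduced(deps, None)
    lines = [f"{name}: {default}"]
    groups = {}  # reduced list -> profiles that produce it
    for profile in names:
        groups.setdefault(reduced(deps, profile), []).append(profile)
    for value, profiles in groups.items():
        if value != default:
            lines.append(f"{name}[{' '.join(profiles)}]: {value}")
    return lines


field = "huge (>= 1.0) [i386 arm] <!embedded !bootstrap>, tiny"
for line in to_purpose_overrides(field):
    print(line)
```

The output repeats "tiny" in both the default field and the override
field, which is the duplication of dependency lists that build profiles
avoid.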

I've been testing my build profiles patch against packages (see
dependency removal work below) and will continue to do so.  Pending
further testing and a future discussion on the debian-devel mailing
list, this patch should eventually be included in dpkg.  Unfortunately
it seems we missed the chance to get support for staged packages into
Debian wheezy.

Many other programs will have to be modified to support profile
specifications, including sbuild (see below) and APT.

[4]: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661538#51
[5]: http://www.hadrons.org/~guillem/debian/docs/embedded.proposal
[6]: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661538#126
[7]: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661538#131
[8]: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=661538#136
[9]: http://bootstrap.pehjota.net/dpkg/dpkg-build-profiles.patch
[10]: http://bootstrap.pehjota.net/dpkg/dpkg-prof2purp.pl

sbuild
------

I patched [11] sbuild to support build profiles.  Cross building
packages that have profile specifications won't work yet, however,
because sbuild uses APT to get the list of cross build dependencies and
APT doesn't yet understand profile specifications.

With this patch, the user specifies a profile to build using the
"--profile" option.  Wookey and I decided that sbuild should also obey
the environment variable that specifies the profile.  I plan to add that
behavior to the patch.

As with the dpkg one, this patch will be submitted for inclusion in
sbuild pending further testing.

[11]: http://bootstrap.pehjota.net/sbuild/sbuild-build-profiles.debdiff

Workflow Assistant Script
-------------------------

I briefly mentioned in my previous report that I had begun working on a
script written in shell command language to partially automate my cross
building workflow.  I have since realized that it can have the same
effect on my dependency cycle breaking workflow.

I spent a few more hours on the script to "finish" it.  It could be
improved, but it works well enough for my needs so far and makes my work
less monotonous.

It allows me to easily:

  * Download Debian source packages (for sid – or any distribution I
    configure – even though I have only wheezy in my APT sources),
  * Unpack downloaded source packages for editing,
  * Build edited source packages (and patches generated by debdiff),
  * Build binary packages from the modified source packages (with a
    configurable build command), and
  * Upload all my work to my server.

The script may be found in a Git repository [12][13] on my server.  It
lacks documentation for now, so anyone interested can just ask me any
questions about it they might have.

[12]: http://odin1.pehjota.net/git/deb/
[13]: git://odin1.pehjota.net/deb.git

Dependency Cycle Breaking
=========================

The core of this project is to break build dependency cycles – to remove
edges (dependencies) and vertices (packages) from the cyclic dependency
graph.

Strategies
----------

As Johannes Schauer mentioned in his report [14], we have a number of
strategies for removing edges (and hopefully vertices) from the cyclic
dependency graph.

I previously focused on cross building, which makes use of packages
already built for the build architecture to break dependency cycles and
build packages for a host architecture.  See my previous report [15] for
more information on my work to cross build the base build system.

At the advice of Wookey, I've moved away from cross building for now and
focused instead on staged building – the building of packages with
reduced build dependencies.  I do this by adding negative profile
specifications such as "<!stage1 !nodocs>" to the "Build-Depends" fields
of source packages and building the packages in the "stage1" profile
using my patched sbuild and dpkg packages.

Johannes suggested [16] that some dependency cycles could be avoided
simply by ignoring "Build-Depends-Indep" fields in packages, since
architecture-independent packages need not be built when bootstrapping a
Debian architecture.  Wookey explained [17] that Build-Depends-Indep is
still a relatively new feature in source packages and in build tools.  I
observed that recent versions of sbuild correctly handle the field; and
Wookey, Johannes, and I all agreed [18] that moving dependencies from
Build-Depends into Build-Depends-Indep is a useful strategy for breaking
dependency cycles.
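The effect of ignoring Build-Depends-Indep can be sketched in Python
with a toy example.  The package names and field values below are
hypothetical, and the sketch makes two simplifying assumptions: source
and binary packages share one name, and only the first alternative of
each dependency is considered.

```python
def parse_names(field):
    """Extract bare package names from a dependency field."""
    names = []
    for item in field.split(","):
        item = item.split("|")[0].strip()
        if item:
            names.append(item.split()[0])
    return names


def build_edges(sources, include_indep):
    """Map each source package to the set of packages it build-depends on."""
    edges = {}
    for src, fields in sources.items():
        deps = parse_names(fields.get("Build-Depends", ""))
        if include_indep:
            deps += parse_names(fields.get("Build-Depends-Indep", ""))
        edges[src] = set(deps)
    return edges


def has_cycle(edges):
    """Depth-first search for a back edge in the dependency graph."""
    state = {}

    def visit(node):
        state[node] = "active"
        for nxt in edges.get(node, ()):
            if state.get(nxt) == "active":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in edges)


# Hypothetical packages: libfoo needs docgen only for its
# architecture-independent documentation package.
sources = {
    "libfoo": {
        "Build-Depends": "debhelper (>= 9)",
        "Build-Depends-Indep": "docgen",
    },
    "docgen": {"Build-Depends": "libfoo"},
}
print(has_cycle(build_edges(sources, include_indep=True)))
print(has_cycle(build_edges(sources, include_indep=False)))
```

In this toy graph, the cycle between libfoo and docgen exists only while
Build-Depends-Indep is taken into account.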

[14]:
http://lists.alioth.debian.org/pipermail/soc-coordination/2012-July/001322.html
[15]:
http://lists.alioth.debian.org/pipermail/soc-coordination/2012-July/001301.html
[16]:
http://lists.mister-muffin.de/pipermail/debian-bootstrap/2012-July/000284.html
[17]:
http://lists.mister-muffin.de/pipermail/debian-bootstrap/2012-July/000286.html
[18]:
http://lists.mister-muffin.de/pipermail/debian-bootstrap/2012-July/000294.html

Finding Dependencies to Remove
------------------------------

I started out by attempting to remove dependencies that occur in the
greatest number of cycles.  As Johannes noted on IRC, this is
inefficient because there are billions of dependency cycles in Debian
sid.

Going forward, I will look at:

  * Vertices (packages) in the cyclic dependency graph with the smallest
    degrees, since removing the edges (dependencies) connected to them
    can easily remove entire vertices from the graph, and
  * Small cycles, since these contain dependencies that must be removed
    and breaking small cycles can simultaneously break some larger
    cycles (while the reverse is not true).
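Both criteria can be sketched in Python on a small directed graph.  The
graph and the function names below are hypothetical; the real dependency
graph is far larger and would need more efficient tooling.

```python
def degrees(edges):
    """Total degree (in + out) of every vertex in a directed graph."""
    deg = {}
    for src, targets in edges.items():
        deg.setdefault(src, 0)
        for dst in targets:
            deg[src] += 1
            deg[dst] = deg.get(dst, 0) + 1
    return deg


def small_cycles(edges, max_len):
    """List simple cycles of at most max_len edges, each reported once.

    Every cycle is found from its smallest vertex, so rotations of the
    same cycle are not repeated.
    """
    found = []

    def walk(start, node, path):
        for nxt in sorted(edges.get(node, ())):
            if nxt == start:
                found.append(path)
            elif nxt > start and nxt not in path and len(path) < max_len:
                walk(start, nxt, path + [nxt])

    for start in sorted(edges):
        walk(start, start, [start])
    return found


# Hypothetical cyclic graph: a <-> b, and b -> c -> d -> b.
edges = {"a": {"b"}, "b": {"a", "c"}, "c": {"d"}, "d": {"b"}}
deg = degrees(edges)
print(sorted(deg, key=lambda v: (deg[v], v)))  # lowest degree first
print(small_cycles(edges, 2))
print(small_cycles(edges, 3))
```

Here "a" has degree 2, and removing the edge from "a" to "b" both breaks
the two-vertex cycle and leaves "a" with no outgoing edges, so it drops
out of the cyclic graph.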

Removing Dependencies
---------------------

I've so far removed 27 build dependencies from 10 source packages [19]:
atk1.0, atlas, avahi, dbus, gnutls26, krb5, libgcrypt11, libtasn1-3,
poppler, and qt4-x11.

I'll summarize my work rather than explain the changes I made to each
package.

In atlas, dbus, gnutls26, and libgcrypt11, I moved build dependencies
into Build-Depends-Indep.  (I have not yet been able to fully test my
changes to atlas, as the software takes a long time to build and the
previous build was interrupted.)

I added a "stage1" build profile to avahi, krb5, and qt4-x11.  I added
"stage1" and "nodocs" profiles to atk1.0, libtasn1-3, and poppler.  I
was even able to remove an unused dependency on libgtk2.0-dev from
poppler.

Patches [19] and build logs [20] for all source packages are available
on my server.  (So far there's no patch for qt4-x11, because debdiff was
unable to unpack the large original tar archive in my small /tmp
filesystem.)  These files are managed by the workflow assistant script I
mentioned above.

I'm gradually submitting bug reports [21] with patches to move build
dependencies to Build-Depends-Indep and remove any unused dependencies.

[19]: http://bootstrap.pehjota.net/staged/pkgs/
[20]: http://bootstrap.pehjota.net/staged/builds/
[21]:
http://bugs.debian.org/cgi-bin/pkgreport.cgi?users=bootstrap@debian.org&tag=bootstrap

"Sources" File Patch and Weak Build Dependencies List
-----------------------------------------------------

At Johannes's request, I began maintaining [22]:

  * A patch [23] (automatically generated [24] twice a day) against the
    Debian sid "Sources" file to add my changes to Build-Depends and
    Build-Depends-Indep fields and
  * A list [25] of "weak" build dependencies (binary packages used by
    source packages for non-critical things like documentation
    generation), which I will eventually push into Johannes's bootstrap
    tools Git repository [26].

[22]:
http://lists.mister-muffin.de/pipermail/debian-bootstrap/2012-July/000309.html
[23]: http://bootstrap.pehjota.net/staged/sources/Sources.patch
[24]: http://bootstrap.pehjota.net/staged/sources/patchsources.sh
[25]: http://bootstrap.pehjota.net/staged/notes/weak-build-deps.txt
[26]: https://gitorious.org/debian-bootstrap/bootstrap

Schedule
========

In my original project schedule [27], I overestimated the amount of work
I needed to do myself.

My progress was rather slow in the first half of the program in part
because I underestimated the scope and complexity of some tasks (such as
patching package build tools).

I did the cross and staged building tasks in the reverse of the order I
had originally planned.  Additionally, I had planned to do more work
than can reasonably be accomplished in one summer; I've had to put aside
my cross building work to focus on staged building (in some ways the
most important part of this project).

I plan to (more casually) continue after the summer to help make more
Debian packages support cross building.

[27]:
http://wiki.debian.org/SummerOfCode2012/StudentApplications/PJMcDermott#Project_Schedule


-- 
P. J. McDermott                                        (_/@\_)    ,--.
http://www.pehjota.net/                           o    < o o >   / oo \
http://www.pehjota.net/contact.html                 o   \ `-/    | <> |.
                                                o o o    "~v    /_\--/_/


