[Reproducible-builds] introducing buildinfo2snapshot

Johannes Schauer j.schauer at email.de
Wed Dec 31 15:42:04 UTC 2014


today I skimmed the ReproducibleBuilds wiki page and read that "A build tool
that would reproduce a build environment using packages from
snapshot.debian.org is still missing.".

As I understand it, this task mainly boils down to finding a snapshot that all
package versions in the .buildinfo file have been retrieved from. The following
Expat-licensed Python script tries to figure this out:


This is just a proof of concept. If what I'm doing there is found to be sane,
I'll clean this script up and maybe it could be integrated in a package like

The script essentially calls the same snapshot.d.o API functions as are used by
debsnap(1) but instead of downloading the binary packages, it notes the
timestamp on which the version was first seen. After having done this for all
binary packages in the .buildinfo file, it picks the most recent of these
timestamps and downloads the Packages.gz of that timestamp from snapshot.d.o
and verifies that this Packages file indeed contains all requested versions.
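
The selection and verification steps described above can be sketched roughly
as follows. This is only an illustration and not the script itself: the
function names are made up for this example, the timestamps are assumed to be
in the form used in snapshot.d.o URLs (e.g. 20141231T154204Z), and
architectures are ignored for brevity.

```python
from datetime import datetime

# Timestamp format as it appears in snapshot.d.o archive URLs (assumption).
TS_FORMAT = "%Y%m%dT%H%M%SZ"


def pick_snapshot(first_seen):
    """Return the most recent of the first-seen timestamps.

    first_seen maps (package, version) -> timestamp string. Only from the
    latest first-seen timestamp onwards can all versions exist together.
    """
    return max(first_seen.values(),
               key=lambda ts: datetime.strptime(ts, TS_FORMAT))


def missing_versions(wanted, packages_text):
    """Return the wanted (package, version) pairs absent from a Packages file.

    packages_text is the decompressed content of a Packages file.
    """
    available = set()
    for stanza in packages_text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines of multi-line fields.
            if line.startswith((" ", "\t")) or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Package" in fields and "Version" in fields:
            available.add((fields["Package"], fields["Version"]))
    return sorted(set(wanted) - available)
```

An empty list from missing_versions() means that the chosen snapshot contains
every requested version.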


 1. the only suite the script will search in right now is "sid". There seems to
    be no facility to find out which suite a binary package comes from using
    the snapshot.d.o API. I reported this as bug #774279. In the future it
    should be possible to also find snapshots in "testing" or "stable" but
    since buildds build using sid, this seemed to be a good first target.
 2. the only archive area supported is "main"
 3. if not all binary packages are found in a single snapshot, then an error is
    thrown

All three problems above lead to the problem of: what to do if the package
versions specified in the .buildinfo file cannot be found in a single snapshot?
This is a problem because debootstrap (which is used to set up the initial
chroot by sbuild and pbuilder) only understands a single mirror URL. The
solution would be either to fix debootstrap or to use multistrap, which uses
apt and hence can handle any number of mirror URLs. The latter solution is
probably the cleaner and more flexible one. Multistrap also allows specifying
the exact package version to install (as it just hands this over to apt).
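
To illustrate what would be handed to apt in the multi-snapshot case, here is
a small sketch that turns a list of snapshot timestamps into apt source lines
and a package list into versioned "name=version" arguments. The helper names
are hypothetical, and the suite and archive area default to "sid" and "main"
to match the current limitations of the script. How these values are then
written into a multistrap configuration is not shown.

```python
# Base URL of the Debian archive on snapshot.debian.org.
SNAPSHOT_BASE = "http://snapshot.debian.org/archive/debian"


def sources_lines(timestamps, suite="sid", area="main"):
    """Build one apt 'deb' line per snapshot timestamp."""
    return ["deb {}/{}/ {} {}".format(SNAPSHOT_BASE, ts, suite, area)
            for ts in timestamps]


def versioned_args(packages):
    """Turn (name, version) pairs into apt's name=version syntax."""
    return ["{}={}".format(name, version) for name, version in packages]
```

For example, sources_lines(["20141231T154204Z"]) yields a single line
pointing at http://snapshot.debian.org/archive/debian/20141231T154204Z/.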

So solving the task "A build tool that would reproduce a build environment
using packages from snapshot.debian.org" would require:

 1. Extending buildinfo2snapshot such that instead of throwing an error in case
    not all package versions can be found in a single snapshot, it would just
    return a list of snapshot timestamps which together contain all versions
 2. Adapting the tools which are used to create sbuild and pbuilder chroots to
    use multistrap and allow:
       2.1. Multiple mirror URIs
       2.2. A list of versioned binary packages

Does this sound like a plan?
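
For step 1 of the plan, one possible approach is a greedy selection: keep
picking the snapshot that covers the most still-missing versions. The sketch
below assumes that the content of each candidate snapshot is already known as
a set of (package, version) pairs. The function name and the input structure
are made up for illustration, and a greedy choice is not guaranteed to give
the smallest possible list.

```python
def cover_with_snapshots(wanted, contents):
    """Greedily choose snapshot timestamps that together contain all versions.

    wanted:   iterable of (package, version) pairs from the .buildinfo file
    contents: dict mapping timestamp -> set of (package, version) pairs
    """
    remaining = set(wanted)
    chosen = []
    while remaining:
        # Sorting makes the result deterministic when several snapshots tie.
        best = max(sorted(contents),
                   key=lambda ts: len(contents[ts] & remaining),
                   default=None)
        gained = contents[best] & remaining if best is not None else set()
        if not gained:
            raise LookupError(
                "no snapshot contains: {}".format(sorted(remaining)))
        chosen.append(best)
        remaining -= gained
    return chosen
```

If a single snapshot contains everything, the returned list has exactly one
entry, which corresponds to the case the script already handles.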

An alternative mode of operation of buildinfo2snapshot would be to bisect the
snapshot archive until a snapshot is found that contains the requested binary
package versions. This would have the advantage that the additional information
about the versions at more timestamps would allow calculating a smaller final
selection of mirror URLs than the first_seen attribute returned by the API
together with missing suite information (see bug #774279) allows. The
disadvantage would be that downloading multiple Packages files is probably less
nice than using the API, bandwidth-wise.
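
As a rough idea of what the bisection could look like for a single package
version: starting from the snapshot where the version was first seen, a binary
search finds the last snapshot that still contains it. The sketch assumes that
a version stays in the suite for one contiguous range of snapshots, and the
contains() callable (which would download and inspect a Packages file) is
hypothetical.

```python
def last_index_containing(timestamps, first_idx, contains):
    """Return the index of the last snapshot that still has the version.

    timestamps: snapshot timestamps in chronological order
    first_idx:  index of the snapshot where the version was first seen
    contains:   callable(timestamp) -> bool, True while the version is present
    """
    lo, hi = first_idx, len(timestamps) - 1
    while lo < hi:
        # Round up so that the search always makes progress.
        mid = (lo + hi + 1) // 2
        if contains(timestamps[mid]):
            lo = mid
        else:
            hi = mid - 1
    return lo
```

Knowing both ends of the range for every package would make it possible to
intersect the ranges, which is the extra information the API alone does not
provide.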

cheers, josch
