[Reproducible-builds] Bug#763822: ftp.debian.org: please include .buildinfo file in the archive

Daniel Kahn Gillmor dkg at fifthhorseman.net
Fri Dec 4 00:59:15 UTC 2015


Hi there!

In https://bugs.debian.org/763822, lunar asked ftp.debian.org to accept
.buildinfo files when they are uploaded with a .changes file.

This is a followup to make the request concrete by specifying how we
hope the archive will sanity-check the included .buildinfo files, and
with a suggestion of how they could be distributed across the mirrors in
a way that will be reasonably convenient for users and downstreams
without making mirror operators crazy.

I'm writing this after discussions with Jelmer, Niels, Lunar, and others
involved in the Reproducible Builds project.

Constraints guiding the suggestions below
-----------------------------------------

We want an archive user to be able to find and fetch all .buildinfo
files describing builds that produced a given binary package.

We want the eventual possibility of multiple .buildinfo files per
<srcpkg,version,arch>.

We understand that mirror operators don't like small files because
rsync gets fussy with them.

We want both buildds and debian developers to be able to upload
.buildinfo files.


Asks of ftp-master
------------------

We hope that the archive will verify .buildinfo files uploaded by
buildds and DDs or DMs.  We don't expect to require buildds or DDs or
DMs to upload .buildinfo files at this time, though we hope they'll
start to do so once the archive can accept them.

Here's how we think the archive might sanity-check them:

   * There may be 0 or more .buildinfo files included in a .changes
     file.  Each .buildinfo file describes an environment that was used
     to produce some of the binary artifacts (e.g. .deb, .udeb, etc) in
     this upload.

   * To validate each .buildinfo file:

     * ensure that the filename is of the form
       <srcpkgname>_<version>_<arbstring>.buildinfo where:
         * <srcpkgname> matches the source name in the Source: field
         * <version> equals the Version: field
         * <arbstring> is /[-a-z0-9]+/

     * ensure that this filename is not already in the archive.

     * the file should be clearsigned OpenPGP in UTF-8, with nothing
       outside the OpenPGP framing.  It should have a valid signature
       from the same OpenPGP key that signed the .changes file.

     * The signed part of the file must be a valid control file.

     * ensure that the Source: and Version: fields in the .buildinfo
       match the Source: and Version: fields in the .changes file.

     * the .buildinfo must include the same .dsc as the .changes file,
       with the same checksum.

     * in addition, the .buildinfo file should list at least one binary
       artifact.

     * for every binary artifact listed in the .buildinfo file:

        * ensure that it is listed in the .changes file with the same
          checksum(s).  (fwiw, if anyone is concerned about minimizing
          the size of the .buildinfo file, there is no need to include
          the md5 or sha1 checksums of the artifacts. The
          Checksums-Sha256: sub-stanza is sufficient for our purposes)

    If an included .buildinfo file doesn't validate, please reject the
    entire upload.
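
As a rough illustration of the filename and checksum checks above, here
is a minimal Python sketch.  It is not archive code: the function names
and the dict-of-SHA256 inputs are hypothetical, and it assumes that
OpenPGP verification and control-file parsing have already happened
elsewhere.

```python
import re

# <arbstring> must match /[-a-z0-9]+/ in its entirety.
ARBSTRING_RE = re.compile(r"[-a-z0-9]+")
SUFFIX = ".buildinfo"


def buildinfo_name_ok(filename, source, version):
    """Check for <srcpkgname>_<version>_<arbstring>.buildinfo.

    source and version are assumed to come from the Source: and
    Version: fields of the .changes file.
    """
    prefix = f"{source}_{version}_"
    if not (filename.startswith(prefix) and filename.endswith(SUFFIX)):
        return False
    arbstring = filename[len(prefix):len(filename) - len(SUFFIX)]
    return ARBSTRING_RE.fullmatch(arbstring) is not None


def binaries_ok(buildinfo_sha256, changes_sha256):
    """Check the binary artifacts named in a .buildinfo.

    Both arguments are hypothetical dicts mapping an artifact filename
    to its SHA256 hex digest, as parsed from Checksums-Sha256.
    """
    if not buildinfo_sha256:
        return False  # at least one binary artifact must be listed
    # Every listed artifact must appear in .changes with the same digest.
    return all(
        changes_sha256.get(name) == digest
        for name, digest in buildinfo_sha256.items()
    )
```

An upload would be rejected as soon as either function returns False
for any included .buildinfo file.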


Once an upload that includes some .buildinfo files is accepted, we want
users to be able to find the .buildinfo(s) for a binary package if they
want to try to reproduce it.

Here's a concrete suggestion for how to do that in a way that might not
make mirror operators sad (if you have a different or preferred
suggestion, that'd be great too):

* collect all .buildinfo files in the archive that describe binary
  artifacts for a given architecture in a tarball named Buildinfos.tgz
  which is distributed alongside Packages (for example,
  binary-amd64/Buildinfos.tgz and binary-all/Buildinfos.tgz).

* the structure of the tarball could be
  <srcpkgname>/<version>/<srcpkgname>_<version>_<arbstring>.buildinfo
  (with the same expansions as above)

* the gzip layer of the tarball should be --rsyncable

* When re-creating the Buildinfos.tgz file after some binary artifacts
  have been removed from a suite, any .buildinfo file that only
  references artifacts no longer in the suite can be removed.
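
To make the proposed layout and pruning rule concrete, here is a small
Python sketch.  The function names and the in-memory representation (a
dict from tarball member path to the file's bytes plus the set of
artifact names it references) are assumptions made for illustration
only.  Compression is left out; the --rsyncable gzip layer mentioned
above would be applied to the resulting tar stream afterwards.

```python
import io
import tarfile


def member_path(source, version, arbstring):
    """Path of one .buildinfo inside the tarball."""
    return f"{source}/{version}/{source}_{version}_{arbstring}.buildinfo"


def prune(buildinfos, artifacts_in_suite):
    """Drop entries that only reference artifacts no longer in the suite.

    buildinfos maps member path -> (file bytes, set of artifact names);
    artifacts_in_suite is a set of artifact names.
    """
    return {
        path: (data, artifacts)
        for path, (data, artifacts) in buildinfos.items()
        if artifacts & artifacts_in_suite
    }


def write_tar(buildinfos):
    """Return an uncompressed tar stream holding the given entries."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Sorted order keeps the member sequence stable between runs.
        for path in sorted(buildinfos):
            data, _artifacts = buildinfos[path]
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Under these assumptions, regenerating the tarball for a suite amounts
to write_tar(prune(buildinfos, artifacts_in_suite)).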

Does this seem like a reasonable way to distribute this information?


Clarifications from original bug report
---------------------------------------

In the time since this bug was originally posted, we have a clearer
understanding of what a .buildinfo file represents, and how it can be
used.  For clarity and future documentation, i'll amend/nitpick a few
comments from the original text of the bug report below:

> .buildinfo files would capture from the build environment as much
> information as needed to reproduce the build.

The .buildinfo file *may* contain more information than is needed to
reproduce the build.  The goal is to have it provide enough of a record
of the build environment to be able to reproduce the build, but it's
also possible to include additional, unneeded information, and that's
OK.  (e.g. we are likely to include the exact version number of some
essential packages, even though going from coreutils 8.17-1 to 8.17-2 is
unlikely to affect the build).

>  * A .buildinfo file is generated for each build, and is
>    considered unique for a source package, version, and architecture.
>    A rebuild should always generate the same .buildinfo as
>    the original build.

We no longer think it will need to be unique for this tuple, since it's
possible that multiple different build environments could produce the
same binary artifacts.  A rebuild might therefore produce a different
.buildinfo file, depending on the state of the archive, even if the
produced binary artifacts are identical.

>  * They would be accompanied by detached GnuPG signatures, so multiple
>    parties (e.g. DD and buildd) could assert the production of similar
>    binary packages from the same source and same environment.

We now think that any signatures should be specific to a single
.buildinfo file.

>  * The latter information can then be shown in the Packages index
>    for each binary packages.

We don't think we need to include any reference to the .buildinfo files
in the Packages index.

> During our experiments, adding .buildinfo files to .changes had one
> unforeseen consequence. Packages that used to be “Architecture: all” are
> now “Architecture: all amd64” as .buildinfo are tied to a given build
> architecture. Except that it breaks lintian test suite, it is unclear if
> that's a problem at all, or if some changes should be made. Again, your
> input would be most welcome.

We don't think this is relevant any longer, since the buildinfo file
isn't necessarily tied to the architecture of the build host.  (it may
record the build architecture, but the .buildinfo only really describes
the binary artifacts).


Regards,

        --dkg