[Reproducible-builds] Preliminary review of dpkg-genbuildinfo

Daniel Kahn Gillmor dkg at fifthhorseman.net
Thu Feb 12 16:45:25 UTC 2015


On Fri 2015-02-06 01:13:18 -0500, Guillem Jover wrote:
> Take the example I gave previously of a binary package detached from
> an archive, just a .deb package laying around, either from an old
> download or passed to you by someone. You have to *know* the origin of
> the binary, otherwise you need to first start hunting down where this
> binary was built, say Debian, one of its derivatives or even somewhere
> else. And sure, once that's known, the user *might* possibly be able to
> reproduce the build, but I don't see many (if not most) users being able
> or willing to set up a reproducible build environment just to verify
> where a binary was coming from (say my relatives). If you cannot or
> wont do that, you need to trust the distribution, the remote server,
> the network, the remote binary including any possible reproducible
> information being correct. At that point you or a program might as well
> have just verified an embedded signature.

I see per-package signatures and reproducible builds as complementary to
one another.  I'd be happy to brainstorm ways that we can make sure we
can get the advantages of both.

I understand the goals of per-package signatures, and that it's more
convenient for most folks to have a single file than to have to schlep
around multiple files to be able to verify the package.

I also understand the ecological benefits of reproducibility, especially
for a project like debian that has so many heterogeneous downstreams that
pull (directly or indirectly) from it.

For one of the big reproducibility benefits -- detecting compromised
build environments -- to accrue to everyone, we only need a handful of
independent reproducers who can't all be compromised together, and a
straightforward way that people can compare the results and raise an
alarm if something looks fishy.
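
To make that comparison step concrete, here is a rough Python sketch of
what "compare the results and raise an alarm" could look like.  It is an
illustration only: the function name, the status strings, and the quorum
of 3 are all assumptions of mine, not an existing tool or an agreed
policy.

```python
from collections import defaultdict


def compare_builds(reports, quorum=3):
    """Compare digests reported by independent reproducers.

    reports maps a (hypothetical) builder name to the hex digest that
    builder obtained for the same package.  quorum is an assumed
    threshold for how many agreeing builders we want to see.
    """
    # Group builders by the digest they reported.
    by_digest = defaultdict(list)
    for builder, digest in reports.items():
        by_digest[digest].append(builder)

    # Any disagreement at all is worth an alarm: either a build is not
    # reproducible or somebody's build environment is compromised.
    if len(by_digest) > 1:
        return "ALARM", dict(by_digest)

    # Everyone agrees, but too few reproducers have reported so far.
    if len(reports) < quorum:
        return "UNCORROBORATED", dict(by_digest)

    return "CORROBORATED", dict(by_digest)
```

The returned grouping shows which builders disagree, which is the part
someone would need in order to start investigating.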

The benefits of reproducibility come from *corroboration*.  otoh, the
benefits of embedded package signatures are usually seen as coming from
a *single authority* ("the origin of the binary").  For the broader
ecosystem of debian + its
derivatives, where some binary packages might be shipped by multiple
vendors, it is actually useful to users of a derivative that doesn't
rebuild everything to be able to say with confidence "this package is
exactly the same as the packages debian ships".  At the same time, as a
user of the derivative, you'd also like to be able to know that any
given package is "endorsed" or included by your distro, so that
arbitrary packages from debian proper can't just be slipped past your
package manager without you knowing about it.

So there's this interplay (tension?) between authority of origin and
corroboration that i suspect we'd all like to support.

I've been partial to the separate .buildinfo file specifically because i
think i can see the path where we can get both corroboration and
specific authority from it (a single .buildinfo file can be signed by
multiple authorities), while the identical .deb in two distros can be
bitwise-compared using simple tools.

That said, this clearly doesn't address the convenience/portability
win of embedded package signatures raised by Guillem.

As a step toward brainstorming, here's a way that we could (maybe?) get
both:

 * we could make a pkg-fingerprint tool that takes a package and
   produces a cryptographically-strong fingerprint of the contents of
   the package *minus* the signature elements.

   As long as the package format has no internal signature, this would
   just be something like sha512sum over the entire package.  When we
   have a package with an embedded-signature format, we'd need to define
   a way to strip any/all signatures from the package in a reproducible
   way that does not touch the rest of the contents of the package.
   then pkg-fingerprint becomes something like:

     pkg-strip-sigs < foo.deb | sha512sum

   and the comparison between packages moves from /usr/bin/cmp to
   comparing the outputs of pkg-fingerprint (or we could make a pkg-cmp
   tool).

   the buildinfo files would then store the pkg-fingerprint output over
   the .deb or .rpm binaries that they produced.
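
As a rough illustration of the idea above, here is a Python sketch of
pkg-strip-sigs and pkg-fingerprint for .deb files.  It relies on a .deb
being an ar archive, and it *assumes* that embedded signatures would live
in extra ar members whose names start with "_gpg".  That prefix is a
placeholder for whatever an embedded-signature format ends up defining.
The function names are hypothetical too; none of this exists as a real
tool.

```python
import hashlib

AR_MAGIC = b"!<arch>\n"   # global header of every ar archive
HEADER_LEN = 60           # fixed size of each ar member header


def iter_ar_members(data):
    """Yield (name, header_bytes, body_bytes) for each ar member."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos < len(data):
        header = data[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN or header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # Name is 16 bytes, space padded; GNU ar also appends a "/".
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        start = pos + HEADER_LEN
        body = data[start:start + size]
        if len(body) != size:
            raise ValueError("truncated ar member")
        yield name, header, body
        # Member data is padded to an even length.
        pos = start + size + (size % 2)


def strip_sigs(data, sig_prefix="_gpg"):
    """Return the archive with signature members removed.

    sig_prefix is an assumption, not part of any agreed format.
    All other members are copied through byte for byte.
    """
    out = [AR_MAGIC]
    for name, header, body in iter_ar_members(data):
        if name.startswith(sig_prefix):
            continue
        out.append(header)
        out.append(body)
        if len(body) % 2:
            out.append(b"\n")
    return b"".join(out)


def pkg_fingerprint(data):
    """SHA-512 over the package minus its signature members."""
    return hashlib.sha512(strip_sigs(data)).hexdigest()
```

For a well-formed package with no signature members, strip_sigs returns
the input unchanged, so the fingerprint matches a plain sha512sum of the
file.  That is the property the first half of the bullet asks for.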

This is a little bit inelegant, but maybe it's a way that we can have
our cake and eat it too?  I'd love to hear other suggestions.

    --dkg


