Buildinfo in the Debian archive, updates

Ximin Luo infinity0 at debian.org
Mon Nov 14 14:57:00 UTC 2016


This email is a summary of some discussions that happened after the last post
to bug #763822, plus some more of my own thoughts and reasoning on the topic.

I think having the Debian FTP archive distribute unsigned buildinfo files is an
OK intermediate solution, with a few tweaks:

1. the hashes of the *signed* buildinfo files must be referenced for each
   binary package, in Packages.gz

2. the unsigned buildinfo files should be grouped into per-architecture
   "Buildinfos-$arch.xz" files rather than one file for the whole archive.

The first point is necessary for security, whereas the second point is an
optional performance enhancement. I'll try to explain both below, plus some
extra optional things we could also do.

----

There is a temptation to think that 3rd-party buildinfo services (e.g.
buildinfo.debian.net) mean that the Debian FTP archive doesn't have to
distribute buildinfo files at all, nor worry about them. However, this is not
the case; the *least* that the archive must do, for security, is (1) above.

So let's get this out of the way first. Why signed - isn't the hash of the
unsigned buildinfo file enough?

No. Buildinfo files help someone to reproduce a binary, but it is important to
consider the negative case as well: what happens if multiple people download a
buildinfo file for a Debian package, and *cannot* perform the reproduction that
the buildinfo file describes?

In this case, we want to be able to trace who originally performed the build.
This does not necessarily mean that that person is guilty of anything (maybe
their computer was compromised), but it gives us a vital starting point for
investigations.

Storing the hash of the signed buildinfo file allows us to retrieve that file
from a 3rd-party service, re-verify the signature (it should already have been
verified before the upload was accepted into the archive), and continue an
investigation on this basis.

To be clear: it is not enough to retrieve a signed buildinfo file whose
*unsigned* contents match a hash of the unsigned original file. Somebody else
*could* in theory have generated the same buildinfo file and signed it
themselves, but this has nothing to do with the identical binaries that were
uploaded to the Debian FTP archive, and they should not take the blame for
that upload.
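
As a rough illustration of the point above, here is a minimal sketch in Python.
Everything in it is an assumption made for the example: toy_clearsign() is a
stand-in built on HMAC, not real OpenPGP signing, and the buildinfo contents
and key names are invented. It only shows that two signed files with identical
unsigned contents hash differently, so a hash of the *signed* file pins down
one particular signer's upload.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def toy_clearsign(body: bytes, signer_key: bytes) -> bytes:
    """Stand-in for OpenPGP clearsigning (assumption: NOT a real signature).

    It wraps the body with a signer-dependent block, which is all the
    example needs.
    """
    sig = hmac.new(signer_key, body, hashlib.sha256).hexdigest().encode()
    return (
        b"-----BEGIN TOY SIGNED MESSAGE-----\n"
        + body
        + b"-----BEGIN TOY SIGNATURE-----\n"
        + sig
        + b"\n-----END TOY SIGNATURE-----\n"
    )


# Invented buildinfo contents, identical for both signers.
unsigned = b"Source: hello\nVersion: 2.10-1\nArchitecture: amd64\n"

signed_by_uploader = toy_clearsign(unsigned, b"uploader-key")
signed_by_other = toy_clearsign(unsigned, b"someone-else-key")

# What the archive would record under proposal (1): the hash of the
# signed file that actually accompanied the upload.
archive_recorded_hash = sha256_hex(signed_by_uploader)


def matches_archive(candidate_signed_file: bytes) -> bool:
    return sha256_hex(candidate_signed_file) == archive_recorded_hash


print(matches_archive(signed_by_uploader))  # the original upload
print(matches_archive(signed_by_other))     # same contents, other signer
```

If the archive recorded only the hash of the unsigned contents, both files
would match it and the two signers could not be told apart.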

This minimal solution still depends on 3rd-party services being available to
host the files. A much better solution is for the FTP archive to *also* store
the signed buildinfo files, somewhere safe that can be recovered when needed.
These don't have to be distributed on the mirror network (since people can get
them from the other 3rd-party services), but they should be on infrastructure
that is "run by" the same people that control the archive keys, for their own
auditing benefit.

----

Now then, why does the FTP archive need to distribute buildinfo files at all?
It can simply store the signed files and distribute the hashes. Then rebuilders
(people that want to verify our reproducibility claims) can download the hashes
from the archive, get the corresponding buildinfo files from another server,
and perform the build. The files could even be unsigned; this does not matter
for rebuilding purposes.
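
The rebuilder workflow described above could look roughly like the following
Python sketch. The field name "Buildinfo-Sha256" is hypothetical (no such field
is defined here), the stanza and buildinfo contents are invented, and the
3rd-party service is modelled as an in-memory dict keyed by hash. The recorded
hash is assumed to cover exactly the bytes the service hands back.

```python
import hashlib


def parse_stanza(text: str) -> dict:
    """Parse a single 'Key: value' stanza (no continuation lines)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def fetch_buildinfo(store: dict, expected_sha256: str) -> bytes:
    """Fetch from an untrusted store, then check against the archive's hash."""
    data = store.get(expected_sha256)
    if data is None:
        raise LookupError("buildinfo not available from this service")
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError("buildinfo does not match the hash from the archive")
    return data


# Invented buildinfo contents and the matching archive entry.
buildinfo = b"Source: hello\nVersion: 2.10-1\nArchitecture: amd64\n"
digest = hashlib.sha256(buildinfo).hexdigest()
stanza = (
    "Package: hello\n"
    "Version: 2.10-1\n"
    "Architecture: amd64\n"
    f"Buildinfo-Sha256: {digest}\n"  # hypothetical field name
)

honest_store = {digest: buildinfo}
tampered_store = {digest: buildinfo + b"Extra: injected\n"}

fields = parse_stanza(stanza)
result = fetch_buildinfo(honest_store, fields["Buildinfo-Sha256"])
print(result == buildinfo)
```

Because the hash comes from the archive, the host serving the buildinfo file
does not need to be trusted: a modified file is rejected before any build is
attempted.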

This workflow is slightly awkward, however; it would be simpler and more
reliable to only have to contact one host. Furthermore, most rebuilders would
probably only try to build for one architecture, so it is again a nicer
workflow to download only the buildinfos for your own architecture.

We also ran some numbers: a Buildinfos-amd64.xz (with unsigned buildinfo
files) turned out to be about 9MB, which I think is reasonable to expect people
to download periodically. A Buildinfos.xz across all arches would probably be
more like 50MB or more (we don't have the machines to properly calculate this),
which is less convenient both for rebuilders and for the archive mirror
network.

With signatures, the size is much greater and not really suitable for
continual distribution, which is why these files have to be unsigned.
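
One likely reason for the size difference is that signature data is
effectively random and so compresses badly, while the buildinfo text itself is
repetitive. The Python sketch below illustrates this with invented data only:
the record contents, the 200-record count and the 512 bytes of signature
material per record are all assumptions, and the resulting sizes say nothing
about the real archive.

```python
import base64
import lzma
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def fake_buildinfo(i: int) -> bytes:
    """Invented, repetitive buildinfo-like text."""
    return (
        f"Source: pkg{i}\n"
        "Architecture: amd64\n"
        f"Version: 1.0-{i}\n"
        "Installed-Build-Depends:\n"
        " base-files (= 9.6),\n"
        " gcc-6 (= 6.2.0-13)\n"
    ).encode()


def fake_signature() -> bytes:
    """Random bytes in ASCII armor; a stand-in for a real signature."""
    raw = bytes(rng.getrandbits(8) for _ in range(512))
    return (
        b"-----BEGIN PGP SIGNATURE-----\n\n"
        + base64.encodebytes(raw)
        + b"-----END PGP SIGNATURE-----\n"
    )


unsigned = b"\n".join(fake_buildinfo(i) for i in range(200))
signed = b"\n".join(fake_buildinfo(i) + fake_signature() for i in range(200))

unsigned_xz = len(lzma.compress(unsigned))
signed_xz = len(lzma.compress(signed))

print("unsigned, xz-compressed:", unsigned_xz, "bytes")
print("signed, xz-compressed:  ", signed_xz, "bytes")
```

In this toy setup nearly all of the compressed size of the signed file comes
from the signature blocks, since xz cannot shrink random data.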

As I mentioned, these are optimisations for rebuilders and not strictly
necessary for security.

----

If the mirror network is able to cope, I still think it would be good to
distribute a "$srcpkg.buildinfos.xz" with the signed buildinfos for each source
package. This places more strain on the mirror network, but it does not seem
too unreasonable to me. Confirming this would need more time and thought,
however.

The benefit would be added convenience for users and verifiers: they would not
have to contact a 3rd-party service to get the signatures in order to verify
them.

----

If there are no objections to what I said above, I can forward this to bug
report #763822 as well.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git


