[Reproducible-builds] .buildinfo should contain source hashes (as well as binary hashes)

Ximin Luo infinity0 at debian.org
Mon Sep 21 10:29:59 UTC 2015


On 20/09/15 20:43, Jérémy Bobbio wrote:
> Ximin Luo:
>> With our current .buildinfo setup, the above process is more
>> complicated, because we *only* store hashes of the binary build
>> environment.
> 
> [..]
> 
> The idea to put a hash of the binary package in the
> Build-Environment is a late addition to the original idea. 
> 

Sure, I realised after I posted that the binary hashes hadn't been implemented yet. That's a side issue though.

> In any case, we currently don't have code to store any hash of the
> Build-Environment. If we wanted to store hashes of binary packages, then
> we would need to have them in /var/lib/dpkg/status and it's not done
> yet, even if Guillem said this would be a good thing to have.
> 

`apt-cache show [pkg]` will list hashes of binaries. Is there some reason we can't just do this?
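
A minimal sketch of that idea in Python, assuming the record printed by `apt-cache show` carries a `SHA256:` field for the .deb (which is the case for records that come from an archive's Packages index). The helper names, the sample record and its all-zero hash are made up for illustration; this is not existing dpkg or .buildinfo tooling.

```python
import subprocess


def parse_stanza_fields(text, wanted=("Package", "Version", "SHA256")):
    """Collect selected single-line fields from the first stanza of the text."""
    # Stanzas are separated by a blank line; only the first one is examined.
    stanza = text.strip().split("\n\n", 1)[0]
    fields = {}
    for line in stanza.splitlines():
        if line[:1] in (" ", "\t"):
            continue  # continuation line of a multi-line field
        name, sep, value = line.partition(":")
        if sep and name in wanted:
            fields[name] = value.strip()
    return fields


def binary_hash(pkg):
    """Hypothetical helper: run `apt-cache show PKG` and parse its first record."""
    out = subprocess.run(
        ["apt-cache", "show", pkg],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_stanza_fields(out)


# Made-up record in the same layout, so the parser can be tried without apt.
FAKE_HASH = "0" * 64
SAMPLE = f"""\
Package: hello
Version: 2.10-1
Filename: pool/main/h/hello/hello_2.10-1_amd64.deb
SHA256: {FAKE_HASH}
Description: example record
 continuation: ignored

Package: hello
Version: 2.9-2
"""

print(parse_stanza_fields(SAMPLE))
```

Note that `apt-cache show` reports what the configured archives offer, which is not necessarily the exact file that was installed, so a real implementation would still have to match the installed version against the record.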

>> Currently, to run a DDC test, we would have to read the buildinfo
>> file, find the hashes of the binary build-deps, lookup the source
>> packages that correspond to these hashes, find different binary
>> build-deps for these hashes, and run our DDC-checker. This takes many
>> round trips, and contacting external infrastructure that isn't
>> necessary.
> 
> You would not need to lookup the source packages using hashes. Using
> package and version gives you enough info to retrieve a specific source
> package from the archive.
> 
>> If .buildinfo files contained source hashes, the DDC-checker could
>> work more directly, without requiring a remote repository of source
>> hash <-> binary hash mappings.
> 
> I'm interested in `.buildinfo` in the context of the Debian project. The
> Debian archive is designed to be immutable. A specific version of a
> package will always correspond to the same source and binary files.
> So I don't see why one would do complex “source hash - binary hash
> mapping” when you can just rely on what is in the archive (and what has
> been archived by snapshot.debian.org).
> 

It's a good principle to design something to rely on as little external infrastructure as possible. Just because we already depend on some infrastructure doesn't mean we need to add more dependencies on it.

Suppose someone set up a source-only mirror in the future, because binaries are too costly to store. Then the .buildinfo files (with source hashes) could still be used against this mirror.

The "intuitive meaning" that we would like a .buildinfo file to have is to describe, immutably, the input and the output. For testing and verification purposes, the input is the *source code* of the build-deps and of the target.

Getting reproducible builds to work is IMO fixing a massive bug that has existed for decades. Normally, when you run a fixed program against fixed input, what do you expect? Fixed output. Binary-hash-only .buildinfo files would only help to prove that this bug doesn't exist. *But that's not an incredible achievement.* Great, f(x) == g(y) when f == g and x == y, whoopee? We should aim higher, to be able to generate fixed-binary proofs for when only the source code (and not necessarily the binaries) matches.

> If by building something that ought to match a specific package version you
> get a different result, then you will have to investigate in any case.
> 
> 
> Implementation-wise, getting the hash of the .dsc in the .buildinfo is
> going to be very tricky. dpkg does not know about what's available in
> the archive. It just knows about packages which are or were installed.
> 

`apt-cache showsrc [pkg]` has the right information in there, but it's a bit messy. I need to test this without a deb-src line, though.
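
A rough sketch of pulling the .dsc hash out of that output, assuming the source record contains a `Checksums-Sha256:` field whose continuation lines read `<hash> <size> <filename>`. The function name and the sample record (including its placeholder hashes) are invented for illustration, and whether `apt-cache showsrc` returns anything without a deb-src line is not addressed here.

```python
def dsc_sha256(showsrc_text):
    """Return (filename, sha256) of the .dsc in the first source record, or None."""
    # Only the first stanza is examined; later stanzas are other versions.
    stanza = showsrc_text.strip().split("\n\n", 1)[0]
    in_field = False
    for line in stanza.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
            continue
        if in_field:
            if not line.startswith(" "):
                break  # the next field has started
            parts = line.split()
            if len(parts) == 3 and parts[2].endswith(".dsc"):
                return parts[2], parts[0]
    return None


# Made-up source record; the hashes are placeholders, not real values.
FAKE_DSC = "a" * 64
FAKE_TAR = "b" * 64
SAMPLE = f"""\
Package: hello
Version: 2.10-1
Checksums-Sha256:
 {FAKE_DSC} 1323 hello_2.10-1.dsc
 {FAKE_TAR} 725946 hello_2.10.orig.tar.gz
Directory: pool/main/h/hello
"""

print(dsc_sha256(SAMPLE))
```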

X

-- 
GPG: 4096R/1318EFAC5FBBDBCE
git://github.com/infinity0/pubkeys.git
