[Reproducible-builds] Bug#138409: dpkg-dev: please add support for .buildinfo files

Jérémy Bobbio lunar at debian.org
Fri Jan 29 15:07:54 UTC 2016


Guillem Jover:
> > One of the main changes is that `.buildinfo` should now be named with an
> > arbitrary identifier. This defaults to $HOSTNAME-$TIMESTAMP
> > but can be set to an arbitrary value by the `--buildinfo-identifier`
> > command line flag.
> 
> Hmmm, leaking the hostname seems slightly privacy-concerning? If the
> information therein is not relevant I'd rather use something like an
> UUID (although that would require increasing the pseudo-build-essential
> set), or just hashing the hostname-timestamp with something like md5
> or sha1 or similar.

My hunch is that having a timestamp visible in the file name might help
recognize files quickly after a bunch of builds, especially to
identify the last one. So I'd rather keep it.

If privacy is the goal, hashing just the hostname would not help
much, as any precomputed table of hostname hashes would reverse it.
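
To illustrate the precomputed-table concern, here is a minimal Python sketch (not dpkg code). The candidate hostnames and the function name are made up for illustration; the point is only that an unsalted hash of a guessable name can be looked up again.

```python
import hashlib


def build_lookup_table(candidate_hostnames):
    """Map md5(hostname) -> hostname for every candidate name."""
    return {
        hashlib.md5(name.encode("utf-8")).hexdigest(): name
        for name in candidate_hostnames
    }


# Hypothetical candidate list; a real table would be far larger.
candidates = ["localhost", "buildd01", "laptop", "workstation"]
table = build_lookup_table(candidates)

# What a hashed file name would expose for a machine called "laptop".
leaked = hashlib.md5(b"laptop").hexdigest()

# A plain dictionary lookup recovers the original hostname.
recovered = table.get(leaked)
```

A hostname that is not in the candidate list stays unknown, so the protection depends entirely on how guessable the name is.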

How about $TIMESTAMP-$EIGHT_FIRST_CHARS_OF_BUILDINFO_MD5?

(I'm picking md5 because it's already used in dpkg-gensymbols.)
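
A rough Python sketch of the proposed $TIMESTAMP-$EIGHT_FIRST_CHARS_OF_BUILDINFO_MD5 scheme follows. It is not the dpkg implementation; the function name and the digits-only UTC timestamp layout are assumptions, since the thread does not fix a timestamp format.

```python
import hashlib
import time


def buildinfo_identifier(buildinfo_bytes, build_time=None):
    """Return '<timestamp>-<first 8 hex chars of the md5 of the content>'."""
    if build_time is None:
        build_time = time.time()
    # Assumed layout: sortable UTC timestamp made of digits only.
    timestamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(build_time))
    digest = hashlib.md5(buildinfo_bytes).hexdigest()[:8]
    return f"{timestamp}-{digest}"


# Empty content at the Unix epoch gives "19700101000000-d41d8cd9".
example = buildinfo_identifier(b"", 0)
```

Because the timestamp comes first, a plain sorted directory listing puts the most recent build last.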

> Can we just simply use the package name rules instead? It also avoids
> potential problems with case and similar. (There's a
> pkg_name_is_illegal function in Dpkg::Package already.)

Sounds reasonable. I've updated the wiki page and prepared a patch for
dak.
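
As a sketch of what "package name rules" could mean for the identifier, the Python below applies the rule from Debian Policy: lower-case letters, digits, plus, minus and period only, at least two characters, starting with an alphanumeric character. It is an assumption that this matches what pkg_name_is_illegal in Dpkg::Package enforces; the function name here is hypothetical.

```python
import re

# Debian Policy package-name rule: [a-z0-9+.-], at least two characters,
# first character alphanumeric.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9+.\-]+")


def identifier_is_valid(identifier):
    """Return True if the identifier satisfies the package-name rule."""
    return _NAME_RE.fullmatch(identifier) is not None


checks = {
    "19700101000000-d41d8cd9": identifier_is_valid("19700101000000-d41d8cd9"),
    "Host_1": identifier_is_valid("Host_1"),  # upper case and underscore
    "a": identifier_is_valid("a"),            # too short
}
```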

> > +    } else {
> > +        warning(_g('no .dsc file, skipping .buildinfo generation'));
> > +    }
> >  }
>
> ISTR mentioning this before, but I don't see why generating the
> buildinfo file is tied to the existence of a source package at all? The source
> should be included if we are including a source in the upload, that's
> it.

The whole purpose of the reproducible builds effort is to provide a
verifiable path from sources to binaries. Signed .buildinfo files are
certifications that the listed binary packages have been built using the
described source and environment.

Only listing the source in .buildinfo when a source is included in the
upload would prevent us from having multiple builders certify the same
binaries. That would cut us off from providing multiple certifications and
would undermine the purpose of reproducible builds.

So I could remove the limitation, but the resulting .buildinfo file
would not be very useful for reproducible builds.
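
The reasoning about multiple certifications can be sketched in Python as below. The record layout (builder, source, checksums) is a made-up simplification for illustration and is not the .buildinfo format; the checksum strings are placeholders.

```python
def certified_by_all(records):
    """Return True if at least two records name the same source and
    report identical checksums for the same set of binaries."""
    if len(records) < 2:
        return False  # one record alone is not an independent confirmation
    reference = records[0]
    return all(
        r["source"] == reference["source"]
        and r["checksums"] == reference["checksums"]
        for r in records[1:]
    )


# Hypothetical records from two builders.
records = [
    {"builder": "alice", "source": ("hello", "1.0-1"),
     "checksums": {"hello_1.0-1_amd64.deb": "aa11"}},
    {"builder": "bob", "source": ("hello", "1.0-1"),
     "checksums": {"hello_1.0-1_amd64.deb": "aa11"}},
]
agree = certified_by_all(records)
```

Without the source recorded in every file, the comparison on the "source" key would have nothing to check, which is the concern raised above.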

> > +$fields->{'Source'} = $spackage;
> > +if ($changelog->{'Binary-Only'}) {
> > +    $fields->{'Source'} .= ' (' . $sourceversion . ')';
> > +    $fields->{'Changes'} = $changelog->{'Changes'} . "\n\n"
> > +                         . ' -- ' . $changelog->{'Maintainer'}
> > +                         . '  ' . $changelog->{'Date'};
> > +}
> 
> Hmmm, it bothers me slightly that the Changes field here diverges in
> form from the one in the .changes file.

I can understand. It's been designed that way because it's actually only
there for binNMUs where the source is the same as the original and we
need a way to reconstruct the right changelog file.

Because sbuild might actually change its strings in the future, it felt
like plain copy/pasting was the safest option. So recreating the changelog
in case of binNMU is about outputting the value of the Changes field in the
.buildinfo, a blank line, and the changelog from the original source.
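
The reconstruction described above can be sketched in Python as follows. This assumes the Changes value has already been decoded from the control-file syntax into plain changelog text; the function name and the sample entries are hypothetical.

```python
def rebuild_binnmu_changelog(changes_value, original_changelog):
    """Return the binNMU entry, a blank line, then the original changelog."""
    return changes_value.rstrip("\n") + "\n\n" + original_changelog


# Hypothetical binNMU entry as carried by the Changes field.
entry = (
    "hello (1.0-1+b1) unstable; urgency=low\n"
    "\n"
    "  * Binary-only non-maintainer upload; no source changes.\n"
    "\n"
    " -- Builder <builder@example.org>  Fri, 29 Jan 2016 15:00:00 +0000"
)
# Hypothetical changelog shipped in the original source package.
original = (
    "hello (1.0-1) unstable; urgency=low\n"
    "\n"
    "  * Initial release.\n"
    "\n"
    " -- Maintainer <maint@example.org>  Thu, 28 Jan 2016 12:00:00 +0000\n"
)
changelog = rebuild_binnmu_changelog(entry, original)
```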

> I think I'd prefer to have the Date as its own field, maybe always
> included. And also the Maintainer field. Any reason to not include
> them all the time or on their own?

I think they would be confusing. If we were to include the “Maintainer”
I guess we should call it “Changed-By” like in .changes. “Date”
as such would be a confusing name because I would tend to think of it
as the date of the build, and not as the date of the latest changelog
of a binNMU… So maybe “Changed-On” or “Change-Date”.

But this feels more complicated than the current
implementation, even if the format differs slightly. Maybe we can rename
that field instead to “Extra-Changelog-Entry” or something else so it's
clear the two fields have different formats?

> > +my $environment = Dpkg::Deps::AND->new();
> > +foreach my $pkg (sort keys %env_pkgs) {
> > +    foreach my $installed_pkg (@{$facts->{pkg}->{$pkg}}) {
> > +        my $version = $installed_pkg->{version};
> > +        my $architecture = $installed_pkg->{architecture};
> > +        my $pkg_name = $pkg;
> 
> > +        if (defined $architecture &&
> > +            $architecture ne 'all' && $architecture ne $build_arch) {
> > +            $pkg_name = "$pkg_name:$architecture";
> > +        }
> […]
> Also this will include all Multi-Arch instances for a given package
> regardless of them being used or not, I don't think that's desirable.

Can we know for sure which ones have been used?
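
For reference, the qualification rule in the quoted hunk can be restated as a small Python sketch. The flat list of (package, version, architecture) tuples is a hypothetical stand-in for the $facts structure, and the sketch shares the behaviour being questioned: it lists every installed instance, used or not.

```python
def qualified_name(pkg, architecture, build_arch):
    """Append ':<arch>' unless the architecture is unset, 'all',
    or the build architecture."""
    if architecture is not None and architecture not in ("all", build_arch):
        return f"{pkg}:{architecture}"
    return pkg


# Hypothetical installed instances of Multi-Arch packages.
installed = [
    ("libc6", "2.21-7", "amd64"),
    ("libc6", "2.21-7", "i386"),
    ("tzdata", "2016a-1", "all"),
]
names = [qualified_name(p, a, "amd64") for p, _version, a in installed]
```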


I'm already working on other changes you suggested.

Thanks,
-- 
Lunar                                .''`. 
lunar at debian.org                    : :Ⓐ  :  # apt-get install anarchism
                                    `. `'` 
                                      `-   