[Reproducible-builds] Bug#138409: Bug#138409: dpkg-dev: please add support for .buildinfo files

Guillem Jover guillem at debian.org
Fri Jan 29 16:24:06 UTC 2016


Hi!

On Fri, 2016-01-29 at 16:07:54 +0100, Jérémy Bobbio wrote:
> Guillem Jover:
> > > One of the main change is that `.buildinfo` should now be named with an
> > > arbitrary identifier. By default this defaults to $HOSTNAME-$TIMESTAMP
> > > but can be set to an arbitrary value by the `--buildinfo-identifier`
> > > command line flag.
> > 
> > Hmmm, leaking the hostname seems slightly privacy-concerning? If the
> > information therein is not relevant I'd rather use something like an
> > UUID (although that would require increasing the pseudo-build-essential
> > set), or just hashing the hostname-timestamp with something like md5
> > or sha1 or similar.
> 
> My hunch is that having a timestamp visible in the file name might help
> recognizing files quickly after a bunch of builds, especially to
> identify the last one. So I'd rather keep it.

I see no problem at all with that, and seems like a good idea. I only
take slight issue with the hostname really.

> If privacy is the goal, hashing just the hostname would not be help
> much, as any precomputed table would work.
> 
> How about $TIMESTAMP-$EIGHT_FIRST_CHARS_OF_BUILDINFO_MD5?
> 
> (I'm picking md5 because it's already used in dpkg-gensymbols.)

Yeah md5 is not security sensitive here, so it's good enough. Using
$timestamp-$selfmd5 seems good. And as long as we specify that this
format could change in the future I think we leave room to maneuver.

> > Can we just simply use the package name rules instead? It also avoids
> > potential problems with case and similar. (There's a
> > pkg_name_is_illegal function in Dpkg::Package already.)
> 
> Sounds reasonable. I've updated the wiki page and prepared a patch for
> dak.

Thanks!

> > > +    } else {
> > > +        warning(_g('no .dsc file, skipping .buildinfo generation'));
> > > +    }
> > >  }
> >
> > ISTR mentioning this before, but I don't see why generating the
> > buildinfo file is tied to existing a source package at all? The source
> > should be included if we are including a source in the upload, that's
> > it.
> 
> The whole puprose of the reproducible builds effort is to provide a
> verifiable path from sources to binaries. Signed .buildinfo files are
> certifications that the listed binary packages have been built using the
> described source and environment.
> 
> Only listing the source in .buildinfo when a source is included in the
> upload would prevent us to have multiple builders certify the same
> binaries. That would cut us from providing multiple certifications and
> would undermine the purpose of reproducible builds.
> 
> So I could remove the limitation, but the resulting .buildinfo file
> would not be very useful for reproducible builds.

I understand why you need the chain, but don't forget that this quite old
bug report was about including build information in uploads, which seems
to me is a superset of the needs of the reproducibility project! So I've
to take into account the general case here too. :)

Also I kind of fail to see a problem here. With the current implementation
when there is no source, the .buildinfo file is not generated at all. It
seems to me switching to generating it even when there is no source is no
worse (it's actually better!) than not including a .buildinfo at all.

> > > +$fields->{'Source'} = $spackage;
> > > +if ($changelog->{'Binary-Only'}) {
> > > +    $fields->{'Source'} .= ' (' . $sourceversion . ')';
> > > +    $fields->{'Changes'} = $changelog->{'Changes'} . "\n\n"
> > > +                         . ' -- ' . $changelog->{'Maintainer'}
> > > +                         . '  ' . $changelog->{'Date'};
> > > +}
> > 
> > Hmmm, it bothers me slightly that the Changes field here diverges in
> > form from the one in the .changes file.
> 
> I can understand. It's been designed that way because it's actually only
> there for binNMUs where the source is the same as the original and we
> need a way to reconstruct the right changelog file.
> 
> Because sbuild might actually change its strings in the future, it felt
> like plain copy/pasting was the safest. So recreating the changelog in
> case of binNMU is about outputing the value of the Changes field in the
> .buildinfo, a blank line, and the changelog from the original source.

Sure.

> > I think I'd prefer to have the Date as its own field, maybe always
> > included. And also the Maintainer field. Any reason to not include
> > them all the time or on their own?
> 
> I think they would be confusing. If we would to include the “Maintainer”
> I guess we should call it “Changed-By” like in .changes. “Date”
> as such would be a confusing name because I would tend to think of it
> as the date of the build, and not as the date of the latest changelog
> of a binNMU… So maybe “Changed-On” or “Change-Date”.
> 
> But this feels just more complicated than just the current
> implementation, even if the format differs slightly. Maybe we can rename
> that field instead to “Extra-Changelog-Entry” or something else so it's
> clear they have different format?

Actually yes, naming it just Changes is pretty misleading, as I'd
expect to see it even on non-binNMU builds. Naming it instead
something like Binary-Only-Changes (to somewhat match the Binary-Only
field) would be better IMO (or can you come up with a better name?). With
that I'd be fine with having it contain the entire changelog entry for the
binNMU.

> > > +my $environment = Dpkg::Deps::AND->new();
> > > +foreach my $pkg (sort keys %env_pkgs) {
> > > +    foreach my $installed_pkg (@{$facts->{pkg}->{$pkg}}) {
> > > +        my $version = $installed_pkg->{version};
> > > +        my $architecture = $installed_pkg->{architecture};
> > > +        my $pkg_name = $pkg;
> > 
> > > +        if (defined $architecture &&
> > > +            $architecture ne 'all' && $architecture ne $build_arch) {
> > > +            $pkg_name = "$pkg_name:$architecture";
> > > +        }
> > […]
> > Also this will include all Multi-Arch instances for a given package
> > regardless of them being used or not, I don't think that's desirable.
> 
> Can we know for sure which one have been used?

In principle yes, the ones that would be needed to satisfy the
build dependencies, which is partially what dpkg-checkbuilddeps is
doing.

> I'm already working on other changes you suggested.

Ah perfect.

Oh and had completely forgotten, could you please also add a new
deb-buildinfo(5) man page describing the format of the file? I really
want all file formats supported by dpkg to be documented here for
external parties to refer to.

Thanks,
Guillem



More information about the Reproducible-builds mailing list