[Reproducible-builds] Bug#807111: libperl-apireference-perl: Make the stored data reproducible between builds
    Niko Tyni 
    ntyni at debian.org
       
    Sat Dec  5 18:10:44 UTC 2015
    
    
  
On Sat, Dec 05, 2015 at 05:32:13PM +0100, Axel Beckert wrote:
> Niko Tyni wrote:
> > This module recently switched to using Sereal::Encoder instead of
> > Data::Dumper to store pre-parsed data. The stored data representation
> > now varies between builds.  The attached patch fixes this, rendering
> > the build reproducible again.
> > 
> >    my $dump = Sereal::Encoder->new({
> > +    canonical      => 1,
> 
> I wonder if it's wise to patch the module itself in such a permanent
> way instead of maybe adding a switch and setting canonical=1 only
> during the build or the running of the test suite.
> 
> Maybe users of that module won't be happy if canonical=1 is hardcoded
> that way, e.g. for (guessed) performance reasons as the above likely
> includes sorting which always has a performance impact at some scale.

This code path is in a private function that is only used during the
build (by Perl::APIReference::Generator) to serialize API documentation
structures inline into the module in a __DATA__ section, to avoid parsing
perlapi.pod files at runtime.
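
To illustrate the build-time step described above, here is a minimal sketch,
not the module's actual code. The %index hash and its contents are
hypothetical placeholders for the parsed API documentation, and the
"canonical" option assumes a Sereal::Encoder version recent enough to
support it.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder;

# Hypothetical stand-in for the pre-parsed API documentation structure.
my %index = (
    SvIV  => { text => 'placeholder doc text for SvIV' },
    newSV => { text => 'placeholder doc text for newSV' },
);

# canonical => 1 asks the encoder for a stable representation
# (among other things, hash keys are emitted in sorted order).
my $encoder = Sereal::Encoder->new({ canonical => 1 });
my $blob    = $encoder->encode(\%index);

# At runtime the stored blob is simply decoded again.
my $copy = Sereal::Decoder->new->decode($blob);
print join(', ', sort keys %$copy), "\n";
```

The option only affects how the data is written; decoding is the same call
either way.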

I doubt the canonical representation is much slower to decode, and in any
case that phase (API documentation lookups) doesn't seem performance
critical to me.

A hypothetical performance-critical subclass of
Perl::APIReference::Generator might suffer, but IMO this is very
contrived.

The old Data::Dumper-based implementation used to set
$Data::Dumper::Sortkeys, so the loss of reproducibility is a regression.
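
For reference, a minimal sketch of what $Data::Dumper::Sortkeys buys you.
The dump_sorted helper and the sample data are made up for illustration
and are not taken from the module.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: dump a structure with hash keys in sorted order.
sub dump_sorted {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;   # emit hash keys sorted
    local $Data::Dumper::Indent   = 1;   # compact, fixed indentation
    return Dumper($data);
}

# Two hashes with the same contents, filled in opposite order.
my (%first, %second);
$first{$_}  = "doc for $_" for qw(alpha beta gamma delta);
$second{$_} = "doc for $_" for reverse qw(alpha beta gamma delta);

print dump_sorted(\%first) eq dump_sorted(\%second)
    ? "identical dumps\n"
    : "dumps differ\n";
```

Without Sortkeys the key order follows Perl's internal hash order, which
is not guaranteed to be the same from one run to the next.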

I've also forwarded the patch upstream, so the author can protest
if he judges this loss of performance unacceptable.

I hope this addresses your concerns.
-- 
Niko Tyni   ntyni at debian.org