Bug#859103: strip-nondeterminism: does not replace all timestamps in zip archives
Benjamin Moody
benjamin.moody at gmail.com
Tue Mar 27 23:02:28 UTC 2018
Package: strip-nondeterminism
Version: 0.034-1
Followup-For: Bug #859103
This is especially annoying because the "local extra field" includes
the file *access* time:
$ rm -f foo 1.zip 2.zip
$ touch -d 2015-01-01 foo
$ zip 1.zip foo
$ zip 2.zip foo
$ strip-nondeterminism 1.zip 2.zip
$ diffoscope 1.zip 2.zip
...
00000010: 0000 0000 0000 0000 0000 0300 1c00 666f ..............fo
-00000020: 6f55 5409 0003 50d4 a454 50d4 a454 7578 oUT...P..TP..Tux
+00000020: 6f55 5409 0003 50d4 a454 62a1 ba5a 7578 oUT...P..Tb..Zux
00000030: 0b00 0104 e803 0000 04e8 0300 0050 4b01 .............PK.
...
(Which makes me think, for testing build reproducibility, it'd be
wise to try one build using noatime and another using
strictatime. But anyway...)
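To make the dump above easier to read, here is a minimal Python sketch that decodes the local extra field of the second archive. It is not part of strip-nondeterminism; the names EXTRA_2, split_extra and decode_ut are invented for illustration. The layout assumed for the "UT" record (one flags byte, then a 4-byte little-endian time per flag bit, in the order mtime, atime, ctime) is my reading of the Info-ZIP extended timestamp field as it appears in a local header.

```python
import struct
from datetime import datetime, timezone

# Local extra field bytes copied from the "+" line and its neighbours in the
# dump above: 28 bytes, matching the 1c00 length in the local file header.
EXTRA_2 = bytes.fromhex(
    "5554 0900 03 50d4a454 62a1ba5a"
    "7578 0b00 01 04 e8030000 04 e8030000"
)


def split_extra(extra):
    """Yield (header_id, payload) for each record in a zip extra field."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        yield header_id, extra[pos + 4 : pos + 4 + size]
        pos += 4 + size


def decode_ut(payload):
    """Decode a local-header "UT" record into {name: epoch seconds}."""
    flags = payload[0]
    times = {}
    pos = 1
    for bit, name in enumerate(("mtime", "atime", "ctime")):
        if flags & (1 << bit) and pos + 4 <= len(payload):
            (times[name],) = struct.unpack_from("<i", payload, pos)
            pos += 4
    return times


for header_id, payload in split_extra(EXTRA_2):
    if header_id == 0x5455:  # "UT", stored little-endian
        for name, seconds in decode_ut(payload).items():
            stamp = datetime.fromtimestamp(seconds, timezone.utc)
            print(name, stamp.isoformat())
```

The flags byte 03 says that both a modification time and an access time follow; only the second of the two differs between 1.zip and 2.zip.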
The problem here is that Archive::Zip::ZipFileMember is not
designed to allow modifying the localExtraField() at all. From
the man page:
localExtraField( [ $newField ] )
localExtraField( [ { field => $newField } ] )
Gets or sets the extra field that was read from the local header.
This is not set for a member from a zip file until after the member
has been written out. The extra field must be in the proper format.
That is to say, before calling $zip->overwrite(),
localExtraField() returns an empty string. Moreover, for
ZipFileMembers, manually setting the field has no effect: even
if you call overwrite(), then go back and modify the
localExtraField, then call overwrite() again, it will re-read
the field from the zip file.
(As far as I can tell, the *only* way to manually specify a
localExtraField using Archive::Zip is to decompress and
recompress each member.)
Here is a rather kludgy patch to make Archive::Zip behave the way
that strip-nondeterminism seems to expect:
--- /usr/share/perl5/Archive/Zip/ZipFileMember.pm
+++ Archive/Zip/ZipFileMember.pm
@@ -43,6 +43,25 @@
and $self->uncompressedSize == 0);
}
+sub localExtraField {
+    my $self = shift;
+
+    # If this function is called with an argument, it overrides the
+    # original field contents from the source archive.
+    if (@_) {
+        $self->{'_localExtraFieldUserDefined'} = 1;
+    }
+    # Otherwise, the value is loaded lazily, the first time it is needed.
+    elsif (!defined $self->{'_localExtraFieldUserDefined'}
+           and defined $self->{'externalFileName'}) {
+        my $origpos = $self->fh()->tell();
+        $self->rewindData();
+        $self->fh()->seek($origpos, IO::Seekable::SEEK_SET);
+    }
+
+    return $self->SUPER::localExtraField(@_);
+}
+
# Seek to the beginning of the local header, just past the signature.
# Verify that the local header signature is in fact correct.
# Update the localHeaderRelativeOffset if necessary by adding the possibleEocdOffset.
@@ -156,10 +175,17 @@
}
     if ($extraFieldLength) {
-        $bytesRead =
-          $self->fh()->read($self->{'localExtraField'}, $extraFieldLength);
-        if ($bytesRead != $extraFieldLength) {
-            return _ioError("reading local extra field");
+        if ($self->{'_localExtraFieldUserDefined'}) {
+            $self->fh()->seek($extraFieldLength, IO::Seekable::SEEK_CUR)
+              or return _ioError("skipping local extra field");
+        }
+        else {
+            $bytesRead =
+              $self->fh()->read($self->{'localExtraField'}, $extraFieldLength);
+            if ($bytesRead != $extraFieldLength) {
+                return _ioError("reading local extra field");
+            }
+            $self->{'_localExtraFieldUserDefined'} = 0;
         }
     }
Here is a different kludgy approach to work around the issue in
strip-nondeterminism, rewriting the local file headers by hand:
--- /usr/share/perl5/File/StripNondeterminism/handlers/zip.pm
+++ File/StripNondeterminism/handlers/zip.pm
@@ -23,6 +23,7 @@
use File::Temp;
use Archive::Zip qw/:CONSTANTS :ERROR_CODES/;
+use Fcntl qw/SEEK_SET/;
# A magic number from Archive::Zip for the earliest timestamp that
# can be represented by a Zip file. From the Archive::Zip source:
@@ -207,11 +208,36 @@
}
$member->cdExtraField(
normalize_extra_fields($member->cdExtraField(), CENTRAL_HEADER));
- $member->localExtraField(
- normalize_extra_fields($member->localExtraField(), LOCAL_HEADER));
}
my $old_perms = (stat($zip_filename))[2] & oct(7777);
$zip->overwrite();
+
+    # Archive::Zip::ZipFileMembers do not allow modifying the
+    # local extra field, so we need to rewrite the local file
+    # headers by hand. This assumes that normalize_extra_fields
+    # does not change the length of the field(s).
+
+    open(my $fh, '+<', $zip_filename) or die "Unable to open $zip_filename: $!";
+    for my $member ($zip->members()) {
+        my $extra_field = normalize_extra_fields($member->localExtraField(), LOCAL_HEADER);
+        my $offset = $member->writeLocalHeaderRelativeOffset();
+        my ($header, $signature, $namelength, $extralength);
+        if (seek($fh, $offset, SEEK_SET) and read($fh, $header, 30) == 30) {
+            $signature = unpack("V", substr($header, 0, 4));
+            $namelength = unpack("v", substr($header, 26, 2));
+            $extralength = unpack("v", substr($header, 28, 2));
+        }
+        if (defined $signature and $signature == 0x04034b50
+            and $extralength == length($extra_field)
+            and seek($fh, $offset + 30 + $namelength, SEEK_SET)) {
+            print {$fh} $extra_field or die "Error writing $zip_filename: $!";
+        }
+        else {
+            die "Cannot find local file header(s) in $zip_filename";
+        }
+    }
+    close($fh) or die "Error writing $zip_filename: $!";
+
chmod($old_perms, $zip_filename);
return 1;
}
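
For readers who want to try the in-place approach outside Perl, the same idea can be sketched in Python with only the standard library. This is an illustration under assumptions, not the strip-nondeterminism implementation: normalize_local_extra is a hypothetical stand-in for normalize_extra_fields (it simply overwrites every time in a "UT" record with a fixed value), and rewrite_local_extra_fields is an invented name. Like the patch above, it relies on the normalizer not changing the field length.

```python
import struct
import zipfile

LOCAL_HEADER_SIGNATURE = 0x04034B50
UT_ID = 0x5455  # Info-ZIP "UT" extended timestamp record


def normalize_local_extra(extra, timestamp=0):
    """Hypothetical same-length normalizer: overwrite each time in UT records."""
    out = bytearray(extra)
    pos = 0
    while pos + 4 <= len(out):
        header_id, size = struct.unpack_from("<HH", out, pos)
        body = pos + 4
        if body + size > len(out):
            break  # truncated record: leave the remainder untouched
        if header_id == UT_ID:
            # One flags byte, then 4-byte little-endian times.
            for t in range(body + 1, body + size - 3, 4):
                struct.pack_into("<I", out, t, timestamp)
        pos = body + size
    return bytes(out)


def rewrite_local_extra_fields(path):
    """Patch each member's local extra field in place, without recompressing."""
    with zipfile.ZipFile(path) as archive:
        offsets = [info.header_offset for info in archive.infolist()]
    with open(path, "r+b") as fh:
        for offset in offsets:
            fh.seek(offset)
            header = fh.read(30)
            if len(header) != 30:
                raise ValueError("short read on local file header")
            (signature,) = struct.unpack_from("<I", header, 0)
            name_length, extra_length = struct.unpack_from("<HH", header, 26)
            if signature != LOCAL_HEADER_SIGNATURE:
                raise ValueError("local file header signature not found")
            extra_offset = offset + 30 + name_length
            fh.seek(extra_offset)
            extra = fh.read(extra_length)
            replacement = normalize_local_extra(extra)
            if len(replacement) != len(extra):
                raise ValueError("normalizer changed the field length")
            fh.seek(extra_offset)
            fh.write(replacement)
```

Unlike the Perl patch, this sketch reads the existing field back from the file instead of asking the zip library for it, and it leaves the central directory copy untouched.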
-- System Information:
Debian Release: 9.4
APT prefers stable
APT policy: (990, 'stable'), (500, 'stable-updates'), (500, 'stable-debug')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 4.9.0-6-amd64 (SMP w/40 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages strip-nondeterminism depends on:
ii libfile-stripnondeterminism-perl 0.034-1
ii perl 5.24.1-3+deb9u2
strip-nondeterminism recommends no packages.
strip-nondeterminism suggests no packages.
-- no debconf information