Bug#877418: dh-strip-nondeterminism: kills clojure performance

Rob Browning rlb at defaultvalue.org
Thu Oct 5 19:40:40 UTC 2017


Chris Lamb <lamby at debian.org> writes:

> I don't quite get what you mean I'm afraid. Filesystem ordering (at least
> via readdir/listdir, etc.) is non-deterministic. Can you explain it to me
> another way?

(...or quite likely I'm not describing things all that well.)

In Clojure's case, I'd think that setting the .clj mtime to at least 1s
before the corresponding .class file in the jar should work fine, though
if Clojure's only consulting the jar, then any other offset that
registers as smaller should also work, i.e. it might not have to be a
full second inside jars.

But sticking with at least 1s should make things a bit more general
because then if you

  jar xf foo.jar

the resulting tree will still show the right relative offsets on common
filesystems (assuming "jar x" tries to preserve mtimes), so that any
tool (Clojure, some Clojure build tool, etc.) will still work as expected
with the tree.
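
To make the Clojure case concrete, here is a minimal Python sketch of the
adjustment. It is only an illustration, not what strip-nondeterminism does
today. The function name age_sources is hypothetical, and the sketch assumes
that foo/bar.clj is compiled to foo/bar__init.class and that mtimes are whole
seconds since the epoch.

```python
def age_sources(mtimes, offset=1):
    """Return a copy of `mtimes` where each .clj file is at least
    `offset` seconds older than its compiled class, if one exists."""
    adjusted = dict(mtimes)
    for name, mtime in mtimes.items():
        if not name.endswith(".clj"):
            continue
        # Assumed naming: foo/bar.clj compiles to foo/bar__init.class.
        class_name = name[:-len(".clj")] + "__init.class"
        if class_name in mtimes:
            # Only ever move the source backwards, never forwards.
            adjusted[name] = min(mtime, mtimes[class_name] - offset)
    return adjusted


# All entries were clamped to the same timestamp by normalization.
entries = {
    "foo/bar.clj": 1507232440,
    "foo/bar__init.class": 1507232440,
    "foo/util.clj": 1507232440,  # no compiled class, left alone
}
aged = age_sources(entries)
```

Only sources with a matching class file are touched, so everything else
keeps whatever normalized mtime it already had.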


...then I started thinking more generally and wondered if (eventually)
we might be able to do something even more broadly helpful.

If we were to take any archive we're rewriting (tar, jar, cpio), and
sort all the files by decreasing mtime, then assign the set of files
with the largest mtime to have some mtime_0, assign the set of files
with the second largest mtime to have (mtime_0 - 1s), the third set to
(mtime_0 - 2s), etc., we'd preserve the overall ordering among the
files so that something like:

   tar xf some-reproducible-archive.tgz
   cd some-reproducible-archive
   make

would stand a good chance of just working as it would have with the
original archive.
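
A rough Python sketch of that renumbering, assuming we already have a
mapping from path to mtime in whole seconds. The name normalize_mtimes and
the example paths and timestamps are made up for illustration.

```python
def normalize_mtimes(mtimes, mtime_0):
    """Map each path's mtime to mtime_0 minus its rank in seconds.

    Files sharing the newest mtime get mtime_0, the next distinct
    mtime gets mtime_0 - 1, and so on, preserving relative order.
    """
    # Distinct mtimes, newest first; equal mtimes share one rank.
    distinct = sorted(set(mtimes.values()), reverse=True)
    rank = {m: i for i, m in enumerate(distinct)}
    return {path: mtime_0 - rank[m] for path, m in mtimes.items()}


example = {
    "Makefile": 1507232400,
    "src/main.c": 1507232400,
    "src/main.o": 1507236000,  # built after the sources
    "README": 1400000000,
}
normalized = normalize_mtimes(example, mtime_0=1500000000)
```

Here src/main.o stays newer than src/main.c, so a later "make" would
still treat it as up to date, while the absolute timestamps no longer
depend on when the build ran.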

> I'd also be curious to know why you think *more* than one second could
> ever be needed here. I think I'm missing something.

I suspect 1s is just fine, and I have nothing concrete in mind here --
it just made me think of the general floating point issues (if any end
up involved in the path), e.g. 4.000...1 vs 4 vs 3.999... vs
rounding/truncation to the final value, etc.
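
Purely as a hypothetical illustration of that worry (I don't know of any
tool that actually behaves this way), here is what it could look like in
Python if an mtime passed through a float and came back slightly low before
being truncated to whole seconds:

```python
import math


def truncated_order_survives(older, newer):
    """True if `newer` still compares newer after truncating both
    timestamps to whole seconds."""
    return math.floor(newer) > math.floor(older)


older = 3.0
# Intended to be exactly 1s and 2s later, but each arrived a hair low.
one_second_later = 4.0 - 1e-9
two_seconds_later = 5.0 - 1e-9

one_second_kept = truncated_order_survives(older, one_second_later)
two_seconds_kept = truncated_order_survives(older, two_seconds_later)
```

In this contrived case the 1s gap collapses after truncation while the 2s
gap survives, which is the only reason a larger offset came to mind.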

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4



More information about the Reproducible-builds mailing list