Bug#876055: Environment variable handling for reproducible builds

Ximin Luo infinity0 at debian.org
Mon Oct 2 16:57:00 UTC 2017


Ximin Luo:
> [..]
> 
> OTOH, developer reproducibility checkers (such as reprotest) can be a little bit more strict. I can imagine something like:
> 
> - reprotest runs 3 builds:
>   - build 0 with current env
>   - build 1 with current env + varying some "blacklist" envvars
>   - build 2 with current env + varying some "non-whitelist" envvars
> 
> If there are differences between build 1 and build 2, then reprotest reports "unexpected envvar $XXX affected the build" and the developer can then either submit it for inclusion on the "whitelist" or the "blacklist" based on the Policy wording. If it ends up on the blacklist then they would also have to fix their own package to be invariant under that envvar.
> 
> So over time, this way we can build up a blacklist and a whitelist. But it shouldn't be in the original policy. And I don't think what I suggested above is a particularly disruptive or surprising process, especially since the "public" builders would only do the "looser" interpretation so people aren't bothered by bogus "unreproducible" reports.
> 

I've implemented this in reprotest, in the "env-build" branch:
https://anonscm.debian.org/cgit/reproducible/reprotest.git/log/?h=env-build

It requires the python3-rstr package, which is currently in the NEW queue; alternatively, you can get it here:
https://people.debian.org/~infinity0/apt/pool/main/p/python-rstr/

Run it like this:

$ PYTHONPATH=$PWD python3 -m reprotest --env-build 'env > out || true' out
[..]
--- /tmp/tmp1ujyb3xp/control
+++ /tmp/tmp1ujyb3xp/experiment-blacklist
├── source-root
│ ├── out
│ │ @@ -1,57 +1,47 @@
[.. big diff ..]
Unreproducible even when varying blacklisted envvars:  BROWSER, CLUTTER_IM_MODULE, COLORTERM, COLUMNS, DATEMSK, DBUS_SESSION_BUS_ADDRESS, [..] ftp_proxy, http_proxy, https_proxy
This may or may not be caused by other factors; try re-running this again with --vary=-all
# exit code 1

$ PYTHONPATH=$PWD python3 -m reprotest --env-build 'env | grep UNKNOWN > out || true' out
[..]
--- /tmp/tmp2m24l442/control
+++ /tmp/tmp2m24l442/experiment-non-whitelist
├── source-root
│ ├── out
│ │ @@ -0,0 +1,10 @@
│ │ +00000000: 5245 5052 4f54 4553 545f 4341 5054 5552  REPROTEST_CAPTUR
│ │ +00000010: 455f 454e 5649 524f 4e4d 454e 545f 554e  E_ENVIRONMENT_UN
│ │ +00000020: 4b4e 4f57 4e5f 314b 6254 4a76 6362 6749  KNOWN_1KbTJvcbgI
│ │ +00000030: 464a 7661 394a 364d 6762 417a 7a57 5377  FJva9J6MgbAzzWSw
│ │ +00000040: 5f54 4243 3053 7177 5748 4f6a 4b44 4b39  _TBC0SqwWHOjKDK9
│ │ +00000050: 4352 6144 344d 5048 6d4d 3432 555f 795a  CRaD4MPHmM42U_yZ
│ │ +00000060: 7546 7530 6149 6330 3251 6438 365f 6e70  uFu0aIc02Qd86_np
│ │ +00000070: 5946 5330 436d 6c6f 6c45 6553 5258 7756  YFS0CmlolEeSRXwV
│ │ +00000080: 313d 695f 6361 7074 7572 655f 7468 655f  1=i_capture_the_
│ │ +00000090: 656e 7669 726f 6e6d 656e 740a            environment.
Unreproducible when varying unknown envvars:  CAML_LD_LIBRARY_PATH, MALLOC_PERTURB_, OCAMLPARAM, OCAML_TOPLEVEL_PATH, OPAMKEEPBUILDDIR, REPROTEST_CAPTURE_ENVIRONMENT_UNKNOWN_\w+, [A-Z]{2,5}(_[A-Z]{2,5}){1,3}, [A-Z]{2,5}(_[A-Z]{2,5}){1,3}
Please file a bug to reprotest to add these to the whitelist or blacklist, to be decided.
If blacklist, then you should also make your program reproducible when varying them.
# exit code 1

$ PYTHONPATH=$PWD python3 -m reprotest --env-build 'echo > out || true' out
[..]
=======================
Reproduction successful
=======================
No differences in ./out
01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b  ./out
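
For anyone who wants the logic without reading the branch, here is a minimal Python sketch of the three-build comparison. It is not the code in the env-build branch: the `vary` and `check` helpers, the list contents and the "varied" marker value are all made up for illustration, and a "build" is modelled as a plain function from an environment dict to its output.

```python
# Hypothetical example list; the real one is hard-coded in reprotest.
BLACKLIST = {"COLUMNS", "http_proxy"}


def vary(env, names, marker="varied"):
    """Return a copy of env with every variable in names set to marker."""
    varied = dict(env)
    for name in names:
        varied[name] = marker
    return varied


def check(build, env, unknown_names):
    """Run three builds and report which class of envvar changed the output.

    build         -- callable taking an env dict and returning the build output
    env           -- the current environment (build 0, the control)
    unknown_names -- names that are on neither list (assumed to be given)
    """
    control = build(env)
    # Build 1: vary only the blacklisted variables.
    exp_blacklist = build(vary(env, BLACKLIST))
    # Build 2: vary everything that is not whitelisted,
    # i.e. the blacklisted variables plus the unknown ones.
    exp_non_whitelist = build(vary(env, BLACKLIST | set(unknown_names)))

    if control != exp_blacklist:
        return "blacklist"
    if control != exp_non_whitelist:
        return "unknown"
    return "reproducible"
```

Under these assumptions, a build that dumps the whole environment comes back as "blacklist", one that only records the unknown variables comes back as "unknown", and one with constant output comes back as "reproducible", which mirrors the three runs above.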

The blacklist and whitelist are hard-coded here:
https://anonscm.debian.org/cgit/reproducible/reprotest.git/tree/reprotest/environ.py?h=env-build#n9
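
To illustrate how a variable name could be sorted into one of the three classes, here is a rough Python sketch. The pattern lists are hypothetical and much shorter than the real ones (the blacklist entries are borrowed from the output above, the whitelist entries are invented), and the `classify` helper is not something taken from environ.py.

```python
import re

# Hypothetical entries for illustration only; see environ.py for the real lists.
BLACKLIST_PATTERNS = [r"COLUMNS", r"(ftp|https?)_proxy"]
WHITELIST_PATTERNS = [r"PATH", r"DEB_\w+"]


def classify(name):
    """Return "blacklist", "whitelist" or "unknown" for an envvar name."""
    # fullmatch, so that e.g. COLUMNS_X does not count as COLUMNS.
    if any(re.fullmatch(p, name) for p in BLACKLIST_PATTERNS):
        return "blacklist"
    if any(re.fullmatch(p, name) for p in WHITELIST_PATTERNS):
        return "whitelist"
    return "unknown"
```

Anything that comes back as "unknown" is a candidate for the "please file a bug" message shown earlier.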

Of course, the contents of the lists, both actual and potential, are what is currently up for debate. But at least here's a live demonstration that my approach is more-or-less practically testable.

X

-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git


