Bug#876055: Environment variable handling for reproducible builds
Ximin Luo
infinity0 at debian.org
Tue Sep 19 09:14:00 UTC 2017
Simon McVittie:
> On Mon, 18 Sep 2017 at 18:00:51 -0700, Vagrant Cascadian wrote:
>> [..]
>>
>> I consider unintended variables that affect the build output a bug, and
>> variables designed and intended to change the behavior of the toolchain
>> expected, reasonable behavior.
>
> There is a *huge* number of variables that are intended to change
> behaviour, and may or may not affect the behaviour of this specific
> package. Which of your categories are these in?
>
> For example, basically any well-behaved programming language or
> programming-language-like environment has an equivalent of PYTHONPATH,
> PERL5LIB, PKG_CONFIG_PATH and similar variables, [..]
>
> Similarly, there is an intractably huge number of environment variables
> that can affect the result of Automake and make. Do you know about all
> of them? Including RM, PC, AR, LOADLIBES (and those are just for make's
> implicit rules)? [..]
>
I agree with this and this matches my own thoughts back in:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844431#324
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844431#369
> I think the assumption has to be that every environment variable is
> potentially intended to affect the build unless otherwise stated [..]
> [..] It would be most useful if we were to identify a
> restricted subset of environment variables for which there is consensus
> that the variable is meant to be merely user preference and shouldn't
> affect the build [..]
>
> Perhaps those variables should be a whitelist, or perhaps there is
> some wording for Policy that would identify them while excluding the
> legitimately build-affecting ones - but either way I think the
> assumption should be "there is a limited subset of environment
> variables that are required to preserve reproducibility when varied,
> and the rest are uninteresting".
>
These variables shouldn't be a whitelist, because buildsystems invent their own variables to affect themselves all the time. We can't really "predict" something like PERL5LIB.
However, neither should it be a blacklist, because run-time programs also invent their own variables all the time to affect themselves, but in a way that really should not affect build processes. I have to set LANG=XX.YY in my user environment; that doesn't mean that all my builds should come out differently from those of people in other countries.
Therefore, I think it is better to try to reach some wording for Policy that communicates *intent*. Then, tools like dpkg-buildflags can have their own envvars that they force-set, which would be a subset of the ones allowed by Policy. Tools like reprotest can vary certain envvars that "obviously" shouldn't affect the build, like LC_ALL, USER, etc. Then in the middle there will be certain variables like RM and AR that could affect the build, which should be clear from the Policy wording, but which are too numerous for dpkg-buildpackage to enumerate in a full whitelist and force-set to fixed values.
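To make the "vary the obviously-irrelevant variables" idea concrete, here is a minimal sketch in Python. It is not how reprotest is implemented; the names VARIATIONS, output_digest and is_reproducible are hypothetical, and "build output" is reduced to whatever a command prints on stdout.

```python
import hashlib
import os
import subprocess
import sys

# Hypothetical variation sets: variables treated here as pure user
# preference, which a build is expected to ignore.
VARIATIONS = [
    {"LC_ALL": "C", "USER": "builder1"},
    {"LC_ALL": "fr_CH.UTF-8", "USER": "builder2"},
]


def output_digest(cmd, extra_env):
    """Run cmd with extra_env layered over the current environment and
    return the SHA-256 of its stdout (a stand-in for the build output)."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(cmd, env=env, stdout=subprocess.PIPE, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def is_reproducible(cmd, variations=VARIATIONS):
    """True if cmd yields identical output under every variation."""
    digests = {output_digest(cmd, v) for v in variations}
    return len(digests) == 1


if __name__ == "__main__":
    ignores_env = [sys.executable, "-c", "print('fixed output')"]
    leaks_user = [sys.executable, "-c", "import os; print(os.environ['USER'])"]
    print(is_reproducible(ignores_env))
    print(is_reproducible(leaks_user))
```

The second command embeds USER in its output, which is exactly the kind of unintended dependency that varying these variables is meant to expose.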
Interpreter variables like PERL5LIB and PYTHONPATH we would have to assume fall into the first category ("they are allowed to affect the build output"), even though arguably they are also "run-time variables", because they are very tied to the interpreter and probably only developers really want to set them for specific purposes.
So let's throw some wording out there already. To quote my earlier proposal:
> I would suggest amending:
>
> - a set of environment variable values; and
> + a set of reserved environment variable values; and
>
> then later:
>
> + A "reserved" environment variable is defined as DEB_*, DPKG_*, SOURCE_DATE_EPOCH, BUILD_PATH_PREFIX_MAP, variables listed by dpkg-buildflags and other variables explicitly used by buildsystems to affect build output, excluding any variables used by non-build programs to affect their behaviour. Explicitly, this excludes TERM, HOME, LOGNAME, USER [..]
(Last time I erroneously included PATH in the final "excluded" list, because we have varied PATH, though in a really trivial way, on tests.r-b.org for ages. I now agree with you that we shouldn't expect reproducibility when PATH is varied.)
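As a rough illustration of the proposed wording, a classifier could look like the Python sketch below. The function name classify and the category labels are my own, hypothetical choices. Only the names spelled out in the wording are encoded; the variables listed by dpkg-buildflags and those "explicitly used by buildsystems" are not enumerated, and the elided tail of the excluded list is left out.

```python
from fnmatch import fnmatchcase

# Patterns named explicitly in the proposed wording. Variables listed by
# dpkg-buildflags and buildsystem-specific variables are NOT enumerated.
RESERVED_PATTERNS = (
    "DEB_*",
    "DPKG_*",
    "SOURCE_DATE_EPOCH",
    "BUILD_PATH_PREFIX_MAP",
)

# Names explicitly excluded in the wording; the original list is elided
# ("[..]"), so this set is incomplete by construction.
EXCLUDED = {"TERM", "HOME", "LOGNAME", "USER"}


def classify(name):
    """Return a hypothetical category label for an environment variable."""
    if name in EXCLUDED:
        return "user-preference"  # must not affect the build output
    if any(fnmatchcase(name, pattern) for pattern in RESERVED_PATTERNS):
        return "reserved"  # may affect the build output
    return "unclassified"  # decided by intent, not by enumeration


if __name__ == "__main__":
    for var in ("DEB_BUILD_OPTIONS", "SOURCE_DATE_EPOCH", "USER", "PERL5LIB"):
        print(var, classify(var))
```

Variables such as PERL5LIB, RM and AR land in "unclassified" here, which is the point: the Policy wording has to cover them by intent, since no fixed list can.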
My reasoning, as echoed by others on this thread already, was:
> some other variables are used by non-build tools, such as LC_*, USER, etc. Since they affect non-build programs, they possibly may be set in a developer's normal environment, so just running "debian/rules build" will pick these up. Then, the build should stay the same despite these other variables.
X
--
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git