Bug#844431: Revised patch: seeking seconds
infinity0 at debian.org
Wed Aug 16 18:17:00 UTC 2017
> Ximin Luo <infinity0 at debian.org> writes:
>> Fair enough. I actually spotted that but thought it was better to get
>> "something" into Policy rather than nitpick. I guess other people were
>> thinking similar things. Well, lesson learnt, I will be more forceful
>> next time.
>> The sentence I amended said "most environment variables" so our intent
>> is clear. If we want to fix this now, I would suggest amending:
>> - a set of environment variable values; and
>> + a set of reserved environment variable values; and
>> then later:
>> + A "reserved" environment variable is defined as DEB_*, DPKG_*, SOURCE_DATE_EPOCH, BUILD_PATH_PREFIX_MAP, variables listed by dpkg-buildflags and other variables explicitly used by buildsystems to affect build output, excluding any variables used by non-build programs to affect their behaviour. Explicitly, this excludes TERM, HOME, LOGNAME, USER, PATH and likely any variables ending with *PATH.
> We intentionally didn't spell this out in this much detail because it felt
> better to defer this (stricter) bar until we have documentation of the
> *.buildinfo file, and also because we were worried about the list changing
> (once it goes into Policy, it's more irritating to change). The current
> standard in Policy is intentionally weaker than this in order to be
> I still lean towards taking this approach, because I'm pretty worried
> about the scope of:
> other variables explicitly used by buildsystems to affect build output
> That's not really an enumerable list. My recommendation, if you want to
> allow some environment variables to vary without affecting
> reproducibility, is to explicitly list the set of environment variables
> that can vary, rather than trying to list the ones that have to remain
Intuitively it feels weird to say "if you vary USER, the output must remain fixed", but also "if you vary RANDOMUNIQUESPECIALSNOWFLAKEVARIABLE then the output is allowed to change".
By convention, certain environment variables such as CFLAGS affect a build, and even debuild(1) leaves them alone while clearing the other envvars. That is what I was going on.
> But, more fundamentally, I'm dubious that weakening the environment
> variable set is a good use of anyone's time. Why not define reproducible
> builds as setting a specific set of environment variables and no others?
> We're long past the point where building packages in an isolated
> environment with a fixed set of environment variables is a great hardship
> or even particularly unusual. I think the effort would be better spent on
> fixing (with enumerated exceptions) the set of environment variables set
> by buildds, sbuild, pbuilder, and other infrastructure that builds
> packages than in making packages tolerate random environment variables
> being set during the build. It's really hard to track down all the
> environment variable settings that might affect Autoconf, the build tools,
> document formatters, and so forth.
My proposal was the opposite: to *strengthen* the definition that was already accepted. I *don't* think we should track down all those variables and make packages immune to them; that is why I added "other variables explicitly used by buildsystems to affect build output" etc. OTOH, some other variables are used by non-build tools, such as LC_*, USER, etc. Since they affect non-build programs, they may well be set in a developer's normal environment, so just running "debian/rules build" will pick them up. The build output should stay the same despite these other variables.
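To make the split concrete, here is a rough sketch of how the proposed wording could be read as a classifier. It covers only the enumerable part of the definition: the "other variables explicitly used by buildsystems" clause has no list behind it, so it is not modelled. The names is_reserved and split_environment are hypothetical, the BUILDFLAGS set is an assumed small sample of what dpkg-buildflags handles rather than its full list, and DPKG_ is read as a prefix pattern.

```python
from fnmatch import fnmatchcase

# Patterns named in the proposed wording (DPKG_ read as a prefix pattern).
RESERVED_PATTERNS = ["DEB_*", "DPKG_*", "SOURCE_DATE_EPOCH", "BUILD_PATH_PREFIX_MAP"]

# Assumption: an illustrative sample of dpkg-buildflags variables, not the full list.
BUILDFLAGS = {"CFLAGS", "CPPFLAGS", "CXXFLAGS", "LDFLAGS"}

# Variables the proposal explicitly excludes from the reserved set.
EXCLUDED = {"TERM", "HOME", "LOGNAME", "USER", "PATH"}


def is_reserved(name):
    """Return True if `name` must be held fixed under this reading of the proposal."""
    # Exclusions win, including anything ending in PATH.
    if name in EXCLUDED or name.endswith("PATH"):
        return False
    if name in BUILDFLAGS:
        return True
    return any(fnmatchcase(name, pattern) for pattern in RESERVED_PATTERNS)


def split_environment(env):
    """Split a mapping into (reserved, free-to-vary) dictionaries."""
    reserved = {k: v for k, v in env.items() if is_reserved(k)}
    free = {k: v for k, v in env.items() if not is_reserved(k)}
    return reserved, free
```

Under this reading, anything not matched (LC_ALL, USER, or an unknown variable) lands in the free-to-vary group, which is exactly where the non-enumerable clause would have to do its work.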
If a build tool needs to be run in a specific locale, it should either use a locale-independent sorting program, or set LC_ALL explicitly itself regardless of what the parent environment says.
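As a minimal sketch of the second option, a build helper could override the locale for its child processes instead of trusting the parent environment. The helper names fixed_locale_env and run_with_fixed_locale are hypothetical, and choosing "C" as the fixed locale is an assumption made for illustration.

```python
import os
import subprocess

# Locale-related variables that are not prefixed with LC_.
LOCALE_VARS = ("LANG", "LANGUAGE")


def fixed_locale_env(parent_env=None):
    """Copy the environment, drop inherited locale settings, and force LC_ALL=C."""
    env = dict(os.environ if parent_env is None else parent_env)
    for name in list(env):
        if name.startswith("LC_") or name in LOCALE_VARS:
            del env[name]
    env["LC_ALL"] = "C"
    return env


def run_with_fixed_locale(cmd, input_text=""):
    """Run `cmd` with the fixed locale, feed it `input_text`, and return its stdout."""
    result = subprocess.run(
        cmd,
        input=input_text,
        capture_output=True,
        text=True,
        env=fixed_locale_env(),
        check=True,
    )
    return result.stdout
```

For example, run_with_fixed_locale(["sort"], "b\nA\na\nB\n") would run sort(1) with byte-order collation whatever LC_COLLATE the developer has set.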
This doesn't stop us from using a fixed or mostly-clean environment in sbuild, pbuilder, debuild, etc.
Now that I think about it, however, it's probably not reasonable to expect the output to remain the same when PATH is changed. On tests.r-b.org we vary it by appending a dummy value, but if the user adds their own stuff to the beginning then the output may well change. There is probably no point in trying to prevent that in all packages. In a sense, PATH does very much affect which build tools are run, even though non-build programs also use it. However, my gut feeling still says that it's not right for the locale (LC_*) to affect a build process. I will try to think of a more precise way to express this difference.