[Reproducible-builds] Reprotest week 59 blog comments

Ximin Luo infinity0 at debian.org
Sat Jun 18 01:22:14 UTC 2016

> On Fri, 2016-06-17 at 19:13 +0200, Ximin Luo wrote:
>>> For other packages, it's unclear to me whether I should specify
>>> them as depends or recommends: they aren't dependencies in a strict
>>> sense, but marking them as dependencies will make it easier to
>>> install a fully-functional reprotest.
>> You should specify these as Recommends, the definition matches what
>> you just described [1]. Also see how diffoscope does things.
>> [1] https://www.debian.org/doc/debian-policy/ch-relationships.html#s-binarydeps
>> starting from "The meaning of the five dependency fields is as
>> follows:"
> autopkgtest has these as Suggests rather than Recommends, and I
> followed its example.  On Lunar's suggestion, for the first alpha
> release I'm removing them altogether since the functionality that
> requires them isn't working yet.  When I get the functionality, I will
> put them back as Recommends if everyone agrees about that.

OK. I'd still say Recommends is better than Suggests for when you finally get this working, because a common use case would be to vary as much as you can. By contrast, with autopkgtest you typically pick one of the Suggests, and you don't get much additional functionality by installing all of them.

Some system configurations install all of a package's Recommends by default (treating them as hard dependencies), including my own system. Far fewer systems treat Suggests like that, and it's a less reasonable thing to do in general.

>> I'd say it's better to fail fast. Warnings can easily be missed. If
>> the user really doesn't want to test a variation, they can disable it
>> using the mechanisms you already mentioned. 
> I will make it so that if a variation *is* available for a particular environment/container but can't be executed because something is missing, reprotest will error out.  Since not all environments/containers can test all variations, I don't think it will be useful to force users to go through and disable all the variations that can't be tested on the environment/container they want to run it on.  For instance, I'm not going to have reprotest complain that when building on an existing system, the variations that require superuser privileges aren't available.

Automatically enabling or disabling features based on "what's available" is less predictable, and I think it's better to be predictable than to require slightly less effort.

How about you have it fail fast by default, but if the user gives a --ignore-missing flag then you can switch to your autodetection behaviour? Then it's very obvious that the user is asking for something that's less predictable.
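To make the suggestion concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the variation names, the MissingVariationError class and the resolve_variations helper are hypothetical, not existing reprotest code, and --ignore-missing is only the flag proposed above.

```python
import argparse


class MissingVariationError(Exception):
    """Raised when a requested variation cannot be tested (hypothetical)."""


def resolve_variations(requested, available, ignore_missing=False):
    """Return the variations to run.

    Fails fast if any requested variation is unavailable, unless the
    caller explicitly opted in to skipping them.
    """
    missing = [v for v in requested if v not in available]
    if missing and not ignore_missing:
        raise MissingVariationError(
            "cannot test: %s (pass --ignore-missing to skip)"
            % ", ".join(missing)
        )
    # With ignore_missing set, silently drop whatever is unavailable.
    return [v for v in requested if v in available]


def parse_args(argv):
    # Only the proposed flag is modelled here; a real CLI would have more.
    parser = argparse.ArgumentParser(prog="reprotest-sketch")
    parser.add_argument(
        "--ignore-missing",
        action="store_true",
        help="skip variations that cannot be tested in this environment",
    )
    return parser.parse_args(argv)
```

The point of the design is that the less predictable behaviour only happens when the user asked for it on the command line.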

>>> Locales are a particular problem because I don't know of a way in
>>> Debian to specify that a given locale must be installed.
>> All locales are installed by default (unless you install the
>> "localepurge" package, which is an unsupported hack that you don't
>> need to worry about), so you just need to reconfigure the locales
>> package to "generate" the fr locale. I'm not sure how this works
>> exactly, but you can look into it. You can do it manually via `sudo
>> dpkg-reconfigure locales` but you might be able to script this within
>> reprotest.
> The bigger problem here is that since I'm designing reprotest not to be
> useful just for Debian-based systems, as far as I know there's no
> general way to control which locales are available.  What about BSD and
> MacOS?
> The prebuilder script hard-codes certain locales, which vary by
> architecture on tests.reproducible-builds.org.  From asking in IRC, as
> far as I know different locales were chosen for different architectures
> simply to test more different locales on t.rb.o.  Some accompanying
> questions about locales:
> * Should reprotest also hard-code certain locales?
> * If more than one locale is hard-coded, how does it pick which locales
> to test with?
> * Given the behavior above, where it errors out if something it needs
> to test a variation isn't available, should it error out if a
> hard-coded locale is missing, or should it fall back on some other
> locale?  If it falls back, how should it pick which locale to fall
> back to?
> * Should it be able to test more than two different locales?  Should
> this be the default?  This makes things more complicated and
> potentially a lot slower.
> * How do I communicate to the users all of this locale handling in a
> transparent and simple way?
> I'm a little averse to hard-coding specific locales, but I don't have
> great solutions for any of these problems.

I'd suggest:

- Pick a random locale out of what's installed to test with. Or let the user specify it on the command line, as in --vary-locale=$$$, or in the configuration file.

- If m locales are installed, and m<n where n is the number of reproduction attempts the user asks for, then either
 - if --ignore-missing is given, then don't vary the locale at all for the final n-m builds
 - else fail fast, and tell the user how to install new locales, or offer to do it for them
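The selection rule above could look roughly like this. It is only a sketch: pick_locales and MissingLocaleError are hypothetical names, the list of installed locales is passed in rather than detected (on POSIX systems it could come from `locale -a`), and None stands for "don't vary the locale for this build".

```python
import random


class MissingLocaleError(Exception):
    """Raised when fewer locales are installed than builds requested."""


def pick_locales(installed, n_builds, ignore_missing=False, rng=random):
    """Pick one distinct, randomly chosen locale per build.

    installed: iterable of locale names (m of them).
    n_builds:  number of reproduction attempts (n).
    Returns a list of length n_builds; None entries mean the locale is
    not varied for that build.
    """
    candidates = sorted(installed)  # sorted so a seeded rng is repeatable
    m = len(candidates)
    if m < n_builds and not ignore_missing:
        raise MissingLocaleError(
            "%d locales installed but %d builds requested; install more "
            "locales or pass --ignore-missing" % (m, n_builds)
        )
    chosen = rng.sample(candidates, min(m, n_builds))
    # With --ignore-missing, the final n-m builds keep the default locale.
    return chosen + [None] * (n_builds - len(chosen))
```

For example, with a single installed locale and three builds, passing ignore_missing=True yields that locale followed by two None entries, while the default raises an error that can tell the user what to install.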

>>> While at the moment, reprotest only builds on the existing system,
>>> when I start extending it to other build environments, this will
>>> require double-dispatch, because the code that needs to be executed
>>> will depend on both the variation to be tested and the environment
>>> being built on.
> "Double-dispatch" in this case just means that there are two parameters
> that determine what code needs to be run, in this case the variation
> and the environment.

OK, "double dispatch" sounded to me like you were going to execute something (as in, do something effectful on the system) twice. That seemed to be unnecessary to me - even if you have two parameters, you still know this "in advance" so you should be able to figure out what you need to effect, then execute/dispatch a combined effect just once. But I didn't understand the details, so I don't know if my guess applies here. In general the word "dispatch" usually implies some sort of effectful operation like writing to a channel or mutating the filesystem; you wouldn't describe "calling a pure function" as "dispatch" for example.
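To illustrate what I mean by working it out in advance and then executing once, here is a minimal sketch. The table, the variation and environment names, and the shell snippets are all made up for the example; I don't know how reprotest actually structures this.

```python
# Hypothetical lookup keyed on (variation, environment). Each value is a
# pure function that returns the commands needed, without running them.
PLANNERS = {
    ("umask", "existing-system"): lambda: ["umask 0002"],
    ("umask", "chroot"): lambda: ["umask 0002"],
    ("locale", "existing-system"): lambda: ["export LC_ALL=fr_FR.UTF-8"],
}


def build_plan(variations, environment):
    """Pure step: select code based on both parameters, collect commands."""
    plan = []
    for variation in variations:
        plan.extend(PLANNERS[(variation, environment)]())
    return plan


def execute(plan, run):
    """Effectful step: hand the combined script to `run` exactly once."""
    run("\n".join(plan))
```

Here build_plan depends on two parameters but has no side effects, and execute touches the system a single time, with `run` being whatever callable the chosen environment provides.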


GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE