[Reproducible-builds] Reprotest week 59 blog comments

Wed Jul 20 15:39:19 UTC 2016

On Sat, 2016-06-18 at 03:22 +0200, Ximin Luo wrote:
> 
> Ceridwen:
> > 
> > 
> > On Fri, 2016-06-17 at 19:13 +0200, Ximin Luo wrote:
> > > 
> > > 
> > > > 
> > > > 
> > > > For other packages, it's unclear to me whether I should specify
> > > > them as Depends or Recommends: they aren't dependencies in a
> > > > strict sense, but marking them as dependencies will make it
> > > > easier to install a fully functional reprotest.
> > > You should specify these as Recommends; the definition matches
> > > what you just described [1]. Also see how diffoscope does things.
> > > 
> > > [1] https://www.debian.org/doc/debian-policy/ch-relationships.html#s-binarydeps
> > > starting from "The meaning of the five dependency fields is as
> > > follows:"
> > autopkgtest has these as Suggests rather than Recommends, and I
> > followed its example.  On Lunar's suggestion, for the first alpha
> > release I'm removing them altogether since the functionality that
> > requires them isn't working yet.  When I get the functionality
> > working, I will put them back as Recommends if everyone agrees
> > about that.
> > 
> OK. I'd still say Recommends is better than Suggests, for when you
> finally get this working. The reason is that a common use case
> would be to try to vary as much as you can. By contrast, for
> autopkgtest you typically pick one of the Suggests, and you don't
> really get much additional functionality by using all of them.
> 
> Some system configurations, including my own, install all of a
> package's Recommends by default (treating them as hard
> dependencies). Far fewer systems treat Suggests like that, and it's
> a less reasonable thing to do in general.

Since no one else has said anything, I concur with your opinion and
have put them in as Recommends for the second release, which will
support schroot and qemu.  I also added disorderfs and locales-all to
the Recommends list, since they're necessary for some variations.

> > > I'd say it's better to fail fast. Warnings can easily be missed.
> > > If the user really doesn't want to test a variation, they can
> > > disable it using the mechanisms you already mentioned.
> > I will make it so that if a variation *is* available for a
> > particular environment/container but can't be executed because
> > something is missing, reprotest will error out.  Since not all
> > environments/containers can test all variations, I don't think it
> > will be useful to force users to go through and disable all the
> > variations that can't be tested on the environment/container they
> > want to run it on.  For instance, I'm not going to have reprotest
> > complain that when building on an existing system, the variations
> > that require superuser privileges aren't available.
> > 
> Automatically enabling or disabling features based on "what's
> available" is less predictable, and I think it's better to be
> predictable than to require slightly less effort.
> 
> How about you have it fail fast by default, but if the user gives a
> --ignore-missing flag then you can switch to your autodetection
> behaviour? Then it's very obvious that the user is asking for
> something that's less predictable.

In my personal user experience, there's nothing more frustrating than
software that, before it will do anything, requires that I dive into
the documentation without really knowing what I'm looking for to either
find the option(s) that will make the software just run, already, or to
work out what I need to install to get it to run.  Sometimes this is
unavoidable, but I think the default behavior should be to run out of
the box as much as possible.  I'm going to have reprotest show what
it's testing and not testing, but I'm not going to require that
everyone have all the possible tools for all the possible variations
working before it will run the first time without a command flag.  I do
intend to point to instructions on how to get the variations that
weren't working to run.

That said, at the moment, it just prints a stack trace when the setup
is improper :).  I will make this nicer before the end of the summer.
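
As a rough sketch of the "show what it's testing and not testing"
behavior (this is not reprotest's actual code; the variation names,
the REQUIRED_TOOLS table, and both function names are made up for
illustration), the check could look something like this:

```python
import shutil

# Hypothetical table: variation name -> external tools it needs.
# Illustrative only; not reprotest's real list of variations.
REQUIRED_TOOLS = {
    "fileordering": ["disorderfs"],
    "locales": ["locale"],
    "timezone": [],
}


def plan_variations(requested, which=shutil.which):
    """Split the requested variations into runnable and skipped ones."""
    runnable, skipped = [], {}
    for name in requested:
        # A tool counts as missing if it is not found on PATH.
        missing = [t for t in REQUIRED_TOOLS.get(name, []) if which(t) is None]
        if missing:
            skipped[name] = missing
        else:
            runnable.append(name)
    return runnable, skipped


def report(runnable, skipped):
    """Describe what will and will not be tested, one line per item."""
    lines = ["testing: " + (", ".join(runnable) or "(nothing)")]
    for name, missing in sorted(skipped.items()):
        lines.append("not testing %s: missing %s" % (name, ", ".join(missing)))
    return "\n".join(lines)
```

A fail-fast mode like the one Ximin suggests would only need to turn a
non-empty "skipped" result into an error instead of a report.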

> > > > Locales are a particular problem because I don't know of a way
> > > > in Debian to specify that a given locale must be installed.
> > > All locales are installed by default (unless you install the
> > > "localepurge" package, which is an unsupported hack that you
> > > don't need to worry about), so you just need to reconfigure the
> > > locales package to "generate" the fr locale. I'm not sure how this
> > > works exactly, but you can look into it. You can do it manually
> > > via `sudo dpkg-reconfigure locales` but you might be able to
> > > script this within reprotest.
> > The bigger problem here is that since I'm designing reprotest to
> > be useful not just for Debian-based systems, as far as I know
> > there's no general way to control which locales are available.
> > What about BSD and MacOS?
> > 
> > The prebuilder script hard-codes certain locales, which vary by
> > architecture on tests.reproducible-builds.org.  From asking in IRC,
> > as far as I know different locales were chosen for different
> > architectures simply to test more different locales on t.rb.o.
> > Some accompanying questions about locales:
> > 
> > * Should reprotest also hard-code certain locales?
> > 
> > * If more than one locale is hard-coded, how does it pick which
> > locales to test with?
> > 
> > * Given the behavior above, where it errors out if something it
> > needs to test a variation isn't available, should it error out if
> > a hard-coded locale is missing, or should it fall back on some
> > other locale?  If it falls back, how should it pick which locale
> > to fall back to?
> > 
> > * Should it be able to test more than two different locales?
> > Should this be the default?  This makes things more complicated
> > and potentially a lot slower.
> > 
> > * How do I communicate to the users all of this locale handling in
> > a transparent and simple way?
> > 
> > I'm a little averse to hard-coding specific locales, but I don't
> > have great solutions for any of these problems.
> > 
> I'd suggest:
> 
> - List the locales that are installed, and choose a random one out
> of these to test with. Or let the user specify it on the command
> line, as in --vary-locale=$$$, or in the configuration file.
> 
> - If m locales are installed, and m<n where n is the number of
> reproduction attempts the user asks for, then either
>  - if --ignore-missing is given, then don't vary the locale at all
> for the final n-m builds
>  - else fail fast, and tell the user how to install new locales, or
> offer to do it for them

Let me ask another question: what are the features of the locales that
are hard-coded into the prebuilder script that make them good for
detecting locale-caused differences in builds?  They weren't chosen
arbitrarily, were they?

For selecting random locales, how is it possible to find out what
locales are available in a cross-platform way?
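
The closest thing I know of is the `locale -a` command, which POSIX
specifies and which, as far as I know, exists on Linux, the BSDs, and
MacOS, though not on Windows.  A minimal sketch, assuming that command
is on the PATH (the function names are hypothetical, and excluding
C/POSIX from the candidates is my own assumption about what makes a
useful variation):

```python
import random
import subprocess


def installed_locales(run=subprocess.run):
    """Return the names printed by `locale -a`, or [] if it can't be run."""
    try:
        result = run(["locale", "-a"], stdout=subprocess.PIPE,
                     stderr=subprocess.DEVNULL, universal_newlines=True)
    except OSError:
        # The `locale` command is not installed or not executable.
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def pick_locales(available, count, rng=random):
    """Pick up to `count` distinct locales at random, skipping C/POSIX."""
    candidates = sorted(set(available) - {"C", "POSIX", "C.UTF-8"})
    # If fewer candidates than requested exist, return all of them.
    return rng.sample(candidates, min(count, len(candidates)))
```

When pick_locales() returns fewer names than were asked for, that is
the m<n case above, so the caller still has to decide whether to fail
or to run the remaining builds without varying the locale.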

> > > > While at the moment, reprotest only builds on the existing
> > > > system, when I start extending it to other build environments,
> > > > this will require double-dispatch, because the code that needs
> > > > to be executed will depend on both the variation to be tested
> > > > and the environment being built on.
> > "Double-dispatch" in this case just means that there are two
> > parameters that determine what code needs to be run, in this case
> > the variation and the environment.
> > 
> OK, "double dispatch" sounded to me like you were going to execute
> something (as in, do something effectful on the system) twice. That
> seemed to be unnecessary to me - even if you have two parameters, you
> still know this "in advance" so you should be able to figure out what
> you need to effect, then execute/dispatch a combined effect just
> once. But I didn't understand the details, so I don't know if my
> guess applies here. In general the word "dispatch" usually implies
> some sort of effectful operation like writing to a channel or
> mutating the filesystem; you wouldn't describe "calling a pure
> function" as "dispatch" for example.

I was using the word like it's used in this article:

https://en.wikipedia.org/wiki/Multiple_dispatch

Maybe that's more of a programming-languages usage?  In that context,
calling a pure function can be "multiple-dispatched" if which code gets
executed depends on the types of the arguments to said function.
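
To make that concrete, here is a toy sketch of the kind of lookup I
mean, keyed on values rather than types.  It isn't reprotest code: the
HANDLERS table, the handles() decorator, and the variation and
environment names are all invented for the example, and the handlers
are pure functions that only return a description.

```python
# Maps (variation, environment) pairs to the function that handles them.
HANDLERS = {}


def handles(variation, environment):
    """Register the decorated function for one (variation, environment) pair."""
    def register(func):
        HANDLERS[(variation, environment)] = func
        return func
    return register


@handles("timezone", "existing-system")
def timezone_existing_system():
    return "set TZ in the build's environment variables"


@handles("timezone", "qemu")
def timezone_qemu():
    return "set TZ inside the virtual machine"


def dispatch(variation, environment):
    """Select the handler using both parameters, then call it once."""
    try:
        handler = HANDLERS[(variation, environment)]
    except KeyError:
        raise LookupError(
            "no handler for %s on %s" % (variation, environment)) from None
    return handler()
```

Both parameters are used to select the code, but only one function is
actually called, so nothing is executed twice.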

Ceridwen