[Debian-med-packaging] We need a global decision about R data in binary format, and stick to it.

Joerg Jaspert joerg at debian.org
Mon Aug 5 21:27:01 UTC 2013


On 13294 March 1977, Jeremy Stanley wrote:
> On 2013-08-05 14:13:15 +0100 (+0100), Ian Jackson wrote:
>> The other is the assertion that this particular case involves a
>> generated data table. If this is the case then the source package
>> needs to contain the source code which generates the table - and,
>> really, it should regenerate the table during the build.
> [...]

> No argument on the first, but the second sets a bad precedent if
> interpreted strongly. For example I have a program which relies on a
> fairly large set of correlative data requiring hours of expensive
> computation to generate. In the source package I include the
> original data on which the resulting tables are based and provide a
> means to regenerate it on the fly at package build time, but disable
> it by default so that it doesn't chew up build resources
> unnecessarily.

> Since I need to generate the correlation data for other (non-Debian)
> users of the software anyway, I ship the generated files in the
> source package too and just include them in the binary package
> (along with instructions and tooling for the end user to be able to
> build datasets they can use to override the default ones provided).
> While my example is Python rather than R, I expect it's
> representative of situations for many scientific tools. Perhaps some
> guidance on when this tactic is or is not appropriate would be
> beneficial.

In general a package maintainer *can* decide to ship pre-built files
from upstream. In many cases that is a bad idea, in some cases it's just
what one wants, but in all cases the maintainer must then ensure that
the shipped file can be regenerated from what is included in the source
tarball (ignoring differences such as embedded build dates), and that
whatever is included in the source is actually the preferred form of
modification, be that "real" source code or some binary data
representation (that can be edited by tools in the same component
level).

If it's hard to check that, it might also be a good idea to mention this
somewhere visible in the (source) package.
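
To make the "can be regenerated" check a bit more concrete, here is a
rough sketch in Python of comparing a shipped file against a freshly
regenerated one while ignoring build dates. The function names and the
build-date pattern are made up for illustration; a real package would
need a pattern that matches whatever its generator actually writes.

```python
import hashlib
import re

# Hypothetical pattern for comment lines that only record a build date.
# This is an assumption; adjust it to the generator's real output.
BUILD_DATE_LINE = re.compile(
    rb"^#\s*(generated|built) on .*$", re.IGNORECASE | re.MULTILINE
)


def normalized_digest(data: bytes) -> str:
    """Hash the contents after blanking out build-date lines."""
    stripped = BUILD_DATE_LINE.sub(b"", data)
    return hashlib.sha256(stripped).hexdigest()


def matches_shipped(shipped: bytes, regenerated: bytes) -> bool:
    """True if both files are identical apart from build-date lines."""
    return normalized_digest(shipped) == normalized_digest(regenerated)
```

A maintainer could run something like this by hand after a from-scratch
rebuild, reading both files in binary mode and failing loudly when
matches_shipped() returns False.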

> On 2013-08-05 16:41:13 +0100 (+0100), Ian Jackson wrote:
> [...]
> > There should IMO be a standard way to request a source package to do
> > from-scratch rebuilds for this kind of thing, for QA purposes.
> I absolutely agree. If there were a standard make target or envvar
> for this purpose I would gladly implement it in my debian/rules.

Oh yes. Start a process to get it? :)
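
Just to illustrate what such a switch could look like: DEB_BUILD_OPTIONS
already exists as a space-separated list of build flags, so one option
would be an extra token in there. The token name "regen-data" below is
purely hypothetical (no such standard exists today), as is the helper
name; it is a sketch of the idea, not a proposal for the final
interface.

```python
import os


def wants_full_rebuild(environ=os.environ) -> bool:
    """Check for a (hypothetical) 'regen-data' token in DEB_BUILD_OPTIONS."""
    # DEB_BUILD_OPTIONS is a space-separated list of flags.
    options = environ.get("DEB_BUILD_OPTIONS", "").split()
    return "regen-data" in options
```

A build helper could call this and only run the expensive generation
step when it returns True, falling back to the shipped files otherwise.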

-- 
bye, Joerg
http://www.bash.org/?203815
<Fooz> In a perfect world... spammers would get caught, go to jail, and
share a cell with many men who have enlarged their penisses, taken
Viagra and are looking for a new relationship.


