[Debian-med-packaging] Question about proper archive area for packages that require big data for operation

Laszlo Kajan kajla at dns6.org
Wed Apr 24 15:18:46 UTC 2013


Hello Simon!

Thank you for these suggestions.

On 24/04/13 13:06, Simon McVittie wrote:
> On 23/04/13 10:48, Laszlo Kajan wrote:
>> free packages that depend on big (e.g. >400MB) free data outside 'main'
> 
> This comes up in the Games Team, too.
> 
> Here are some possibilities you might not have considered:
> 
> * Package a small "demo" data-set (enough to test that the package is
>   working correctly) in main; provide instructions to get the
>   "full-fat" data-set from elsewhere. I think VegaStrike used to do
>   this with its music, shipping a lower-quality encode in Debian and a
>   full-quality encode elsewhere? Games also often do this for legal
>   rather than size reasons, with an engine in contrib, demo/shareware
>   data in non-free, and instructions to replace the demo data with the
>   non-distributable full game if you own it; e.g. Quake II used to be
>   packaged like this.

This has come up before in our discussion. For this particular package ('metastudent'), the entire data set is necessary to obtain the published
functionality and performance. The FTP Masters' decision to reject the original upload, which depended on large external data, was based on this
functional dependency (Policy 2.2.1). Providing a small demo data set would render the tool practically useless, although it would look as if it
could function. In my view this solution would only obscure the fact that the package really does require the large data set. I would not want
the FTP Masters to accept 'metastudent' because I made its data dependency /seem/ to be solved within 'main'.

> * Split the data-set into reasonably-sized packages so it at least
>   gets better incremental downloads and splitting between CDs/DVDs
>   (bonus points if the source packages are segregated by update
>   frequency, so only the frequently-updated parts normally need
>   uploads). I did this with openarena-data (after some brief discussion
>   with the ftp-masters and the debian-cd maintainer) because I was sick
>   of uploading half a gigabyte of textures, etc. every time there was a
>   bug in the game scripting. They suggested that I should aim for 100MB
>   packages as a reasonable compromise between splitting too coarsely
>   and too finely.

This is a great idea. It seems we may be able to get the data down to ~130MB xz-compressed (for metastudent; this would not be the case for
e.g. predictprotein). We may be able to split this into two smaller packages. Indeed, we are update-frequency aware, and are trying to
separate out the parts that are likely to change more often.
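
For illustration, a minimal sketch of what such a split could look like in debian/control. All package names, descriptions and the exact
split are hypothetical; and to get the upload savings Simon describes, the two data packages would ultimately need to come from separate
source packages:

  # Hypothetical debian/control sketch; package names and the split are illustrative.
  Package: metastudent
  Architecture: all
  Depends: metastudent-data-stable, metastudent-data-volatile, ${misc:Depends}
  Description: predictor of Gene Ontology terms from protein sequence
   ...

  # Rarely-updated bulk of the data; built from its own source package, so it
  # only needs re-uploading when that part of the data actually changes.
  Package: metastudent-data-stable
  Architecture: all
  Description: data for metastudent (infrequently updated part)
   ...

  # Smaller, frequently-updated part of the data.
  Package: metastudent-data-volatile
  Architecture: all
  Description: data for metastudent (frequently updated part)
   ...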

Best regards, Laszlo


