[Debian-med-packaging] Question about proper archive area for packages that require big data for operation

Olivier Sallou olivier.sallou at irisa.fr
Wed Apr 24 06:20:00 UTC 2013


On 04/23/2013 11:48 AM, Laszlo Kajan wrote:
> Dear Russ, Debian Med Team, Charles!
>
> (Please keep Tobias Hamp in replies.)
>
> @Russ: Please allow me to include you in a discussion about a few bioinformatics packages that depend on big, but free data [2]. I have cited
> your opinion [3] in this discussion before. You are on the technical committee and on the policy team, so you, together with Charles, can help
> substantially here.
>
> [2] http://lists.alioth.debian.org/pipermail/debian-med-packaging/2013-April/thread.html
> [3] https://lists.debian.org/debian-vote/2013/03/msg00279.html
>
> This email is to continue the discussion about free packages that depend on big (e.g. >400MB) free data outside 'main'. These packages
> apparently violate policy 2.2.1 [0] for inclusion in 'main' because they require software outside the 'main' area to function. They do not
> violate point #1 of the social contract [1], which requires non-dependency on non-free components. For these big data packages, policy seems to
> be overly restrictive compared to the social contract, leading to seemingly unfounded rejection from 'main'.
Indeed, many bioinformatics programs rely on external data. But I am
afraid that if we start adding data packages, there will be no end to
it: bioinformatics datasets are large, and they are becoming huge and
numerous.
Their size will be an issue for Debian mirrors (especially if some
indexed data is system-dependent), and it will also be a pain for the
user if installing a program just to have a look at it pulls in GBs of
packaged data as dependencies. That may be really slow and fill the
user's disk (not to mention package updates).

Shouldn't those data dependencies instead be clearly stated somewhere in
the software package, along with a script to fetch them?
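
To make the idea concrete, here is a rough sketch of what such a fetch
script could look like. Everything in it is an illustrative assumption
(the function names, the checksum check, the ".part" temporary file); it
is not taken from any existing package. It downloads a dataset only when
a verified copy is not already present.

```python
import hashlib
import os
import shutil
import urllib.request


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_dataset(url, dest_path, expected_sha256):
    """Download url to dest_path unless a verified copy already exists.

    Hypothetical helper: returns True if a download happened, False if
    the existing file already matched the expected checksum.
    """
    if os.path.exists(dest_path) and sha256_of(dest_path) == expected_sha256:
        return False  # verified copy already on disk, nothing to do

    os.makedirs(os.path.dirname(os.path.abspath(dest_path)), exist_ok=True)

    # Download to a temporary name so an interrupted transfer never
    # leaves a truncated file under the final name.
    tmp_path = dest_path + ".part"
    with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as out:
        shutil.copyfileobj(response, out)

    actual = sha256_of(tmp_path)
    if actual != expected_sha256:
        os.remove(tmp_path)
        raise ValueError("checksum mismatch for %s: got %s" % (url, actual))

    os.replace(tmp_path, dest_path)
    return True
```

The user would run this explicitly after installing the program, so the
package itself stays small and the download happens only on request.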

Olivier
>
> [0] http://www.debian.org/doc/debian-policy/ch-archive.html
> [1] http://www.debian.org/social_contract
>
> * In case the social contract indeed allows such packages to be in 'main' (and policy is overly restrictive), how could it be ensured that the
> packages are accepted?
>
> * What is the procedure within Debian to elicit a decision about the handling of such packages in terms of archive area? Discussion on d-devel,
> followed by policy change? Asking the policy team to clarify policy for such packages? Technical committee?
>
>  + Charles suggested such packages could go into 'main' [4], with a clear indication of the large data dependency of the package in the long
> description.
>    When possible, providing the scripts for generating the large data as well.
>
>  [4] http://lists.alioth.debian.org/pipermail/debian-med-packaging/2013-April/019292.html
>
> My goal as a Debian Developer and a packager is to get packages into Debian (so 'main') that are allowed in there, in reasonably short time. I
> would like to resolve this issue properly, because I believe it may pop up more often in bioinformatics software. For example, imagine a protein
> folding tool that would require a very large database to search for homologues for contact prediction, and using the contacts it would predict
> protein three-dimensional structure. This has been done before [5], and such a tool would be (is) immensely useful for bioinformatics. This tool
> would depend on gigabytes of data we would not package. Yet, by all means, I would want the tool to be part of the distribution.
>
> [5] http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0028766
>
> Thank you for your opinion and advice.
>
> Best regards,
> Laszlo
>
> _______________________________________________
> Debian-med-packaging mailing list
> Debian-med-packaging at lists.alioth.debian.org
> http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/debian-med-packaging

-- 
Olivier Sallou
IRISA / University of Rennes 1
Campus de Beaulieu, 35000 RENNES - FRANCE
Tel: 02.99.84.71.95

gpg key id: 4096R/326D8438  (keyring.debian.org)
Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438



