[Debian-med-packaging] Question about proper archive area for packages that require big data for operation

Laszlo Kajan lkajan at debian.org
Tue Apr 23 09:48:05 UTC 2013


Dear Russ, Debian Med Team, Charles!

(Please keep Tobias Hamp in replies.)

@Russ: Please allow me to include you in a discussion about a few bioinformatics packages that depend on big, but free data [2]. I have cited
your opinion [3] in this discussion before. You are on the technical committee and on the policy team, so you, together with Charles, can help
substantially here.

[2] http://lists.alioth.debian.org/pipermail/debian-med-packaging/2013-April/thread.html
[3] https://lists.debian.org/debian-vote/2013/03/msg00279.html

This email is to continue the discussion about free packages that depend on big (e.g. >400MB) free data outside 'main'. These packages
apparently violate policy 2.2.1 [0] for inclusion in 'main' because they require software outside the 'main' area to function. They do not
violate point #1 of the social contract [1], which requires that the system not depend on non-free components. For these big data packages, policy seems to
be more restrictive than the social contract, leading to seemingly unfounded rejection from 'main'.

[0] http://www.debian.org/doc/debian-policy/ch-archive.html
[1] http://www.debian.org/social_contract

* In case the social contract indeed allows such packages to be in 'main' (and policy is overly restrictive), how could it be ensured that the
packages are accepted?

* What is the procedure within Debian to elicit a decision about the handling of such packages in terms of archive area? Discussion on d-devel,
followed by policy change? Asking the policy team to clarify policy for such packages? Technical committee?

 + Charles suggested such packages could go into 'main' [4], with a clear indication of the large data dependency of the package in the long
description and, when possible, the scripts for generating the large data provided as well.

 [4] http://lists.alioth.debian.org/pipermail/debian-med-packaging/2013-April/019292.html

My goal as a Debian Developer and a packager is to get packages that are allowed there into Debian (that is, 'main') in a reasonably short time. I
would like to resolve this issue properly, because I believe it may come up more often in bioinformatics software. For example, imagine a protein
folding tool that requires a very large database to search for homologues for contact prediction, and then uses the predicted contacts to predict
three-dimensional protein structure. This has been done before [5], and such a tool would be (is) immensely useful for bioinformatics. This tool
would depend on gigabytes of data we would not package. Yet, by all means, I would want the tool to be part of the distribution.

[5] http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0028766

Thank you for your opinion and advice.

Best regards,
Laszlo


