Large data packages in the archive
Joerg Jaspert
joerg at ganneff.de
Sun May 25 18:18:01 UTC 2008
Hi,
one important question lately has been "What should we do with large
packages containing data?", such as game data, huge icon/wallpaper sets,
some science data sets, etc. Naturally, this is a decision ftpmaster has
to take, so here are our thoughts on it to facilitate discussion and see
if we missed important points, but we reserve the right to have the last
word on how it gets done. :)
Basic Problem: "What to do with large data packages?"
That already has a problem: How to define "large"? One way, which we
chose for now, is simply "everything > 50MB".
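To make that cut-off concrete, here is a minimal Python sketch. It is only
an illustration, not what the archive software does: the function names are
made up, "50MB" is assumed to mean 50 * 1024 * 1024 bytes, and the size is
assumed to be the size of the package file on disk.

```python
import os

# Assumption: "50MB" is read as 50 * 1024 * 1024 bytes; the mail does not
# say whether MB or MiB is meant.
LARGE_THRESHOLD_BYTES = 50 * 1024 * 1024


def is_large(size_bytes, threshold=LARGE_THRESHOLD_BYTES):
    """Return True if size_bytes is strictly greater than the threshold."""
    return size_bytes > threshold


def large_packages(paths, threshold=LARGE_THRESHOLD_BYTES):
    """Return the paths whose on-disk size exceeds the threshold."""
    return [p for p in paths if is_large(os.path.getsize(p), threshold)]
```

A package of exactly the threshold size is not "large" here, since the
definition above is "> 50MB", not ">= 50MB".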
While the archive software is written in Python, this problem sounds
like a Perl one as "There is more than one way to do (solve) it":
a.) We can simply say that we don't want this in Debian and people
should use external hosting for such packages. After all, they are
usually of interest to only a very small minority.
b.) We can just add another component "data" besides
main/contrib/non-free.
c.) We can host a separate archive for it under control of ftpmaster.
The first two seem to have grave problems:
a.) Is basically no (good) option. It is our job to maintain the
archive, and if there is enough demand we should make it possible to
also host things like these data packages. Additionally it has the
problem that it would require a move of everything that needs those
data packages into contrib, as there wouldn't be a good base for a
Policy exception.
b.) While that would be the simplest solution, it has other problems,
large enough that we decided against it. The biggest one is the
principle of least surprise for our mirrors. The point of this
discussion is to not bloat the main archive too much. If we just add
another component, it will end up being mirrored a lot, even if we
send an announcement weeks in advance. Requiring every mirror admin
to decide whether they want to mirror or exclude it, and then adjust
their scripts, is a simple no-go for us.
So the way to go for us seems to be c.), hosting the archive ourselves
(somewhere below data.debian.org probably).
For all the rest of the mail I talk about solution c., unless otherwise
stated.
So, assuming we go for solution c. (which is what happens unless someone
has a *very* strong reason not to, which I currently can't imagine), we
will set up a separate archive for this. This will work the same way as
our main archive does, with a few notable points:
- It will be solely arch:all, not split per architecture. Or, if
someone presents *good* reasons why a data archive needs to be
architecture-aware, we will also offer this, but *NO* autobuilder
support will be provided.
This is meant as a place for large datasets, and those should be
arch independent. Building them would also kill many autobuilders
(think of binary packages having 500, 800 or more megabytes!)
- It is a separate archive, so it needs full source uploads to work.
Every data package you create will be a full source package, and you
have to split the source between this archive and the rest that goes
into the normal Debian one.
- We need to change policy. It currently forbids packages in main to
Depend/Recommend something outside of it (which is good). As that
would make the data archive less useful, I propose to change this to
something along the lines of "Packages in main are allowed to
recommend packages in the data archive".
Dependencies should *not* be allowed, but read the next point.
- Packages in main need to be installable and not cause their (indirect)
reverse build-depends to FTBFS in the absence of data.debian.org.
If the data is necessary for the package to work and there is a small
dataset (like 5 to 10 MB) that can be reasonably substituted for the
complete data package, the smaller dataset should be included in
main and the package then may depend on "foo-data | foo-data-small".
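The "foo-data | foo-data-small" idea can be illustrated with a small Python
sketch that checks whether a Depends-style field can be satisfied from a
given set of package names alone (for example, the packages in main). This
is a simplified, single-level check for illustration only: the function
names are hypothetical, version constraints are ignored, and no transitive
dependency resolution is done.

```python
def parse_alternatives(dep_field):
    """Split a Depends-style field into groups of alternative package names."""
    groups = []
    for clause in dep_field.split(","):
        alts = []
        for alt in clause.split("|"):
            # Drop version constraints like "(>= 1.0)" and
            # architecture qualifiers like "[i386]".
            name = alt.split("(")[0].split("[")[0].strip()
            if name:
                alts.append(name)
        if alts:
            groups.append(alts)
    return groups


def installable_from(dep_field, available):
    """True if every group has at least one alternative in `available`."""
    return all(
        any(alt in available for alt in group)
        for group in parse_alternatives(dep_field)
    )
```

With a hypothetical main containing only "libc6" and "foo-data-small",
installable_from("libc6 (>= 2.7), foo-data | foo-data-small", main) holds,
while a plain dependency on "foo-data" does not, which is exactly the case
the small dataset in main is meant to cover.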
Any comments?
Timeframe for this? I expect it to be ready within 2 weeks.
--
bye, Joerg
Some AM after a mistake:
Sigh. One shouldn't AM in the early AM, as it were. <grin>