[Debian-l10n-devel] Removing long descriptions / english from Packages files

David Kalnischkies kalnischkies+debian at gmail.com
Wed Aug 19 09:31:25 UTC 2009

Hello everyone,

First of all, an apology for the late response from the APT Team side:
Michael Vogt is currently a bit busy with work, family stuff and
advising and helping me, so the internal discussion lasted a bit longer,
and I didn't want to enter the discussion with my half-baked ideas. ;)

Second: There is a patch in the bakery for APT to handle multiple
Translation files. Changing apt-ftparchive to create a "master" file
is the only bigger thing currently left, but it should not be a big deal
either. A few points need to be discussed further, though, some of which
are already noted in the thread, so I will try to collect them all to
have something like a summary:

> Software breakage because of the remove of the long descriptions
As long as the software uses the libapt-pkg library to get the descs
(and I think other libraries do the same) it should not have any impact:
The short description is included in the long description, so the long
description is never empty -- only very short without a Translation
file...

> (Sidenote: Can we PLEASE DROP MD5 when we are going to work on it?)
It is not a sidenote at all: APT uses the md5sum to map a (translated)
description to the untranslated package description. Imagine a user who
has stable and testing in their sources.list: There would be multiple
descriptions for foobar, and if the descriptions differ (because of new
features in foobar or because foobar is now a metapackage for foo and bar)
apt would not be able to know which one it should show to the user.
A mapping with the package version as key would be possible, but the
version changes far more often than the description, resulting in diff
files for Translation files in which only the version numbers change.
(Sidenote: There are currently no diff files for Translation,
 but this should be the topic of another discussion, bug and patch)
So in conclusion I think that if no long description is provided in the
Packages file, it should provide the Description-md5. If the long
description is provided, the field is optional, as it could also be
calculated.
This also leads to the question of whether we should support both
"formats", and I think we should, because smaller third-party archives
currently don't provide Translation files at all - enforcing the removal
of the long Description from the Packages file for all archives would
only increase the complexity of creating such archives.
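
To illustrate the md5 mapping described above, here is a minimal Python
sketch. It is not how libapt-pkg implements it; the function names, the
sample descriptions and the dict standing in for a Translation file are
all made up for illustration, and the exact bytes that go into the hash
(the description text plus a trailing newline) are an assumption.

```python
import hashlib


def description_md5(description: str) -> str:
    # Assumption: the hash covers the UTF-8 description plus a trailing newline.
    return hashlib.md5((description + "\n").encode("utf-8")).hexdigest()


# Two versions of the hypothetical package foobar, e.g. in stable and testing.
stable_desc = "frobnicate bars\n Foobar frobnicates bars."
testing_desc = "metapackage for foo and bar\n Depends on foo and bar."

# A Translation file modelled as a dict: Description-md5 -> translated text.
# Only the stable description has been translated so far.
translation_de = {
    description_md5(stable_desc): "[de] frobnicate bars\n [de] Foobar frobnicates bars.",
}


def translated(description: str, translations: dict) -> str:
    # Fall back to the untranslated text if no translation matches the hash.
    return translations.get(description_md5(description), description)
```

Because the key depends only on the description text, a new upload of
foobar with an unchanged description keeps its translation, while the
changed description in testing simply has no match and is shown
untranslated.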

> *I* think missing long descs right in the middle of a dist-upgrade is of
> small enough impact, especially as we want people to update apt and
> friends first in a seperate step anyways and can be dealt with with a
> proper paragraph in the release notes, and its not bad enough for us to
> have a legacy file around for a year.
To be clear: The timeframe without long Descriptions in English would be
between the two "apt-get update" runs between which the new apt gets
installed. So the release notes should suggest running "apt-get update"
again after installing the new apt and friends, before proceeding with the
rest of the upgrade.

> download of Translation files
The current apt downloads (or at least tries to) the Translation file for
the language code in LC_MESSAGES -- mostly only the two-character code;
for a few (including en!) it tries the five-character code (e.g. en_GB).
So changing the format now without changing apt would have the effect
that in an English locale no long descriptions would be available. For
other locales only translated long descs would be available, and as most
translations are unfortunately far from complete, that would not be
many long descriptions.
But as I already said, there is a patch in the bakery for apt that allows
defining which languages should be downloaded and in which order descs
should be shown. So I would suggest downloading the locale's language (if
possible) and en, as this is the behavior now and I think most users would
want such a behavior. Nonetheless, a message for unavailable long descs
would be nice and should be trivial to implement. The only question here
is whether each individual program should print something on its own or
whether libapt-pkg (and the other libs) should add such a message...
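
The suggested default (the locale's language first, then en) could be
sketched roughly like this in Python. This is only an illustration, not
the patch itself: both function names are hypothetical, and the
long_codes tuple is a stand-in for apt's real list of languages that use
the long code.

```python
def translation_languages(lc_messages: str, long_codes=("pt_BR", "en_GB")) -> list:
    # Strip encoding and modifier: "de_DE.UTF-8@euro" -> "de_DE".
    locale = lc_messages.split(".")[0].split("@")[0]
    if locale in ("", "C", "POSIX"):
        return ["en"]
    # Use the long code only for languages on the (hypothetical) list.
    code = locale if locale in long_codes else locale.split("_")[0]
    langs = [code]
    if "en" not in langs:
        langs.append("en")
    return langs


def best_description(descs: dict, order: list):
    # descs maps a language code to a long description for one package.
    for lang in order:
        if lang in descs:
            return descs[lang]
    # Nothing available: the caller can print a "no long description" message.
    return None
```

Returning None instead of an empty string leaves the decision about the
message to the caller, which matches the open question of whether the
front end or the library should print it.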

The alternative to the partial Translation files would be the idea
from Raphaël Hertzog, which is optimal for the default but increases
the download volume for all other cases:
> Why not inject the english version in the Translation file for all missing
> translations? That way you download only one Translation file and you have
> the usual behaviour.
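
A rough Python sketch of that injection idea, assuming both files are
modelled as dicts keyed by Description-md5 (the function name and the
sample keys are made up for illustration):

```python
def inject_english(english: dict, translated: dict) -> dict:
    # Start from the English descriptions and overwrite every entry
    # for which a translation exists.
    return {**english, **translated}


english = {"md5-a": "long desc A", "md5-b": "long desc B"}
czech = {"md5-a": "[cs] long desc A"}
merged = inject_english(english, czech)
```

The merged file always has at least as many entries as the English one,
which is where the larger download volume for everyone not using the
default comes from.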

And last but not least, a question more or less only for the l10n team:
APT currently includes a (short) list of languages for which it doesn't
download the Translation file with the short code but with the long one,
e.g. for a pt_BR locale it would download the Translation-pt_BR file
instead of even trying to download Translation-pt. On the other hand,
Translation files like cs_CZ are never touched - apt only tries to download
the cs file. So what do you think: Should apt always try downloading long
and short, OR short only if long is not available? The problem here
would be (currently looking at the i18n directory for unstable main [0])
that e.g. the very small cs_CZ file would hide the larger cs file...
(btw: A suggestion for which whitelist should be used would also be good,
 e.g. I think it is unlikely that we get a de_?? in the future...)
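
The two options in the question above can be compared with a small
Python sketch. The function and its parameters are hypothetical, and the
set of available files is passed in by hand rather than read from an
archive.

```python
def candidates(locale: str, available: set, always_both: bool) -> list:
    short = locale.split("_")[0]
    # Prefer the long code, then the short one (if they differ).
    wanted = [locale, short] if locale != short else [short]
    found = [code for code in wanted if code in available]
    if always_both:
        return found       # download long and short
    return found[:1]       # short only if long is not available
```

With both cs and cs_CZ on the mirror, the second option returns only
cs_CZ, which is exactly the case where the small file hides the larger
one.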

Best regards / Mit freundlichen Grüßen,

David "DonKult" Kalnischkies

[0] http://ftp.debian.org/debian/dists/unstable/main/i18n/

P.S.: I hope I have cc'ed all lists, but feel free to add/fwd more.
It's my first time, so try to be nice if I made (many) mistakes... :)
