[Python-apps-team] Bug#703944: subdownloader: language model (*.lm) files shared with libexttextcat-data

Stuart Prescott stuart at debian.org
Tue Mar 26 00:10:45 UTC 2013


Package: subdownloader
Version: 2.0.14-1
Severity: wishlist

Hi!

The subdownloader package ships a set of language model (*.lm) files that it
uses to identify the language of a piece of text. These files were originally
imported from the OpenOffice.org project by the subdownloader upstream. The
same files are also in Debian as part of the libexttextcat-data package. The
original language models have been expanded and had bugs fixed over time,
while the copies in subdownloader have not.

Additionally, the language files in subdownloader appear to vary in
their encoding, which makes me wonder how encoding-aware subdownloader is --
see also #692241.
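
For anyone wanting to reproduce the encoding observation, here is a rough
sketch of one way to survey the files. It is not taken from subdownloader or
libexttextcat; the function names are made up for illustration, and the
directory to scan is left to the caller. It only distinguishes pure ASCII,
valid UTF-8 and everything else, so it cannot say which legacy encoding a
file actually uses.

```python
from pathlib import Path


def classify_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a byte string."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: most likely some legacy 8-bit encoding.
        return "other"


def survey_lm_files(directory):
    """Map each *.lm file below `directory` to its classification.

    `directory` is a placeholder for wherever the language models are
    unpacked; keys are paths relative to it.
    """
    root = Path(directory)
    return {
        str(path.relative_to(root)): classify_encoding(path.read_bytes())
        for path in sorted(root.rglob("*.lm"))
    }
```

Calling survey_lm_files() on the unpacked language model directory and
printing the result should show at a glance whether the files share a single
encoding.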

Perhaps it would be better for subdownloader to use the data files from
libexttextcat-data instead of shipping its own copies?

cheers
Stuart


