[Debian-l10n-devel] Looking at translate-toolkit 1.10

Stuart Prescott stuart at debian.org
Tue Mar 26 03:58:44 UTC 2013


Dear translate-toolkit developers,

Firstly, thanks for releasing a new and shiny translate toolkit -- wonderful 
work! In coercing the new version into Debian packages, I've found a few 
things I need to talk to you about. I assume I will have more to ask about in 
the future, but this looks like a sufficiently long list to be getting on with 
right now.

cheers
Stuart


** langmodels **

The location of the .lm files in the package is a little mysterious to me. They 
have been shipping in translate/share/langmodels within the python module 
namespace (they probably should be in /usr/share/translate or similar instead, 
but it's not too bad as it is). I'm not sure whether it is changes to setup.py 
or distutils, but they are now ending up in "share/langmodels" within the 
python module namespace. The changes to get_abs_data_filename() make me 
think that this is intended, although probably in "/usr/share/langmodels" and 
not within the module namespace. I'm unconvinced that translate-toolkit should 
take over "share", which is a very generic word in the python module namespace, 
so I will try to push them off into /usr/share instead... but I will obviously 
need to contemplate carefully how to move them and how to test that change.
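
To illustrate the kind of lookup being discussed, here is a minimal sketch of 
a data-file search that tries several candidate roots in order. The helper 
find_data_file() and the idea of passing a list of roots are hypothetical and 
are not taken from translate-toolkit's get_abs_data_filename(); the sketch 
only shows how a test could check both the old and the new location.

```python
import os


def find_data_file(relative_path, search_roots):
    """Return the first existing path for relative_path under search_roots.

    Hypothetical helper for illustration only; it is not the toolkit's
    get_abs_data_filename().
    """
    for root in search_roots:
        candidate = os.path.join(root, relative_path)
        if os.path.exists(candidate):
            return candidate
    raise IOError("data file not found: %s" % relative_path)
```

A packaging test could call this with the module directory first and 
/usr/share/translate second (or the reverse) to confirm which copy wins.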

Jakub Wilk pointed out that these lm files have traditionally had encoding 
problems and I can see that we have the same problem with (at least) the 
Polish lm file in translate-toolkit [1].

[1]	http://bugs.debian.org/703942
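
A quick way to spot this sort of encoding problem is to try decoding each 
line of a file and report the ones that fail. The sketch below assumes, 
purely for illustration, that UTF-8 is the expected encoding; the function 
name undecodable_lines() is made up for this example.

```python
def undecodable_lines(data, encoding="utf-8"):
    """Return the 1-based numbers of lines in data (bytes) that fail to decode."""
    bad = []
    for number, line in enumerate(data.splitlines(), 1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError:
            bad.append(number)
    return bad
```

Reading an lm file in binary mode and passing its contents to this function 
gives a list of suspect lines to inspect by hand.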

This led me to wonder if 
    (a) translate-toolkit should perhaps import newer language models from 
        upstream [2], 
    (b) translate-toolkit could make use of the code in libexttextcat itself 
        (there are of course sometimes advantages in competing 
        implementations), and 
    (c) the Debian packages could/should use the libexttextcat-data package 
        directly [3]. 

[2] http://cgit.freedesktop.org/libreoffice/libexttextcat/tree/langclass/LM
[3] http://bugs.debian.org/703943

Your thoughts on how we could proceed with that (in the long term) would be 
most welcome. Doing both (a) and (c) such that ttk follows upstream closely 
and the data is deduped seems reasonable from my naïve point of view (and (c) 
is much easier if (a) happens!).

Could you also confirm whether share/stoplist-en is the work of translate-
toolkit authors (and GPL'd) or whether it, like the lm files, has been imported 
from elsewhere? (it appeared as a largely complete file in svn r8066 but that 
doesn't tell me if it was just developed outside the VCS or if it has been 
borrowed)


** documentation **

The overhaul of your documentation system is huge. It's evident that quite a 
lot of work has gone into this. With such a large change, there are always 
going to be some minor niggles...

Is there a source file somewhere from which tbx_levels_structure.png was 
generated? It looks like the sort of line art that was created using a vector 
drawing tool (inkscape? libreoffice?) and not something where the PNG was drawn 
directly. It would be great if the preferred form for modification (such as an 
svg, xcf or odt file) were included in the source tree.

Is there some further source for docs/_themes/sphinx-bootstrap/ somewhere? In 
particular the embedded copy of a compressed jquery.js is problematic without 
source code. Your tarball and your documentation is explicitly released under 
the GPL, so you need to include the preferred form for modification (which is 
not the compressed javascript!) for the files you ship. For the time being I'm 
not sure if it's best to just revert to the "default" sphinx theme or to strip 
out that jquery and replace it with one from the Debian packages.

I also see the following errors when building the documentation:

reading sources... [  4%] api/misc
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/sphinx/ext/autodoc.py", line 321, in import_object
    __import__(self.modname)
  File "…/translate-toolkit/translate/misc/xmlwrapper.py", line 30, in <module>
    basicfixtag = ElementTree.fixtag
AttributeError: type object 'ElementTree' has no attribute 'fixtag'

I assume this is a known problem with ttk and python 2.7 [4], but I'm not sure 
yet if it's actually an issue or not for any real use of translate-toolkit. 
Thoughts?

[4] http://bugs.debian.org/644256
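
One possible way to stop a missing attribute from breaking the import is to 
look it up defensively. This is only a sketch of the general technique: it 
uses the standard library's xml.etree.ElementTree as a stand-in, which may 
not be the object that xmlwrapper.py actually imports, and it says nothing 
about whether the code that later uses basicfixtag would still work.

```python
from xml.etree import ElementTree

# Assumption: a missing "fixtag" should yield None instead of raising
# AttributeError at import time. The stdlib module is only a stand-in here.
basicfixtag = getattr(ElementTree, "fixtag", None)


def has_fixtag():
    """Report whether the optional fixtag hook was found."""
    return basicfixtag is not None
```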


reading sources... [  4%] api/search
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/sphinx/ext/autodoc.py", line 321, in import_object
    __import__(self.modname)
  File "…/translate-toolkit/translate/search/indexing/PyLuceneIndexer1.py", line 30, in <module>
    import PyLucene
ImportError: No module named PyLucene

This module won't be used on Debian because we've only got the newer lucene so 
I guess this isn't an issue.


reading sources... [  6%] api/storage
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/sphinx/ext/autodoc.py", line 321, in import_object
    __import__(self.modname)
  File "…/translate/storage/subtitles.py", line 43, in <module>
    from gaupol.subtitle import Subtitle
ImportError: No module named gaupol.subtitle

I have the python-aeidon package installed and all the import calls for aeidon 
modules in storage/subtitles.py succeed if I run them in an interactive python 
shell, so I'm not sure what is failing when sphinx is using this module.
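
To compare what the interactive shell sees with what sphinx sees, it can help 
to record which modules import cleanly under each interpreter. The helper 
below is a hypothetical diagnostic, not part of translate-toolkit or sphinx; 
it only relies on importlib.import_module() from the standard library.

```python
import importlib


def import_status(module_names):
    """Map each module name to None if it imports, else the error message."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = None
        except ImportError as exc:
            status[name] = str(exc)
    return status
```

Calling import_status(["aeidon", "gaupol.subtitle"]) once from the shell and 
once from the docs' conf.py would show whether the two environments differ.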

There are also a large number of sphinx warnings and a handful of errors 
generated when building the docs. They aren't fatal and I guess cleaning them 
up is on your todo list already (let me know if this is news to you and I can 
give you the output and look at some of them for you).


** manpages **

The ManHelpFormatter hasn't quite kept up with the transition to rst. It has 
always generated some slightly squiffy groff in its hyphens (I've got a patch 
that I will send you that solves that for options), but it is also now letting 
through some constructs like "..note" which will leave man very confused. I'll 
see if there's a way of filtering those to something suitable. I'll forward you 
some patches as soon as I've got something that isn't reimplementing all of 
rst2man.
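
As a rough idea of the filtering involved, the sketch below rewrites rst 
directive markers into plain labels and escapes the hyphens that start 
option-like tokens so that groff treats them as literal minus signs. The 
function rst_to_man_text() is hypothetical and is not the patch mentioned 
above; it handles only single-line directives and leaves hyphens inside 
words alone.

```python
import re

# Matches a directive marker such as ".. note::" (or "..note::") at line start.
_DIRECTIVE = re.compile(r"^[ \t]*\.\.[ \t]*(\w+)::[ \t]*", re.MULTILINE)
# Matches one or two hyphens that begin an option-like token, e.g. -v, --help.
_OPTION_HYPHENS = re.compile(r"(?<![\w\\-])(-{1,2})(?=\w)")


def rst_to_man_text(text):
    """Replace directive markers with labels and escape option hyphens."""
    text = _DIRECTIVE.sub(lambda m: m.group(1).capitalize() + ": ", text)
    return _OPTION_HYPHENS.sub(lambda m: "\\-" * len(m.group(1)), text)
```

Anything more elaborate (multi-line directives, inline roles) quickly turns 
into a reimplementation of rst2man, which is the concern raised above.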


** licence **

In your source files you are quite clear about the licence being GNU GPL v2 or 
any later version but in README.rst and docs/license.rst it just says "GPL" 
without saying which GPL (GNU GPL? Nethack GPL? v2+? v3 only?). Perhaps a link 
to the GPL online and/or mention of the COPYING file would be appropriate too.

-- 
Stuart Prescott    http://www.nanonanonano.net/   stuart at nanonanonano.net
Debian Developer   http://www.debian.org/         stuart at debian.org
GPG fingerprint    BE65 FD1E F4EA 08F3 23D4 3C6D 9FE8 B8CD 71C5 D1A8
GPG fingerprint    90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7