[Debian-l10n-devel] Bug#658621: translate-toolkit: pocommon.py raises UnicodeEncodeError on UTF-8 encoded translator comments

Stuart Prescott stuart+debian at nanonanonano.net
Sat Feb 4 16:57:27 UTC 2012


Package: translate-toolkit
Version: 1.9.0-1
Severity: normal
Tags: upstream patch

Hi!

The translate-toolkit's po reading code raises a UnicodeEncodeError on
translator comments that contain UTF-8 encoded (non-ASCII) characters. A fix
has been committed to upstream svn, but there is no ETA for the next release.

  http://bugs.locamotion.org/show_bug.cgi?id=1951

Several of the example terminology files shipped with pootle include such
translator comments, so fixing this is a prerequisite for being able to ship
a pootle package.

The attached patch backports this fix to the translate-toolkit currently
in Debian. I have been using a package containing this patch in testing out
the current pootle packaging.
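
As a rough illustration of the failure (a sketch, not the upstream code):
under Python 2, calling .decode('utf-8') on a value that is already a unicode
object first encodes it implicitly with the ASCII codec, and that implicit
step is what raises UnicodeEncodeError when a non-ASCII character is present.
The snippet below emulates that step explicitly so that it runs on Python 3;
the names py2_style_decode and unquote_plus_patched are hypothetical and exist
only for this example.

```python
from urllib.parse import unquote_plus


def py2_style_decode(text):
    """Emulate Python 2's unicode.decode('utf-8') (assumption: ASCII default).

    Python 2 implicitly encoded the unicode object with the ASCII codec
    before decoding it, so non-ASCII input fails in the *encode* step.
    """
    return text.encode('ascii').decode('utf-8')


def unquote_plus_patched(text):
    """Hypothetical stand-in for the patched helper in pocommon.py."""
    try:
        return py2_style_decode(unquote_plus(text))
    except UnicodeEncodeError:
        # Non-ASCII character present: assume the text was already decoded
        # when the file was read, and hand it back untouched.
        return text


# Plain ASCII input is unquoted as before.
print(unquote_plus_patched('%7e/abc+def'))

# Input that already contains a non-ASCII character no longer raises.
print(unquote_plus_patched('caf\u00e9+au+lait'))
```

Note that in the fallback case the input is returned exactly as given, so any
'+' or '%xx' sequences in it are left unquoted; this mirrors the "return text"
in the patch.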

cheers
Stuart
-------------- next part --------------
Description: Handle errors in the po encoding
Origin: http://translate.svn.sourceforge.net/viewvc/translate/src/trunk/translate/storage/pocommon.py?view=patch&r1=17737&r2=17736&pathrev=17737
Bug: http://bugs.locamotion.org/show_bug.cgi?id=1951
Author: Stuart Prescott <stuart+debian at nanonanonano.net>

Based on the patch applied to upstream svn. It should no longer be needed
once the release following 1.9.0 is packaged.

--- a/translate/storage/pocommon.py
+++ b/translate/storage/pocommon.py
@@ -47,7 +47,12 @@
 
 def unquote_plus(text):
     """unquote('%7e/abc+def') -> '~/abc def'"""
-    return urllib.unquote_plus(text).decode('utf-8')
+    try:
+        return urllib.unquote_plus(text).decode('utf-8')
+    except UnicodeEncodeError, e:
+        # for some reason there is a non-ascii character here. Let's assume it
+        # is already unicode (because of originally decoding the file)
+        return text
 
 
 class pounit(base.TranslationUnit):

