Bug#586021: python-debian: patch for #586021

sean finney seanius at debian.org
Sat Jun 19 10:36:09 UTC 2010

Package: python-debian
Version: 0.1.16
Severity: normal

attached is a patch that does something similar to what i was suggesting
earlier.  i don't claim that it covers every vector for this problem: for
example, multi-valued fields might technically still be affected.  but i
don't know of any such fields that might have mixed latin-1 and utf-8
values, and it's kinda a corner case to begin with, so i'm trying to keep
the patch as non-intrusive as possible.  the patch could be trivially
extended for this later if it were necessary, i just don't want to get my
paws dirty.

basically, i add an optional 'encoding' parameter to the __getitem__
mixin function, which by default keeps the previous behavior but can be
overridden to supply an alternate encoding.  beyond that, there are just
a few points where the encoding parameter from dump() has to be carried
through to reach the place where the underlying method is called.
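
as a rough illustration of that idea (not the patch itself): the sketch
below uses a hypothetical MiniDeb822 class, invented here, and is written
for python 3 (bytes/str) rather than the python 2.5 code the patch targets.
it only shows how an optional 'encoding' argument might be threaded from
dump() through get_as_string() down to __getitem__; the real dump() writes
to a file object and does more than this.

```python
import io


class MiniDeb822(object):
    """Hypothetical stand-in for Deb822Dict; not the real python-debian class."""

    def __init__(self, fields, encoding="utf-8"):
        self.encoding = encoding
        self._fields = dict(fields)  # raw values may still be undecoded bytes

    def __getitem__(self, key, encoding=None):
        value = self._fields[key]
        if isinstance(value, bytes):
            # Fall back to the object's own encoding when no override is given.
            value = value.decode(encoding or self.encoding)
        return value

    def get_as_string(self, key, encoding=None):
        # Pass the override down to the lookup that does the decoding.
        return self.__getitem__(key, encoding=encoding)

    def dump(self, encoding=None):
        out = io.StringIO()
        for key in sorted(self._fields):
            value = self.get_as_string(key, encoding=encoding)
            out.write("%s: %s\n" % (key, value))
        return out.getvalue()


# A latin-1 encoded value stored in an object whose default is utf-8.
paragraph = MiniDeb822({"Maintainer": "Jos\xe9".encode("latin-1")})
latin1_dump = paragraph.dump(encoding="latin-1")
```

the explicit __getitem__ call is needed because the obj[key] syntax has no
way to pass a second argument; with the default encoding, the same dump()
call would raise UnicodeDecodeError on this value.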

i thought this was just a bit better than another option that
came to mind (setting the object encoding temporarily and then setting
it back).  this latter option would make for a simpler patch but is both
aesthetically poor and technically kinda sketchy, so i opted for the former.
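
for comparison, the rejected alternative could look roughly like the
sketch below.  Paragraph and temporary_encoding are hypothetical names made
up for this illustration (python 3 again), not anything in python-debian.
it swaps the object's encoding attribute for the duration of a block and
restores it afterwards.

```python
from contextlib import contextmanager


class Paragraph(object):
    """Hypothetical object with an 'encoding' attribute, for illustration only."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw
        self.encoding = encoding

    def text(self):
        return self.raw.decode(self.encoding)


@contextmanager
def temporary_encoding(obj, encoding):
    saved = obj.encoding
    obj.encoding = encoding
    try:
        yield obj
    finally:
        # Restore even if decoding raised, otherwise the override would leak.
        obj.encoding = saved


p = Paragraph("caf\xe9".encode("latin-1"))
with temporary_encoding(p, "latin-1"):
    inside = p.text()
after = p.encoding
```

this works, but it mutates shared state on the object: anything else that
reads p.encoding while the block is active sees the temporary value, which
is one way to read the "sketchy" part.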

i've tested this on Sources files from etch through sid, and have had no problems.


-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.34-rc5minime-00802-g48f4092 (SMP w/2 CPU cores)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages python-debian depends on:
ii  python                        2.5.4-9    An interactive high-level object-o
ii  python-support                1.0.8      automated rebuilding support for P

Versions of packages python-debian recommends:
ii  python-apt                    0.7.95     Python interface to libapt-pkg

Versions of packages python-debian suggests:
ii  gpgv                          1.4.10-3   GNU privacy guard - signature veri

-- no debconf information

-- debsums errors found:
debsums: changed file /usr/share/pyshared/debian/deb822.py (from python-debian package)

-------------- next part --------------
--- /usr/lib/pymodules/python2.5/debian/deb822.py.orig	2010-06-19 12:11:09.000000000 +0200
+++ /usr/lib/pymodules/python2.5/debian/deb822.py	2010-06-19 12:21:57.000000000 +0200
@@ -164,7 +164,7 @@
         self.__dict[key] = value
-    def __getitem__(self, key):
+    def __getitem__(self, key, encoding=None):
         key = _strI(key)
             value = self.__dict[key]
@@ -176,7 +176,10 @@
         if isinstance(value, str):
             # Always return unicode objects instead of strings
-            value = value.decode(self.encoding)
+            object_encoding = encoding
+            if not object_encoding:
+                object_encoding = self.encoding
+            value = value.decode(object_encoding)
         return value
     def __delitem__(self, key):
@@ -352,14 +355,17 @@
     # __repr__ is handled by Deb822Dict
-    def get_as_string(self, key):
+    def get_as_string(self, key, encoding=None):
         """Return the self[key] as a string (or unicode)
         The default implementation just returns unicode(self[key]); however,
         this can be overridden in subclasses (e.g. _multivalued) that can take
         special values.
-        return unicode(self[key])
+        if not encoding:
+            return unicode(self[key])
+        else:
+            return unicode(self.__getitem__(key, encoding=encoding))
     def dump(self, fd=None, encoding=None):
         """Dump the the contents in the original format
@@ -384,7 +390,7 @@
             encoding = self.encoding
         for key in self.iterkeys():
-            value = self.get_as_string(key)
+            value = self.get_as_string(key, encoding=encoding)
             if not value or value[0] == '\n':
                 # Avoid trailing whitespace after "Field:" if it's on its own
                 # line or the value is empty
@@ -873,7 +879,7 @@
             for line in filter(None, contents.splitlines()):
                 updater_method(Deb822Dict(zip(fields, line.split())))
-    def get_as_string(self, key):
+    def get_as_string(self, key, encoding=None):
         keyl = key.lower()
         if keyl in self._multivalued_fields:
             fd = StringIO.StringIO()
@@ -901,7 +907,7 @@
             return fd.getvalue().rstrip("\n")
-            return Deb822.get_as_string(self, key)
+            return Deb822.get_as_string(self, key, encoding)
