Bug#586021: python-debian: deb822.Sources cannot handle Sources file with mixed data

John Wright jsw at debian.org
Tue Jun 15 20:40:33 UTC 2010

On Tue, Jun 15, 2010 at 09:20:31PM +0200, sean finney wrote:
> Package: python-debian
> Version: 0.1.16
> Severity: important
> I was updating the codebase for the Debian patch tracker, and have stumbled
> across what I believe is a regression.  Now that python-debian uses unicode
> internally (since 0.1.15 it seems), if a Sources file contains both utf-8
> and latin-1 encoded maintainer names (like the etch Sources file does),
> then it seems impossible to produce output from the resulting Sources instance.

Ah, yuck. :(

I can think of two possible solutions:

  * Accept 'raw' as a Deb822 constructor encoding argument, or add a
    raw_strings keyword argument, that turns off the unicode behavior
    - Con: old code still breaks with mixed data - you have to change
      your code to use the new constructor argument
    - Pro: most consistent results (raw strings are only returned if you
      explicitly ask for them)
  * Wrap the unicode decoding in try/except, and fall back to the raw
    string if decoding fails
    - Con: less consistent results than the option above (callers may
      get either unicode or raw strings back)
    - Pro: old code works out-of-box with mixed data
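
To make the trade-off concrete, here is a rough sketch of how the two
behaviours differ for a single field value. The helper names
(decode_explicit, decode_with_fallback) and the sample maintainer name are
made up for illustration and are not python-debian API; the raw_strings
keyword is only the one proposed above. The sketch uses Python 3 bytes/str
in place of the str/unicode pair.

```python
def decode_explicit(raw, encoding="utf-8", raw_strings=False):
    """Option 1 (hypothetical helper): raw bytes only on explicit request."""
    if raw_strings:
        # Caller opted out of decoding, so the bytes pass through untouched.
        return raw
    # Otherwise decoding is strict: mixed data still raises UnicodeDecodeError.
    return raw.decode(encoding)


def decode_with_fallback(raw, encoding="utf-8"):
    """Option 2 (hypothetical helper): decode, fall back to the raw bytes."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Decoding failed, so hand back the original bytes unchanged.
        return raw


# The same (invented) maintainer name in the two encodings a mixed
# Sources file might contain.
utf8_name = "Jörg Müller".encode("utf-8")
latin1_name = "Jörg Müller".encode("latin-1")

for name in (utf8_name, latin1_name):
    # Option 2 never raises, but the result type depends on the data.
    print(type(decode_with_fallback(name)).__name__)
```

With the first helper the caller always knows which type comes back, at the
cost of having to pass raw_strings=True for mixed files; with the second,
existing callers keep working but have to cope with either type.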

Which one do you think makes more sense?

John Wright <jsw at debian.org>
