Changes in 0.1.15 and how to use same code on stable machines
John Wright
jsw at debian.org
Mon Apr 12 08:19:00 UTC 2010
(Including the list this time.)
On Sun, Apr 11, 2010 at 11:15:55AM +0200, Andreas Tille wrote:
> Hi John,
>
> On Sat, Apr 10, 2010 at 02:37:01PM -0600, John Wright wrote:
> > How about
> >
> > printstring = stanza[field].decode('utf-8')
> >
> > As far as I have tested, both str and unicode have a decode method, and
> > it looks like unicode doesn't really care about what argument you give
> > it.
>
> I have tried this previously and it also fails with
>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 42: ordinal not in range(128)
Ah, bummer. I was trying with a unicode string that contained only
ascii characters...
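For the record, my understanding of why that fails: in Python 2, calling
decode() on a unicode object first encodes it to a byte string with the
default codec (ascii), and only then decodes. The sketch below performs
that encode step explicitly, so it runs on any Python version. The
function and variable names are made up for illustration.

```python
# Sketch only: names here are illustrative, not part of python-debian.

ascii_only = u'plain text'
with_umlaut = u'sch\xf6n'   # contains u'\xf6', like the failing stanza


def implicit_encode_ok(text):
    """Mimic the ASCII encode that Python 2 performs before unicode.decode()."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


print(implicit_encode_ok(ascii_only))   # True
print(implicit_encode_ok(with_umlaut))  # False
```

That is why my ascii-only test passed while your real data did not.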
> > The other option is checking whether the object is unicode or str type,
> > but I think the above works, and is cleaner.
>
> I would really like to have a clean solution but this does fail as well
> and I admit all these encoding issues are by far the most frustrating
> issues and are consuming about half the debugging time when dealing with
> non-ASCII content. :-(
>
> I'd be happy about any other suggestion
Well, it looks like you'll have to change every usage of stanza[field]
anyway, so how about this helper function:
def to_unicode(value, encoding='utf-8'):
    if isinstance(value, str):
        return value.decode(encoding)
    else:
        return unicode(value)
Then, everywhere you would have previously used something like
printstring = unicode(stanza[field], 'utf-8')
instead use
printstring = to_unicode(stanza[field], 'utf-8')
(or you can omit the 'utf-8' argument). It's still not perfectly
elegant, but it looks and should behave just like your original code
and also work with python-debian >= 0.1.15.
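If it helps, here is a rough, self-contained variant of the helper with
a quick check of both input types. It is only a sketch: the text_type
and bytes_type aliases are my own assumption, added so the same file
also runs on Python 3, and they are not part of python-debian.

```python
import sys

# Hypothetical aliases (not from python-debian): pick the text and byte
# string types of the running interpreter.
if sys.version_info[0] >= 3:
    text_type, bytes_type = str, bytes
else:
    text_type, bytes_type = unicode, str  # Python 2 names


def to_unicode(value, encoding='utf-8'):
    """Return value as text, decoding only when it is a byte string."""
    if isinstance(value, bytes_type):
        return value.decode(encoding)
    return text_type(value)


raw = b'sch\xc3\xb6n'   # a UTF-8 encoded byte string
text = u'sch\xf6n'      # the same word, already decoded

# Both inputs come out as the same text value.
assert to_unicode(raw) == text
assert to_unicode(text) == text
```

The isinstance check is what keeps already-decoded values from going
through the implicit ascii encode.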
--
John Wright <jsw at debian.org>