[Python-modules-team] Bug#594146: namebench: benchmark stops with UnicodeDecodeError

Stefano Rivera stefanor at debian.org
Sat Apr 23 20:26:51 UTC 2011


Hi Konrad (2010.08.24_01:53:16_+0200)
> UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6 in position 5: invalid start byte
> Showing popup: 'utf8' codec can't decode byte 0xf6 in position 5: invalid start byte
> > 'utf8' codec can't decode byte 0xf6 in position 5: invalid start byte

My guess was that this came from the GeoIP lookups. The Google one
returns JSON and *is* UTF-8 encoded. However, maxmind seems to be using
ISO-8859-1, which namebench attempts to treat as UTF-8:

From my Hetzner.de box in Nürnberg (a lucky location):

>>> import urllib
>>> result = urllib.urlopen('http://j.maxmind.com/app/geoip.js').read()
>>> print repr(result)
"function geoip_country_code() { return 'DE'; }\nfunction geoip_country_name() { return 'Germany'; }\nfunction geoip_city()         { return 'N\xfcrnberg'; }\nfunction geoip_region()       { return '02'; }\nfunction geoip_region_name()  { return 'Bayern'; }\nfunction geoip_latitude()     { return '49.4478'; }\nfunction geoip_longitude()    { return '11.0683'; }\nfunction geoip_postal_code()  { return ''; }\nfunction geoip_area_code()    { return ''; }\nfunction geoip_metro_code()   { return ''; }\n"
>>> result.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfc in position 140: invalid start byte
>>> print result.decode('ISO-8859-1')
function geoip_country_code() { return 'DE'; }
function geoip_country_name() { return 'Germany'; }
function geoip_city()         { return 'Nürnberg'; }
function geoip_region()       { return '02'; }
function geoip_region_name()  { return 'Bayern'; }
function geoip_latitude()     { return '49.4478'; }
function geoip_longitude()    { return '11.0683'; }
function geoip_postal_code()  { return ''; }
function geoip_area_code()    { return ''; }
function geoip_metro_code()   { return ''; }
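
One possible way to cope with this, as a rough sketch rather than a patch against namebench: try UTF-8 first and fall back to ISO-8859-1 when that fails. The helper name decode_geoip_body and its encodings parameter are made up for illustration and are not part of namebench. The sample bytes are copied from the geoip_city line of the response above.

```python
def decode_geoip_body(raw, encodings=("utf-8", "iso-8859-1")):
    """Decode raw response bytes, trying each candidate encoding in order.

    Hypothetical helper, not namebench code.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # ISO-8859-1 maps every byte value, so this line is only reached when
    # the caller passes a list of encodings that does not include it.
    return raw.decode("utf-8", "replace")


# Bytes as returned by maxmind in the session above (0xfc is ISO-8859-1 "ü").
sample = b"function geoip_city()         { return 'N\xfcrnberg'; }\n"
decoded = decode_geoip_body(sample)
```

The order matters: ISO-8859-1 never raises, so it has to come last, and text that really is UTF-8 (like the Google response) is still decoded as UTF-8.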

SR

-- 
Stefano Rivera
  http://tumbleweed.org.za/
  H: +27 21 465 6908 C: +27 72 419 8559  UCT: x3127




