[Python-modules-team] Bug#891725: This isn't even a scrapy bug.

Ian Turner vectro at vectro.org
Thu Jun 28 02:24:41 BST 2018


This failing test looks like an issue with Python itself rather than
with Scrapy. I suggest simply changing the test to match Python's behavior.

This code calls through to w3lib.encoding.to_unicode, which just boils
down to this:

b"\xef\xbb\xbfWORD\xe3\xab".decode('utf-8', 'replace')

Running that expression directly gives the same results as the test:

On Python 2:
>>> b"\xef\xbb\xbfWORD\xe3\xab".decode('utf-8', 'replace')
u'\ufeffWORD\ufffd'

On Python 3:
>>> b"\xef\xbb\xbfWORD\xe3\xab".decode('utf-8', 'replace')
'\ufeffWORD�'
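
As a rough sketch of what an updated test could check, the snippet below
runs the same decode and asserts the result shown for Python 3 above. The
helper name decode_with_replacement and the constant RAW are made up for
illustration; they are not part of Scrapy or w3lib, and the real test may
be structured differently.

```python
# Hypothetical helper, not part of Scrapy or w3lib: it only mirrors the
# decode call quoted above.
RAW = b"\xef\xbb\xbfWORD\xe3\xab"


def decode_with_replacement(raw: bytes) -> str:
    # 'replace' substitutes U+FFFD for bytes that are not valid UTF-8.
    return raw.decode("utf-8", "replace")


decoded = decode_with_replacement(RAW)

# The BOM is kept as U+FEFF, and the truncated trailing sequence
# \xe3\xab becomes a single U+FFFD, as in the Python 3 output above.
assert decoded == "\ufeffWORD\ufffd"
```

An assertion along these lines would pin the test to what the interpreter
actually returns, rather than to an expectation about how many replacement
characters a truncated sequence should produce.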

This bug is keeping python3-scrapy out of testing. Can we just update
the test to accept this behavior?


