[Python-apps-team] Bug#845987: planet-venus: Fails with html5lib 0.999999999-1
Jakob Haufe
sur5r at sur5r.net
Sun Nov 27 14:04:39 UTC 2016
Package: planet-venus
Version: 0~git9de2109-4
Severity: important
Tags: patch
Dear Maintainer,
After updating python-html5lib to 0.999999999-1, planet-venus fails
with the following error, first logged by planet.runner and then raised
as an uncaught traceback:
ERROR:planet.runner:TypeError: __init__() got an unexpected keyword argument 'encoding'
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/planet/spider.py", line 484, in spiderPlanet
    writeCache(uri, feed_info, data)
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/planet/spider.py", line 293, in writeCache
    reconstitute.source(xdoc.documentElement,data.feed,data.bozo,format)
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 240, in source
    content(xsource, 'subtitle', source.get('subtitle_detail',None), bozo)
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 170, in content
    html = parser.parse(xdiv % detail.value, encoding="utf-8")
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 235, in parse
    self._parse(stream, False, None, *args, **kwargs)
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 85, in _parse
    self.tokenizer = _tokenizer.HTMLTokenizer(stream, parser=self, **kwargs)
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/html5lib/_tokenizer.py", line 36, in __init__
    self.stream = HTMLInputStream(stream, **kwargs)
ERROR:planet.runner: File "/usr/lib/python2.7/dist-packages/html5lib/_inputstream.py", line 151, in HTMLInputStream
    return HTMLBinaryInputStream(source, **kwargs)
Traceback (most recent call last):
  File "/usr/bin/planet", line 143, in <module>
    doc = splice.splice()
  File "/usr/lib/python2.7/dist-packages/planet/splice.py", line 84, in splice
    reconstitute.source(xdoc.documentElement, data.feed, None, None)
  File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 240, in source
    content(xsource, 'subtitle', source.get('subtitle_detail',None), bozo)
  File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 170, in content
    html = parser.parse(xdiv % detail.value, encoding="utf-8")
  File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 235, in parse
    self._parse(stream, False, None, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 85, in _parse
    self.tokenizer = _tokenizer.HTMLTokenizer(stream, parser=self, **kwargs)
  File "/usr/lib/python2.7/dist-packages/html5lib/_tokenizer.py", line 36, in __init__
    self.stream = HTMLInputStream(stream, **kwargs)
  File "/usr/lib/python2.7/dist-packages/html5lib/_inputstream.py", line 151, in HTMLInputStream
    return HTMLBinaryInputStream(source, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'encoding'
Fixing the 'encoding' argument exposes a second error, this time in the
sanitizer. See [1] and [2].
The attached patch fixes both and makes planet-venus work again. It should
probably be incorporated into
debian/patches/html5lib-no_XHTMLSerializer.patch.
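For illustration only (this is not the attached patch): the traceback shows
that parser.parse() in this html5lib version no longer accepts an 'encoding'
keyword. One way to avoid the keyword altogether is to decode byte strings
before parsing and pass the parser text only. The sketch below assumes that
approach; the helper names to_text and parse_text are hypothetical, and the
parse callable is passed in so the snippet does not depend on any particular
html5lib version. It does not address the sanitizer error.

```python
def to_text(markup, encoding="utf-8"):
    """Return markup as text, decoding it first if it is a byte string.

    Hypothetical helper, not part of planet-venus or html5lib.
    """
    if isinstance(markup, bytes):
        # Undecodable bytes become U+FFFD instead of raising an exception.
        return markup.decode(encoding, "replace")
    return markup


def parse_text(parse, markup, encoding="utf-8"):
    """Call parse() with decoded text and no encoding keyword.

    `parse` is any callable shaped like HTMLParser.parse(stream), for
    example the bound method parser.parse seen in the traceback above.
    """
    return parse(to_text(markup, encoding))
```

With this, the failing call in reconstitute.py could hypothetically become
parse_text(parser.parse, xdiv % detail.value), assuming the formatted string
is either text or UTF-8 encoded bytes.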
Cheers,
sur5r
[1] https://github.com/html5lib/html5lib-python/issues/277
[2] https://github.com/html5lib/html5lib-python/issues/72
-------------- next part --------------
A non-text attachment was scrubbed...
Name: html5lib-stretch.patch
Type: text/x-diff
Size: 1740 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/python-apps-team/attachments/20161127/546b24ee/attachment.patch>