[Python-apps-team] Bug#845987: planet-venus: Fails with html5lib 0.999999999-1

Jakob Haufe sur5r at sur5r.net
Sun Nov 27 14:04:39 UTC 2016


Package: planet-venus
Version: 0~git9de2109-4
Severity: important
Tags: patch

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Dear Maintainer,

After updating python-html5lib to 0.999999999-1, planet-venus fails
with:

ERROR:planet.runner:TypeError: __init__() got an unexpected keyword argument 'encoding'
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/planet/spider.py", line 484, in spiderPlanet
    writeCache(uri, feed_info, data)
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/planet/spider.py", line 293, in writeCache
    reconstitute.source(xdoc.documentElement,data.feed,data.bozo,format)
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 240, in source
    content(xsource, 'subtitle', source.get('subtitle_detail',None), bozo)
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 170, in content
    html = parser.parse(xdiv % detail.value, encoding="utf-8")
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 235, in parse
    self._parse(stream, False, None, *args, **kwargs)
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 85, in _parse
    self.tokenizer = _tokenizer.HTMLTokenizer(stream, parser=self, **kwargs)
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/html5lib/_tokenizer.py", line 36, in __init__
    self.stream = HTMLInputStream(stream, **kwargs)
ERROR:planet.runner:  File "/usr/lib/python2.7/dist-packages/html5lib/_inputstream.py", line 151, in HTMLInputStream
    return HTMLBinaryInputStream(source, **kwargs)
Traceback (most recent call last):
  File "/usr/bin/planet", line 143, in <module>
    doc = splice.splice()
  File "/usr/lib/python2.7/dist-packages/planet/splice.py", line 84, in splice
    reconstitute.source(xdoc.documentElement, data.feed, None, None)
  File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 240, in source
    content(xsource, 'subtitle', source.get('subtitle_detail',None), bozo)
  File "/usr/lib/python2.7/dist-packages/planet/reconstitute.py", line 170, in content
    html = parser.parse(xdiv % detail.value, encoding="utf-8")
  File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 235, in parse
    self._parse(stream, False, None, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/html5lib/html5parser.py", line 85, in _parse
    self.tokenizer = _tokenizer.HTMLTokenizer(stream, parser=self, **kwargs)
  File "/usr/lib/python2.7/dist-packages/html5lib/_tokenizer.py", line 36, in __init__
    self.stream = HTMLInputStream(stream, **kwargs)
  File "/usr/lib/python2.7/dist-packages/html5lib/_inputstream.py", line 151, in HTMLInputStream
    return HTMLBinaryInputStream(source, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'encoding'

Fixing this exposes a second error, this time in the sanitizer; see [1] and [2].

The attached patch makes planet-venus work again. It should probably be
incorporated into debian/patches/html5lib-no_XHTMLSerializer.patch.
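
For illustration only, here is one way the 'encoding' failure above could
be avoided. It is a minimal sketch and not necessarily what the attached
patch does: the caller decodes byte strings itself, so parse() never
receives an `encoding` keyword. The names `parse_without_encoding` and
`strict_parse` are hypothetical; `strict_parse` is only a stand-in that
mimics the TypeError from the traceback and is not the html5lib API.

```python
def parse_without_encoding(parse, markup, encoding="utf-8"):
    """Call parse() on text, decoding byte strings beforehand.

    `parse` is any callable taking the markup as its first argument;
    in reconstitute.py this would presumably be `parser.parse`.
    """
    # Decode here so the parser never needs an `encoding` keyword.
    if isinstance(markup, bytes):
        markup = markup.decode(encoding)
    return parse(markup)


def strict_parse(stream, **kwargs):
    """Stand-in parser that rejects `encoding`, like the traceback above."""
    if "encoding" in kwargs:
        raise TypeError(
            "__init__() got an unexpected keyword argument 'encoding'")
    return stream


# Byte input is decoded first, so strict_parse() sees text only.
result = parse_without_encoding(strict_parse, b"<div>caf\xc3\xa9</div>")
```

Decoding up front avoids having to catch TypeError around the parse()
call, which could hide unrelated errors.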

Cheers,
sur5r

[1] https://github.com/html5lib/html5lib-python/issues/277
[2] https://github.com/html5lib/html5lib-python/issues/72



-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEe/X2rDZDH11A3BN6TPKyGPVNrj0FAlg65/MACgkQTPKyGPVN
rj1l6BAAqQyCb4TzzZ5ueiBhp5OTY7U5z+8SP4rquuD+4bMaSq6sZuDkwH/mk71E
+rXt5/EsUezRoIjvmRpOlP/1ANDNnidhoxz7OttHBiRWZQUZ/QG6HlSF4t3BOOUY
J87zTwMJJC0aM2CRod5K30EUX2eDnmbrEyMJ5DqL2aSl+V8I7tH+9ttTK7myeW25
C0y8S2D3GWCn3pjMh3PsKk6zEkX+3niERpXfXNHytlrYuBEJI4hG9xi6g7sHN9ds
dhaiopTbUonEQhHkpzKwmPc08IcMvwO/xTCecrtsiTGs1wRi5I7uxmRwySljVzDS
AuIm3cEz/Qy8SzDkDc7eWYrk7LxYE2vcJ4PZlNy75sSWoDsq0LYbmcHQq7vtrHhd
dlctzLSEx9v0MUtNcjz6iCCdFBnVdJS3VTLjCqmlt4p1c0LgbeZeuokmIhIb3s/Q
kClegb1wcuqcw3PKxMjZdUWEg7/gh84aDf/d2kb2+r+B54XXhysQM9eXpTPm24Hx
ushQZ99At/mxFEbY1UmlvUmMjfNdEV402riDUlKUGR7f+10dWvxY2cRRSZc+fXGj
cmAeT8xZa8aAZ2ou9Qmq/8/ixK9ez+A0VFgKBV69wqPzQx2fG3Omy3AY+/encjGp
cjF0QqpbRc5fswiNI9e7Y5b2E2R1kiSo6qduSB323ejYf0tQHAI=
=Lnir
-----END PGP SIGNATURE-----
-------------- next part --------------
A non-text attachment was scrubbed...
Name: html5lib-stretch.patch
Type: text/x-diff
Size: 1740 bytes
Desc: not available
URL: <http://lists.alioth.debian.org/pipermail/python-apps-team/attachments/20161127/546b24ee/attachment.patch>

