[Python-modules-team] Bug#639390: python-html5lib: lxml builder: ValueError: Invalid attribute name
Jakub Wilk
jwilk at debian.org
Fri Aug 26 17:41:03 UTC 2011
Package: python-html5lib
Version: 0.90-2
>>> import html5lib
>>> html5lib.parse('<div><div><a/</div></div>\n', treebuilder='lxml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 38, in parse
    return p.parse(doc, encoding=encoding)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 211, in parse
    parseMeta=parseMeta, useChardet=useChardet)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 111, in _parse
    self.mainLoop()
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 176, in mainLoop
    self.phase.processSpaceCharacters(token)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 952, in processSpaceCharacters
    self.tree.reconstructActiveFormattingElements()
  File "/usr/lib/pymodules/python2.7/html5lib/treebuilders/_base.py", line 181, in reconstructActiveFormattingElements
    clone = entry.cloneNode() #Mainly to get a new copy of the attributes
  File "/usr/lib/pymodules/python2.7/html5lib/treebuilders/etree.py", line 136, in cloneNode
    element.attributes[name] = value
  File "lxml.etree.pyx", line 2145, in lxml.etree._Attrib.__setitem__ (src/lxml/lxml.etree.c:46818)
  File "apihelpers.pxi", line 558, in lxml.etree._setAttributeValue (src/lxml/lxml.etree.c:15734)
  File "apihelpers.pxi", line 1554, in lxml.etree._attributeValidOrRaise (src/lxml/lxml.etree.c:24197)
ValueError: Invalid attribute name u'<'
Funnily enough, the problem goes away if I remove the trailing newline.
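
The traceback shows the attribute name '<' being rejected when cloneNode copies attributes onto an lxml element, and the call chain passes through processSpaceCharacters, which may be why the trailing newline matters. As a rough illustration of the kind of name check involved, here is a minimal sketch that needs neither html5lib nor lxml. The regular expression is a simplified, ASCII-only approximation of the XML Name rule, and both helper names (is_valid_attribute_name, drop_invalid_attributes) and the sample attribute dict are hypothetical, not part of either library.

```python
import re

# Simplified, ASCII-only approximation of the XML "Name" production.
# Assumption: the real rule also admits many non-ASCII characters, and
# this is not lxml's actual implementation.
_XML_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_valid_attribute_name(name):
    """Return True if `name` looks like a legal XML attribute name."""
    return _XML_NAME.fullmatch(name) is not None


def drop_invalid_attributes(attributes):
    """Return a copy of `attributes` without names an XML tree would reject."""
    return {
        name: value
        for name, value in attributes.items()
        if is_valid_attribute_name(name)
    }


# Hypothetical attribute dict of the kind a tokenizer might produce
# for the malformed '<a/</div>' tag in the report.
attrs = {"<": "", "div": ""}
print(drop_invalid_attributes(attrs))  # prints {'div': ''}
```

A filter like this only hides the symptom; whether the tree builder should drop, rename, or escape such attribute names is a separate design decision.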
-- System Information:
Debian Release: wheezy/sid
APT prefers unstable
APT policy: (990, 'unstable'), (500, 'experimental')
Architecture: i386 (x86_64)
Kernel: Linux 3.0.0-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages python-html5lib depends on:
ii  python                2.7.2-5     interactive high-level object-orie
ii  python-support        1.0.14      automated rebuilding support for P

Versions of packages python-html5lib suggests:
pn  python-beautifulsoup  <none>      (no description available)
ii  python-chardet        2.0.1-2     universal character encoding detec
pn  python-genshi         <none>      (no description available)
ii  python-lxml           2.3-0.1+b2  pythonic binding for the libxml2 a
--
Jakub Wilk