[Python-modules-team] Bug#913530: crashes because of html5lib incompatibility

Antoine Beaupre anarcat at debian.org
Sun Nov 11 22:14:52 GMT 2018


Package: python3-bleach
Version: 2.1.3-1
Severity: critical

In current Debian buster, with the Python 3.6 interpreter, bleach
completely fails to load as a module:

$ python3
Python 3.6.7 (default, Oct 21 2018, 08:08:16) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import bleach
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/bleach/__init__.py", line 8, in <module>
    from bleach.linkifier import (
  File "/usr/lib/python3/dist-packages/bleach/linkifier.py", line 7, in <module>
    from html5lib.filters.sanitizer import allowed_protocols
ImportError: cannot import name 'allowed_protocols'

This wouldn't be such a big problem if bleach wasn't included by other
packages, like readme_renderer, the latter of which hooks into
distutils if installed. This means that basically *any* setup.py
script that looks for extra packages will crash, making unrelated
software on Debian completely break (hence the "critical"
severity). For example, here's feed2exec failing to run its test suite
under tox:

curie:feed2exec130$ tox
GLOB sdist-make: /home/anarcat/src/feed2exec/setup.py
ERROR: invocation failed (exit code 1), logfile: /home/anarcat/src/feed2exec/.tox/log/tox-0.log
ERROR: actionid: tox
msg: packaging
cmdargs: ['/usr/bin/python3', local('/home/anarcat/src/feed2exec/setup.py'), 'sdist', '--formats=zip', '--dist-dir', local('/home/anarcat/src/feed2exec/.tox/dist')]
env: None

running sdist
running egg_info
writing feed2exec.egg-info/PKG-INFO
writing dependency_links to feed2exec.egg-info/dependency_links.txt
writing entry points to feed2exec.egg-info/entry_points.txt
writing requirements to feed2exec.egg-info/requires.txt
writing top-level names to feed2exec.egg-info/top_level.txt
writing manifest file 'feed2exec.egg-info/SOURCES.txt'
running check
Traceback (most recent call last):
  File "setup.py", line 153, in <module>
    classifiers=classifiers,
  File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 140, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/usr/lib/python3/dist-packages/setuptools/command/sdist.py", line 52, in run
    self.run_command(cmd_name)
  File "/usr/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib/python3.6/distutils/dist.py", line 972, in run_command
    cmd_obj = self.get_command_obj(command)
  File "/usr/lib/python3.6/distutils/dist.py", line 846, in get_command_obj
    klass = self.get_command_class(command)
  File "/usr/lib/python3/dist-packages/setuptools/dist.py", line 635, in get_command_class
    self.cmdclass[command] = cmdclass = ep.load()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2343, in load
    return self.resolve()
  File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2349, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/usr/lib/python3/dist-packages/readme_renderer/integration/distutils.py", line 24, in <module>
    from ..rst import render
  File "/usr/lib/python3/dist-packages/readme_renderer/rst.py", line 23, in <module>
    from .clean import clean
  File "/usr/lib/python3/dist-packages/readme_renderer/clean.py", line 18, in <module>
    import bleach
  File "/usr/lib/python3/dist-packages/bleach/__init__.py", line 8, in <module>
    from bleach.linkifier import (
  File "/usr/lib/python3/dist-packages/bleach/linkifier.py", line 7, in <module>
    from html5lib.filters.sanitizer import allowed_protocols
ImportError: cannot import name 'allowed_protocols'

ERROR: FAIL could not package project - v = InvocationError('/usr/bin/python3 /home/anarcat/src/feed2exec/setup.py sdist --formats=zip --dist-dir /home/anarcat/src/feed2exec/.tox/dist (see /home/anarcat/src/feed2exec/.tox/log/tox-0.log)', 1)

(And before we point the finger at python3-readme-renderer, let's just
remember it's a dependency of twine, which is an important tool for
uploading packages to PyPI. Uninstalling it is possible, but severely
handicaps developers as well.)
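
To see which installed packages hook into distutils this way, something
like the following sketch may help. It assumes the hooks are registered
as entry points in the "distutils.commands" group (which is what the
setuptools frames in the traceback above suggest) and that
importlib.metadata is available (Python 3.8 or later, so not the 3.6
interpreter shown here). The function name `distutils_command_hooks` is
made up for illustration.

```python
from importlib import metadata


def distutils_command_hooks():
    """List (command name, target) pairs registered as distutils commands.

    Assumption: third-party commands are advertised as entry points in
    the "distutils.commands" group.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer importlib.metadata: selectable entry points.
        group = eps.select(group="distutils.commands")
    else:
        # Older importlib.metadata (3.8/3.9): dict keyed by group name.
        group = eps.get("distutils.commands", [])
    return sorted((ep.name, ep.value) for ep in group)


if __name__ == "__main__":
    for name, target in distutils_command_hooks():
        print(name, "->", target)
```

Any command whose target module fails to import will break setup.py
runs that invoke that command, which is the failure mode shown above.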

html5lib seems to like to change its public API like this
gratuitously. I've seen similar errors in unrelated packages in my
search for this bug:

https://github.com/ArchiveTeam/wpull/issues/332
https://github.com/tensorflow/tensorboard/issues/588

The former "fixed" the issue by limiting html5lib to pre-1.0
releases, the latter by vendoring html5lib, neither of which seems
like a satisfactory solution.

Upstream bleach also "fixed" this by vendoring html5lib 1.0.1, in
their 3.0 version released earlier in October:

https://github.com/mozilla/bleach/issues/386

I can confirm that the `allowed_protocols` name is not exported by the
1.0.1 version of html5lib.
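
A minimal sketch of how this can be checked on a given system, without
triggering the ImportError itself. The helper name `exports_name` is
hypothetical; it only uses the standard importlib module and reports
None when the module is not installed at all.

```python
import importlib


def exports_name(module_name, attr):
    """Return True or False if the module imports, None if it is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module (or one of its parents) is not installed.
        return None
    return hasattr(module, attr)


if __name__ == "__main__":
    # The name bleach 2.1.3 tries to import from html5lib.
    print(exports_name("html5lib.filters.sanitizer", "allowed_protocols"))
```

With the python3-html5lib 1.0.1-1 package listed below, this should
print False, consistent with the traceback above.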

The simplest fix for this would probably be to upgrade bleach to the
latest release. Indeed, this command works around the problem
completely:

    sudo pip install bleach

-- System Information:
Debian Release: buster/sid
  APT prefers testing
  APT policy: (500, 'testing'), (1, 'experimental'), (1, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.18.0-2-amd64 (SMP w/4 CPU cores)
Locale: LANG=fr_CA.UTF-8, LC_CTYPE=fr_CA.UTF-8 (charmap=UTF-8), LANGUAGE=fr_CA.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages python3-bleach depends on:
ii  python3           3.6.7-1
ii  python3-html5lib  1.0.1-1
ii  python3-six       1.11.0-2

python3-bleach recommends no packages.

Versions of packages python3-bleach suggests:
pn  python-bleach-doc  <none>

-- no debconf information


