[Python-modules-team] Bug#763990: white spaces changed to &nbsp_place_holder;

Steve B b.steve at gmx.com
Sat Oct 4 14:54:18 UTC 2014


Package: python-html2text
Version: 3.200.3-2

I use rss2email to parse RSS feeds and read the generated emails with 
Icedove.
rss2email is set to convert HTML to plain text:
HTML_MAIL = 0
Icedove is set to view message body as plain text.

Some whitespace in RSS feeds is replaced with the literal string
&nbsp_place_holder; which makes the resulting text hard to read.
Here is an example from the page:
http://www.nextinpact.com/news/90246-twitch-et-valve-veulent-plus-transparence-sur-contenus-sponsorises.htm

The RSS feed contains only the first paragraph of the web page.
The resulting text in Icedove is:
[...] Twitch et Valve semblent vouloir&nbsp_place_holder;combattre.

What I expect is:
[...] Twitch et Valve semblent vouloir combattre.

I see some references to &nbsp_place_holder; in the python-html2text source 
code, but I can't work out what they are for.
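
As a stopgap, a post-processing step along these lines would produce the
text I expect. This is only a sketch: strip_nbsp_placeholder is a name made
up for this example, not a function in html2text or rss2email, and it
assumes the placeholder always appears literally as shown above.

```python
# Hypothetical workaround, not part of html2text or rss2email.
# Assumes the leftover marker always appears exactly as in the report.
PLACEHOLDER = "&nbsp_place_holder;"


def strip_nbsp_placeholder(text, replacement=" "):
    """Replace every leftover placeholder with a plain space."""
    return text.replace(PLACEHOLDER, replacement)


broken = "[...] Twitch et Valve semblent vouloir&nbsp_place_holder;combattre."
print(strip_nbsp_placeholder(broken))
```

This only hides the symptom after conversion; it does not explain why the
placeholder is left in the output in the first place.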

I run an up-to-date Wheezy:
Debian 3.2.60-1+deb7u3 x86_64 GNU/Linux
Python 2.7.3-4+deb7u1
Icedove 24.8.1-1~deb7u1
rss2email 1:2.71-1
python-html2text 3.200.3-2
python-feedparser 5.1.2-1


