[Python-apps-team] Bug#811571: ocrfeeder: Unicode characters break exports
Douglas Calvert
dfc at douglasfcalvert.net
Tue Jan 19 23:44:31 UTC 2016
Package: ocrfeeder
Version: 0.8.1-2
Severity: normal
Export aborts whenever the text recognized by tesseract contains a non-ASCII character (in this case U+2014, an em dash). Sample output:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 298, in exportDialog
    self.EXPORT_FORMATS[format][1])
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/studioBuilder.py", line 281, in exportToFormat
    name)
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/studio/widgetModeler.py", line 606, in exportPagesWithGenerator
    document_generator.save()
  File "/usr/lib/python2.7/dist-packages/ocrfeeder/feeder/documentGeneration.py", line 221, in save
    file.write(pages[i])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2014' in position 1263: ordinal not in range(128)
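
The traceback suggests that save() writes a unicode string to a file opened without an explicit encoding, so Python 2 falls back to the ASCII codec. I have not checked how documentGeneration.py actually opens the file, so the following is only a minimal sketch of the failure and one possible workaround. The save_pages helper, the sample text, and the file name are hypothetical and not taken from ocrfeeder.

```python
import io
import os
import tempfile


def save_pages(path, pages):
    """Hypothetical stand-in for the save() step: write unicode pages as UTF-8."""
    # io.open takes an explicit encoding on both Python 2 and Python 3,
    # so the write no longer depends on the implicit ASCII codec.
    with io.open(path, "w", encoding="utf-8") as out:
        for page in pages:
            out.write(page)


# Sample page containing U+2014, the character named in the traceback.
page = u"OCR text \u2014 more OCR text"

# Reproduce the failure mode: ASCII cannot represent U+2014.
try:
    page.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Write the page with an explicit encoding and read it back.
path = os.path.join(tempfile.mkdtemp(), "export.txt")
save_pages(path, [page])
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()
```

If the file really is opened with the builtin open() in text mode, switching to io.open (or encoding each page with .encode("utf-8") before writing) would probably avoid the error, assuming UTF-8 is acceptable for the export format in question.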
-- System Information:
Debian Release: stretch/sid
APT prefers unstable
APT policy: (990, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 4.4.0-rc8-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)
Versions of packages ocrfeeder depends on:
ii cuneiform 1.1.0+dfsg-5+b2
ii ghostscript 9.16~dfsg-2
ii gir1.2-goocanvas-2.0 2.0.2-2
ii gir1.2-gtk-3.0 3.18.6-1
ii gir1.2-gtkspell3-3.0 3.0.7-2
ii gocr 0.49-2
ii iso-codes 3.64-1
ii ocrad 0.24-1
ii python 2.7.11-1
ii python-enchant 1.6.6-2
ii python-gi 3.18.2-2
ii python-lxml 3.5.0-1
ii python-pil 3.0.0-1
ii python-reportlab 3.2.0-1
ii python-sane 2.8.2-1+b1
ii tesseract-ocr 3.04.00-5+b1
Versions of packages ocrfeeder recommends:
ii unpaper 6.1-1
ii yelp 3.16.1-1
ocrfeeder suggests no packages.
-- no debconf information