[Python-apps-team] Bug#646605: ocrfeeder: too much memory required when opening multipage documents
Michael Below
mbelow at antithese.de
Tue Oct 25 16:10:43 UTC 2011
Package: ocrfeeder
Version: 0.7.6-1
Severity: normal
Dear Maintainer,
I tried ocrfeeder with a 26-page document and found it hard to
use. When I imported the directory containing my images (300 dpi
greyscale scans of A4 pages, 216 MByte in total as pnm), the memory
usage kept rising. I have 2 GByte of RAM; ocrfeeder filled about
80% of it, and the system was waiting for the hard disk for
minutes. I had the impression that ocrfeeder loads all images into
memory at the same time and decompresses them there.
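As a rough check of the figures above, the following sketch estimates how much memory the raw pixel data of such a scan set would occupy. The A4 dimensions in inches and the bytes-per-pixel values are assumptions for illustration, not measurements of what ocrfeeder actually allocates.

```python
# Back-of-the-envelope estimate of uncompressed pixel data.
# Page size and bytes-per-pixel values are assumptions, not ocrfeeder internals.
A4_INCHES = (8.27, 11.69)  # approximate A4 width and height in inches
DPI = 300
PAGES = 26


def page_pixels(dpi=DPI, size_inches=A4_INCHES):
    """Return the number of pixels in one scanned page."""
    width = round(size_inches[0] * dpi)
    height = round(size_inches[1] * dpi)
    return width * height


def total_mib(bytes_per_pixel, pages=PAGES):
    """Return the size of all pages in MiB for a given pixel depth."""
    return page_pixels() * bytes_per_pixel * pages / 2**20


grey_mib = total_mib(1)  # 8-bit greyscale, one byte per pixel
rgb_mib = total_mib(3)   # hypothetical expansion to 24-bit RGB

print("greyscale: %.0f MiB, RGB: %.0f MiB" % (grey_mib, rgb_mib))
```

Under these assumptions the greyscale total comes out close to the 216 MByte on disk, and an expansion to three bytes per pixel would triple it. Whether ocrfeeder performs such a conversion, or keeps additional copies of each page, has not been verified here.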
I think an "ocr feeder" should handle multipage documents better
than this. For example, gscan2pdf is able to import such documents
without bringing my system to a near-halt.
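A minimal sketch of the kind of behaviour I would expect: pages are read one at a time, so only a single page's data is held in memory at once. The function names, the extension list and the handle_page callback are hypothetical and are not taken from ocrfeeder's or gscan2pdf's code.

```python
import os


def iter_page_paths(directory, extensions=(".pnm", ".pgm", ".ppm")):
    """Yield image paths in sorted order without opening any file."""
    for name in sorted(os.listdir(directory)):
        if name.lower().endswith(extensions):
            yield os.path.join(directory, name)


def process_pages(directory, handle_page):
    """Read and handle one page at a time; return the number of pages seen.

    handle_page is a hypothetical callback taking (path, data). As long as
    it does not keep a reference to data, each page can be freed before
    the next one is read.
    """
    count = 0
    for path in iter_page_paths(directory):
        with open(path, "rb") as f:
            data = f.read()
        handle_page(path, data)
        del data  # drop our reference before loading the next page
        count += 1
    return count
```

With this structure, peak memory is bounded by the largest single page rather than by the sum of all pages, provided the callback stores only small results such as thumbnails or recognised text.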
Thanks for your work!
Michael Below
-- System Information:
Debian Release: wheezy/sid
APT prefers testing
APT policy: (900, 'testing'), (500, 'stable-updates'), (500, 'proposed-updates'), (500, 'stable'), (10, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.0.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages ocrfeeder depends on:
ii cuneiform 1.1.0+dfsg-1
ii ghostscript 9.04~dfsg-2
ii gocr 0.48-1
ii python 2.7.2-9
ii python-enchant 1.6.5-2
ii python-gnome2 2.28.1-3
ii python-gtk2 2.24.0-2
ii python-gtkspell 2.25.3-10.1
ii python-imaging-sane 1.1.7-4
ii python-pygoocanvas 0.14.1-1+b3
ii python-reportlab 2.5-1.1
ii python2.6 2.6.7-3
ii python2.7 2.7.2-5
ii tesseract-ocr 2.04-2.1
Versions of packages ocrfeeder recommends:
ii unpaper 0.3-1
ii yelp 2.30.1+webkit-1+b1
ocrfeeder suggests no packages.
-- no debconf information