[sane-devel] scanimage / tesseract interoperability
Jeffrey Ratcliffe
jeffrey.ratcliffe at gmail.com
Sat May 10 14:30:48 UTC 2014
On 10 May 2014 05:56, Jeff Breidenbach <jeff at jab.org> wrote:
> Tesseract is an open source OCR program. It can already
> produce searchable PDF and will soon support streaming.
> It would be fun to support something like this:
>
> scanimage --batch | tesseract - - pdf > searchable.pdf
>
> To make this work nicely, scanimage would need to
> print the name of each file to stdout after it is written.
Try gscan2pdf, which combines scanimage (or the SANE API directly,
which is more efficient) with tesseract (or cuneiform), all packed up
in a nice GUI.
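If you would rather stay on the command line, a rough sketch along the
following lines might approximate the pipeline you describe without any
change to scanimage. It is not streaming: it scans the whole batch first
and only then runs OCR. It assumes that scanimage's --batch option accepts
a printf-style pattern such as page%04d.tif, and that your tesseract build
accepts a text file listing one image path per line as its input. The
function names list_pages and scan_to_pdf are made up for illustration.

```shell
#!/bin/sh
# Sketch only: list_pages and scan_to_pdf are hypothetical helper names,
# not part of scanimage or tesseract.

# Print the batch output files found in directory $1, one per line.
# Zero-padded names (page0001.tif, page0002.tif, ...) make the shell's
# sorted glob expansion match the page order.
list_pages() {
    for f in "$1"/page*.tif; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# Scan a batch into directory $1, then OCR all pages into $2.pdf.
scan_to_pdf() {
    dir=$1
    out=$2
    # The exit status is deliberately not checked here, since scanimage may
    # report a non-zero status once the feeder runs out of pages.
    scanimage --batch="$dir/page%04d.tif" --format=tiff
    list_pages "$dir" > "$dir/pages.txt"
    # Give up if no page was written.
    [ -s "$dir/pages.txt" ] || return 1
    # Assumption: tesseract reads the list file and writes "$out.pdf".
    tesseract "$dir/pages.txt" "$out" pdf
}
```

With an empty working directory, "scan_to_pdf ./scans searchable" would
then be expected to leave the result in searchable.pdf.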
Regards
Jeff