[sane-devel] XSane and Tesseract

Mike CALDER mikecalder at optusnet.com.au
Wed Aug 4 12:30:26 UTC 2010

Thanks Jeff,

Always happy to give all the options a go and I have installed gscan2pdf
and rescanned my original test.

Not sure whether the list will get the attachments but I assume you will.

result.txt is what I obtained using tesseract from the command line
having scanned just that part of a letter with XSane.

I then corrected that text to give DFRDB.odt which was the original.

DFRDB2.txt is the result that I have cut from the scan of the WHOLE
letter using gscan2pdf and tesseract as the chosen ENGINE and you can
see it is not as good as result.txt.

I think this is because I had to scan the WHOLE letter, including the
crap of the letterhead, a handwritten signature and even some pencil notes
I had written on the bottom. This may well have confused the OCR.
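
One way to test that theory is to cut the body text out of the full-page
scan before handing it to tesseract. What follows is only a minimal sketch,
assuming ImageMagick's convert and tesseract are both on the PATH. The file
names (letter.tif, body.tif), the pixel box and the helper function names
are hypothetical and would need adjusting to the actual scan.

```python
import os
import shutil
import subprocess


def crop_command(src, dst, left, top, width, height):
    """ImageMagick command that cuts a width x height box at (left, top)."""
    geometry = f"{width}x{height}+{left}+{top}"
    return ["convert", src, "-crop", geometry, "+repage", dst]


def ocr_command(image, output_base, lang="eng"):
    """Tesseract command; the text is written to output_base + '.txt'."""
    return ["tesseract", image, output_base, "-l", lang]


def ocr_region(src, box, output_base="result", cropped="body.tif"):
    """Crop src to box = (left, top, width, height), then OCR the crop."""
    subprocess.run(crop_command(src, cropped, *box), check=True)
    subprocess.run(ocr_command(cropped, output_base), check=True)
    return output_base + ".txt"


if __name__ == "__main__":
    # Hypothetical file name and pixel box; adjust to the actual scan.
    src, box = "letter.tif", (150, 900, 2200, 1400)
    tools_present = shutil.which("convert") and shutil.which("tesseract")
    if tools_present and os.path.exists(src):
        print(ocr_region(src, box))
    else:
        # Tools or input missing: just show the commands that would run.
        print(" ".join(crop_command(src, "body.tif", *box)))
        print(" ".join(ocr_command("body.tif", "result")))
```

If the cropped text comes out closer to result.txt, that would point to
the letterhead and handwriting as the cause rather than to gscan2pdf.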

I have just imported out.gif into gscan2pdf and then run an OCR
[tesseract] from Tools -> OCR and, as you can see from DFRDB3.txt, the
results are identical.

I may be wrong but, even though there is a PREVIEW tab, I do not seem to
be able to do a preview scan and then select a PART for a partial scan.

Hope this explanation is not too complicated.

Look forward to your comments.

[Re-posted without Attachments]

Thanks again...


On 08/04/2010 01:08 AM, Jeffrey Ratcliffe wrote:
> Hi Mike,
> On 3 August 2010 14:42, Mike CALDER <mikecalder at optusnet.com.au> wrote:
>> Now I want to know how to set up XSane -> Preferences -> Setup -> OCR with
>> the correct OCR Command and the other inputs to that dialog in order to use
>> tesseract directly from the Viewer Mode.
> At the risk of blowing my own trumpet, I would suggest that you try
> gscan2pdf, which handles scanning and OCR out of the box.
> Regards
> Jeff


