[sane-devel] XSane and Tesseract
mikecalder at optusnet.com.au
Wed Aug 4 12:30:26 UTC 2010
Always happy to give all the options a go and I have installed gscan2pdf
and rescanned my original test.
Not sure whether the list will get the attachments but I assume you will.
result.txt is what I obtained using tesseract from the command line
having scanned just that part of a letter with XSane.
I then corrected that text to give DFRDB.odt, which matches the original letter.
DFRDB2.txt is the result that I have cut from the scan of the WHOLE
letter using gscan2pdf and tesseract as the chosen ENGINE and you can
see it is not as good as result.txt.
I think this is because I had to scan the WHOLE letter including the
crap of letterhead, handwritten signature and even some pencil notes I
had written on the bottom. This may well have confused the OCR.
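
A rough sketch of the "OCR only part of the page" idea, for anyone wanting
to try it outside XSane or gscan2pdf: crop the full-page scan to one
rectangle, then run tesseract on the cropped file. It assumes ImageMagick's
convert and the tesseract command line are installed. The function names,
the file names cropped.tif and region, and any coordinates passed in are
made up for illustration. It is not how either program does it internally.

```python
import shutil
import subprocess


def crop_command(src, dst, width, height, x, y):
    """Build an ImageMagick command that keeps one rectangle of the scan."""
    geometry = f"{width}x{height}+{x}+{y}"
    # +repage drops the original page offset so the crop is a plain image.
    return ["convert", src, "-crop", geometry, "+repage", dst]


def ocr_command(image, output_base, lang="eng"):
    """Build a tesseract command; tesseract writes output_base + '.txt'."""
    return ["tesseract", image, output_base, "-l", lang]


def ocr_region(src, width, height, x, y,
               cropped="cropped.tif", output_base="region"):
    """Crop the scan, then OCR the cropped file (needs both tools on PATH)."""
    for tool in ("convert", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    subprocess.run(crop_command(src, cropped, width, height, x, y), check=True)
    subprocess.run(ocr_command(cropped, output_base), check=True)
    return output_base + ".txt"
```

Calling ocr_region("out.gif", 1600, 900, 100, 700) would crop a
1600x900 pixel box starting 100 pixels in and 700 pixels down; those numbers
are placeholders and would need to be read off the actual scan.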
I have just imported out.gif into gscan2pdf and then done an OCR
(tesseract) from Tools -> OCR and as you can see from DFRDB3.txt the
results are identical to DFRDB2.txt.
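
To put a number on "not as good" or "identical" rather than judging by eye,
the OCR outputs could be scored against the corrected text. This is only a
sketch using Python's standard difflib module; it assumes the corrected
DFRDB.odt has been saved as plain text first (the name DFRDB.txt in the note
below is hypothetical), and the word-by-word ratio is just one possible
measure.

```python
import difflib


def similarity(reference, candidate):
    """Return a 0..1 score of how closely candidate matches reference."""
    # Compare word lists so line-wrapping differences do not matter.
    matcher = difflib.SequenceMatcher(None, reference.split(), candidate.split())
    return matcher.ratio()


def read_text(path):
    """Read a text file, replacing any bytes that are not valid UTF-8."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()


def score_files(reference_path, candidate_paths):
    """Score each OCR output file against the corrected reference text."""
    reference = read_text(reference_path)
    return {path: similarity(reference, read_text(path))
            for path in candidate_paths}
```

For example, score_files("DFRDB.txt", ["result.txt", "DFRDB2.txt",
"DFRDB3.txt"]) would return one score per file, with 1.0 meaning a
word-for-word match.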
I may be wrong but even though there is a PREVIEW tab I do not seem to
be able to do a preview scan and then select a PART for a partial scan.
Hope this explanation is not too complicated.
Look forward to your comments.
[Re-posted without Attachments]
On 08/04/2010 01:08 AM, Jeffrey Ratcliffe wrote:
> Hi Mike,
> On 3 August 2010 14:42, Mike CALDER <mikecalder at optusnet.com.au> wrote:
>> Now I want to know how to set up XSane -> Preferences -> Setup -> OCR with
>> the correct OCR Command and the other inputs to that dialog in order to use
>> tesseract directly from the Viewer Mode.
> At the risk of blowing my own trumpet, I would suggest that you try
> gscan2pdf, which handles scanning and OCR out of the box.