[sane-devel] scanning for archival and OCR
jeremy at acjlaw.net
Wed Jan 23 22:57:26 UTC 2013
Generally, 1200 dpi resolution for text would be overkill unless you have a
document with extremely tiny print (1-2 point instead of 10-12 point).
The usual recommendation used to be 150 dpi, or even 75 dpi, for scanning
documents containing just plain text. I scan at 300 dpi and ordinarily print
at 300 dpi as well, which for me is adequate quality for plain text documents.
My photo scanner can scan at 2400 dpi but my printer can only print at a max
of 1200 dpi resolution. Scanning at a higher resolution than that at which one
prints can be useful for enlarging a portion of the image. Otherwise it's
probably just a waste of disk space.
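
To put a number on the enlargement point, here is a minimal Python sketch.
The function name is made up for illustration, and it assumes the only limit
is the ratio of scan resolution to print resolution (it ignores optics, paper
and the quality of the original).

```python
def max_enlargement(scan_dpi: float, print_dpi: float) -> float:
    """Linear enlargement factor available before the print falls below
    the printer's resolution. Illustrative helper, not part of SANE."""
    if scan_dpi <= 0 or print_dpi <= 0:
        raise ValueError("resolutions must be positive")
    return scan_dpi / print_dpi


if __name__ == "__main__":
    # The 2400 dpi scanner and 1200 dpi printer mentioned above.
    print(max_enlargement(2400, 1200))
    # Scanning and printing at the same resolution leaves no headroom.
    print(max_enlargement(300, 300))
```

With the figures above, a 2400 dpi scan can be printed at twice its original
linear size before the printer's 1200 dpi is no longer fully used.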
Similarly, I ordinarily wouldn't scan bills, invoices, receipts, etc. in
color, since a black-and-white 1-bit image suffices for my needs. If someone
were to ask me for a copy of a receipt or check, even G3 fax quality would
probably be good enough.
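
A rough Python sketch of why bit depth matters so much for disk space. It
computes the raw, uncompressed size of a page; the function name and the
assumption that each row is padded to a whole byte are mine, and real files
will be much smaller once compressed.

```python
def raw_size_bytes(width_in: float, height_in: float,
                   dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed size of a scanned page in bytes.
    Assumes each pixel row is padded up to a whole number of bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


if __name__ == "__main__":
    # US letter page (8.5 x 11 inches) at 300 dpi.
    for label, depth in (("1-bit", 1), ("8-bit gray", 8), ("24-bit color", 24)):
        size = raw_size_bytes(8.5, 11, 300, depth)
        print(f"{label}: {size} bytes")
```

Before any compression, the 24-bit color page is roughly 24 times the size
of the 1-bit page at the same resolution.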
If I have a document with pages mixing text and color graphics/photos, I
ordinarily scan at full color depth and use DjVu wavelet compression, which
generates reasonably small files without sacrificing too much text clarity.
A typical grayscale scan of a black-and-white letter-sized document would
result, after binarization, in a PDF file size of 30-40K per page (depending
on line spacing, line length, text size, text weight, etc.).
A typical color scan with DjVu wavelet compression would be about 10 times as
large (again depending on the mix of text and graphics).
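
For planning an archive, the per-page figures above can be turned into a
rough estimate. This Python sketch simply multiplies them out; the names are
made up for illustration, and the 30-40K range and the factor of 10 are the
ballpark numbers from this message, not measurements of your documents.

```python
BW_PAGE_KB = (30, 40)   # binarized text page, low and high estimate
COLOR_FACTOR = 10       # color DjVu page is roughly 10 times a text page


def archive_size_kb(bw_pages: int, color_pages: int) -> tuple[int, int]:
    """Return a (low, high) estimate in kilobytes for a mixed archive."""
    low = bw_pages * BW_PAGE_KB[0] + color_pages * BW_PAGE_KB[0] * COLOR_FACTOR
    high = bw_pages * BW_PAGE_KB[1] + color_pages * BW_PAGE_KB[1] * COLOR_FACTOR
    return low, high


if __name__ == "__main__":
    # Example: 900 plain text pages plus 100 pages with color graphics.
    low, high = archive_size_kb(900, 100)
    print(f"roughly {low} to {high} KB")
```

In this example the 100 color pages take about as much space as the 900 text
pages, which is the practical reason to keep color for pages that need it.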