[sane-devel] scanning for archival and OCR
Jeremy Johnson
jeremy at acjlaw.net
Wed Jan 23 16:19:21 UTC 2013
I think ghostscript must be writing one dictionary for the whole document
instead of one dictionary per page.
If I take my 16M PDFTK.pdf and rewrite it using ghostscript, ghostscript
produces an 8.5M file:
$ gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=NEW.pdf PDFTK.pdf
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** This file had errors that were repaired or ignored.
**** The file was produced by:
**** >>>> itext-paulo-155 (itextpdf.sf.net-lowagie.com) <<<<
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
$ ls -sh NEW.pdf
8.5M NEW.pdf
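
To put a number on the saving rather than eyeballing the rounded `ls -sh` figures, something like the following could be used. This is only a sketch: `size_percent` is a hypothetical helper name, not part of ghostscript or pdftk, and it assumes both files are in the current directory.

```shell
# Hypothetical helper: print the second file's size as a whole-number
# percentage of the first file's size (integer division, rounds down).
size_percent() {
    old_bytes=$(wc -c < "$1")
    new_bytes=$(wc -c < "$2")
    echo $(( new_bytes * 100 / old_bytes ))
}

# Compare the pdftk output with the ghostscript rewrite, if both exist.
if [ -f PDFTK.pdf ] && [ -f NEW.pdf ]; then
    size_percent PDFTK.pdf NEW.pdf
fi
```

With the sizes reported above (16M in, 8.5M out) this should print a value a little over 50, i.e. the rewritten file is roughly half the size of the original.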