[sane-devel] gomr 1.0 released

m. allan noah kitno455 at gmail.com
Mon Aug 20 16:50:38 UTC 2007

I have recently published a small C library, which might be of some
use to folks on this list.

gomr is a GPLv3 library which provides basic Optical Mark Recognition (OMR)
and Code 3of9 barcode reading (as well as some scan cleaning features).

It was developed to score student 'bubble' sheets, and has been
refined over the last few years for speed and accuracy. to date, more
than 4 million images have been processed, with a very low error rate.
gomr can process more than 2000 pages per minute on a single (albeit
fast) cpu.

the code relies chiefly on features of the sheet in order to correct
scanning errors, so these might not be useful in your case:

* rotation is identified by trying to 'strike' lines across the page at
a variety of angles, finding the angle which produces the most
entirely white lines. the searching algorithm uses interlacing and
caching for speed. perhaps a future version will support
black-background as well.

* horizontal/vertical offset and upside-down scans are identified by
locating a prominent barcode, and centering that with the caller's

* the 'bubbles' are assumed to come in large blocks, with reasonable
internal and external whitespace.
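
As a rough illustration of the rotation search described above (not gomr's actual code), the sketch below strikes a line from every start row at each candidate tilt and keeps the tilt that yields the most entirely white lines. Everything named here is an assumption made for the example: the `bin_image` type, the one-byte-per-pixel layout with 0 meaning white, the function names, and the choice to express the tilt as an integer "drift" (pixels of vertical travel across the page width). The interlacing and caching mentioned above are left out for brevity.

```c
#include <assert.h>

/* Hypothetical image layout: one byte per pixel, 0 = white, nonzero = black. */
typedef struct {
    int width, height;
    const unsigned char *pix;
} bin_image;

/* Integer division rounding toward negative infinity (b must be positive). */
static int div_floor(int a, int b)
{
    int q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}

/* Vertical offset at column x of a line that drifts `drift` pixels
 * over the full page width, rounded to the nearest pixel. */
static int line_offset(int drift, int x, int width)
{
    return div_floor(2 * drift * x + (width - 1), 2 * (width - 1));
}

/* Count start rows whose line at the given drift touches no black pixel.
 * Rows within `margin` of the top and bottom are skipped so that every
 * candidate drift tests the same set of start rows and stays on the page. */
static int count_white_lines(const bin_image *img, int drift, int margin)
{
    int count = 0;
    for (int y0 = margin; y0 < img->height - margin; y0++) {
        int white = 1;
        for (int x = 0; x < img->width && white; x++) {
            int y = y0 + line_offset(drift, x, img->width);
            if (img->pix[y * img->width + x])
                white = 0;
        }
        count += white;
    }
    return count;
}

/* Try every drift in [-max_drift, max_drift] and return the one that
 * produces the most white lines. A drift of 0 wins unless another
 * candidate is strictly better, so a blank page reports no tilt. */
int estimate_drift(const bin_image *img, int max_drift)
{
    assert(img->width > 1 && max_drift >= 0 && 2 * max_drift < img->height);
    int best = 0;
    int best_count = count_white_lines(img, 0, max_drift);
    for (int d = -max_drift; d <= max_drift; d++) {
        int c = count_white_lines(img, d, max_drift);
        if (c > best_count) {
            best_count = c;
            best = d;
        }
    }
    return best;
}
```

The tilt angle follows from the result as atan(drift / (width - 1)). This brute-force version visits every start row at every candidate tilt, which is the cost a faster search would presumably try to avoid.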

There are also some more general functions:

* remove speckles
* remove trash stripes
* make GIF thumbnails

speed is of utmost importance in the commercial operation of gomr, so:

* only low-resolution binary data is required.
* gomr will open zlib compressed images in ram
* rotation is corrected using a 'double-shear' algorithm.
* the barcode algo is split into two parts: a fast 'finder' and a 'reader'.
* etc...
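
For readers unfamiliar with shear-based rotation, here is a minimal sketch of one plausible reading of the term: a vertical shear followed by a horizontal shear, which together approximate a small rotation. This is an assumption about the general technique, not a description of what gomr does internally. The function names, the one-byte-per-pixel layout (0 = white), and the tilt given as a ratio num/den (roughly the tangent of the skew angle) are all invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Round num*i/den to the nearest integer; also valid for negative num. */
static int shear_offset(int num, int den, int i)
{
    assert(den > 0);
    int a = 2 * num * i + den, b = 2 * den;
    int q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}

/* Vertical shear: column x slides up by round(num*x/den) pixels.
 * Pixels shifted in from outside the page are white. */
static void shear_columns(const unsigned char *src, unsigned char *dst,
                          int w, int h, int num, int den)
{
    memset(dst, 0, (size_t)w * h);
    for (int x = 0; x < w; x++) {
        int off = shear_offset(num, den, x);
        for (int y = 0; y < h; y++) {
            int sy = y + off;
            if (sy >= 0 && sy < h)
                dst[y * w + x] = src[sy * w + x];
        }
    }
}

/* Horizontal shear: row y slides right by round(num*y/den) pixels. */
static void shear_rows(const unsigned char *src, unsigned char *dst,
                       int w, int h, int num, int den)
{
    memset(dst, 0, (size_t)w * h);
    for (int y = 0; y < h; y++) {
        int off = shear_offset(num, den, y);
        for (int x = 0; x < w; x++) {
            int sx = x - off;
            if (sx >= 0 && sx < w)
                dst[y * w + x] = src[y * w + sx];
        }
    }
}

/* Approximate a small rotation with two shears.
 * tmp and dst are caller-provided buffers of w*h bytes each. */
void deskew_two_shear(const unsigned char *src, unsigned char *tmp,
                      unsigned char *dst, int w, int h, int num, int den)
{
    shear_columns(src, tmp, w, h, num, den);
    shear_rows(tmp, dst, w, h, num, den);
}
```

Each pass only slides whole columns or whole rows by an integer amount, so there is no per-pixel trigonometry or interpolation. The result is slightly scaled compared with a true rotation, and the error grows with the angle, so this only suits small skews.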

The code works well in our use, but will most likely NOT work for you
without modification. It's GPL, so you can do that yourself, or you can
contract with us, but make sure you understand and follow the license!



"The truth is an offense, but not a sin"
