[Pkg-exim4-users] fuzzyocr

Chris Purves chris at northfolk.ca
Thu Oct 26 22:52:15 CEST 2006


Peter McEvoy wrote:
> Hi,
> would anyone happen to know how up to date this guide is:
> 
> http://www200.pair.com/mecham/spam/image_spam.html
> 
> I'm currently using the Perl script from http://spam.co.nz/nsfo/ to do
> OCR on GIF images (I removed the parts that scanned JPEGs) but the
> script isn't perfect.
> 
> My concern about the above guide is that it seems a lot of patching of
> apps is required and I like to have a clean upgrade path.
> 
> If anyone has any comments on best practices for blocking image spam
> using Exim on Debian I'd love to hear them.
> 

I recently set up FuzzyOcr with ImageInfo, using the link you provided
as well as the SpamAssassin wiki:

http://wiki.apache.org/spamassassin/FuzzyOcrPlugin

The FuzzyOcr plugin download also includes installation instructions.
The Debian-specific guide you linked is most useful for working out
where to put the plugin files so that they match the Debian install
directories.  I found the wiki and the included instructions more
useful for installing the right dependencies.

I didn't bother installing anything from source or patching any of the
packages (when the Debian maintainers apply the patch, I'll get it
too).  I am getting good results so far.

When you test FuzzyOcr, make sure you disable, or set very high, the
option in the config file that keeps FuzzyOcr from running if the
message's score is already above a certain threshold.  For a few days I
thought there was something wrong with the plugin because it wouldn't
scan the JPG and PNG test files, but then I realised it was because
they already had spam scores of 30+ and the plugin was set not to run
if the score was 10 or higher.
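
To illustrate why the test files were skipped, here is a minimal Python
sketch of that kind of score gate.  It is not FuzzyOcr's actual code:
the function and parameter names are hypothetical, and the real option
name should be taken from the plugin's own config file.

```python
from typing import Optional


def should_run_ocr(current_score: float,
                   autodisable_score: Optional[float]) -> bool:
    """Decide whether the OCR check should run for a message.

    autodisable_score is a hypothetical stand-in for the plugin's
    "skip if the message already scores this high" setting.
    None means the gate is disabled and OCR always runs.
    """
    if autodisable_score is None:
        return True
    # Skip OCR when the message is already at or above the threshold.
    return current_score < autodisable_score


if __name__ == "__main__":
    # Test message already scoring 30 with the gate at 10: OCR is skipped.
    print(should_run_ocr(30.0, 10.0))
    # Same message with the gate set very high: OCR runs.
    print(should_run_ocr(30.0, 1000.0))
    # Same message with the gate disabled: OCR runs.
    print(should_run_ocr(30.0, None))
```

With the gate at 10, a test message that already scores 30 never
reaches the OCR step, which matches the behaviour described above.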

I think the documentation is sufficient, but if you run into any 
problems let me know.

-- 
Chris



