[Pkg-exim4-users] fuzzyocr
Chris Purves
chris at northfolk.ca
Thu Oct 26 22:52:15 CEST 2006
Peter McEvoy wrote:
> Hi,
> Would anyone happen to know how up to date this guide is:
>
> http://www200.pair.com/mecham/spam/image_spam.html
>
> I'm currently using the Perl script from http://spam.co.nz/nsfo/ to do
> OCR on GIF images (I removed the parts that scanned JPEGs), but the
> script isn't perfect.
>
> My concern about the above guide is that it seems to require a lot of
> patching of apps, and I like to have a clean upgrade path.
>
> If anyone has any comments on best practices for blocking image spam
> using Exim on Debian, I'd love to hear them.
>
I recently set up FuzzyOcr with ImageInfo using the link you provided as
well as the SpamAssassin wiki:
http://wiki.apache.org/spamassassin/FuzzyOcrPlugin
The FuzzyOcr plugin download also includes its own installation
instructions. The Debian-specific link is most useful for putting the
plugin files into the same locations as the Debian install directories.
I found the wiki and the included directions more useful for installing
the right dependencies.
I didn't bother installing anything from source or patching any of the
packages (once the Debian maintainers apply the patch, I'll get it
too). I am getting good results so far.
When you test FuzzyOcr, make sure you disable, or set very high, the
option in the config file that keeps FuzzyOcr from running if the score
is already above a certain threshold. For a few days I thought there
was something wrong with the plugin because it wouldn't scan the JPG and
PNG test files, but then I realised it was because they already had spam
scores of 30+ and the plugin was set not to run if the score was 10 or
higher.
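To make that failure mode concrete, here is a rough Python sketch of the
skip rule as I understand it from the behaviour above. It is not FuzzyOcr
code: the function name and parameter names are made up for illustration,
and the real option name is whatever the FuzzyOcr.cf shipped with the
plugin calls it.

```python
def ocr_would_run(current_score: float, skip_threshold: float) -> bool:
    """Model of the 'skip if already spammy' rule (hypothetical names).

    Assumption: the OCR pass is skipped when the message's score from
    the other rules is already at or above the threshold.
    """
    return current_score < skip_threshold


# Test mails already scored 30+, and the threshold was 10: no scan.
print(ocr_would_run(30.0, 10.0))    # False

# With the threshold raised very high for testing, the scan runs.
print(ocr_would_run(30.0, 1000.0))  # True
```

So a test message that the other rules already flag heavily will never
reach the OCR step until the threshold is raised or the check is
disabled.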
I think the documentation is sufficient, but if you run into any
problems let me know.
--
Chris