[sane-devel] Dust removal in SANE?

Michael Rickmann mrickma at gwdg.de
Mon Jul 11 16:54:22 UTC 2011

I learned from other posts in this list that SANE does not provide much 
support for infrared. In sane.h some related definitions are even 
commented out. Only in coolscan.c I have found some RGBIfix routines by 
Andreas Rick which must be related to what he describes on his page at 
http://andreas.rick.free.fr/sane/dustremove.html. With my attempts to 
support PIE film scanners in the pie backend I have reached a stage 
where I can receive R, G, B and I color planes at resolutions from 300 
to 3600 dpi at 16 bit color depth. I wish to use the infrared channel 
for dirt removal without touching current SANE specifications. 
Essentially three things have to be done:
1) reduce red spectral overlap from the infrared (ired) plane
2) find the dirt
3) replace the dirt
Everything beyond that depends on the kind of film and on taste. If you are 
interested in quick results, skip down to the last paragraphs of this post.
I rolled up my sleeves, though in a different way than suggested on this 
list, and wrote a small program, ircleanest.c, in which I tried to implement 
the three steps above. For trying out image calculations before programming, 
"ImageJ" (http://rsbweb.nih.gov/ij/) has been a great help.

Ad 1) Quite often the ired image looks like a greyscale image with dirt 
emphasized. Something similar was reported for the Epson V700 scanner. 
The ired plane of negative films usually contains only slight shades 
of the image, but slide films may show a considerable amount of it. I 
tried something like the gamma and linear operations that Andreas Rick 
describes for the Coolscan. I could not get it to work, and I do not wish 
to craft parameters by hand for every slide.
When plotting the ired value of 1000 randomly chosen pixels against the 
red value the relation ired = b + a * ln (red) always gave a good fit. 
So calculating an ired' = ired - a  * ln (red) should clean the ired 
plane. It works. I randomly sample 2000 pixels, calculate the parameter 
"a" by linear regression from the ln (red) and ired values, produce an 
ired' plane and scale it between 0 and 65535. I also tried to include 
green and blue planes but there is a lot of calculation and no real 
benefit. A similar cleaning effect is obtained with the relations ired = b 
* red ^ a and ired' = ired / red ^ a. This comes closer to Andreas Rick's 
suggestion of applying a gamma, and the coefficients can also be 
determined by linear regression, after taking the logarithm of both sides.

Ad 2) First I tried static thresholds to find the dirt. I still use two 
of them, Otsu's and Yen's in M. Emre Celebi's implementation in the 
FOURIER 0.8 project ( http://sourceforge.net/projects/fourier-ipal ). 
Yen's threshold in this implementation assumes a bimodal distribution 
and was the best of the static thresholds I tried in detecting only 
dirt. I still use it to add large dirty areas. But soon I gave up 
detecting smaller dirt with static thresholding without user 
intervention. On my search for an adaptive threshold I stumbled over the 
MAD (median of the absolute deviations from the median) filter 
(Crnojevic V. (2005) "Impulse Noise Filter with Adaptive Mad-Based 
Threshold". Proc. of the IEEE Int. Conf. on Image Processing, 3: 
337-340). It is an understandable paper describing an algorithm of 
rather low complexity. Median filtering, however, is rather slow. 
First, I replaced the first median filter step of the original paper 
with a maximum filter because the dirty pixels are always darker than 
the real signal. Then I managed to get images from the scanner at 
maximum resolution and noticed some impulse noise, so I resorted to a 
mean filter. The second median I also replaced by a mean filter to 
reduce computation time. In spite of these changes Crnojevic's 
recommendations for the choice of the parameters "a" and "b" were still 
valid when scaled to 16 bit. In my ircleanest.c program it is the 
filter_madmean routine. Combining the madmean dirt mask with the one 
from Yen's static threshold gave a good representation of what I felt 
had to be removed.
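
To give an idea of what a mean-based adaptive threshold of this kind can 
look like, here is a rough sketch. It is not the filter_madmean routine 
and not the algorithm of the paper: the names box_mean and 
dirt_mask_meandev, the plain box window, the one-sided test for dark 
pixels and the threshold form "scale * deviation + offset" are all 
assumptions made for illustration, and I do not claim that scale and 
offset correspond to the paper's "a" and "b".

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mean over a (2r+1) x (2r+1) window, truncated at the image borders. */
void box_mean(const double *in, double *out, int w, int h, int r)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            double sum = 0.0;
            int cnt = 0;
            for (int j = y - r; j <= y + r; j++) {
                if (j < 0 || j >= h) continue;
                for (int i = x - r; i <= x + r; i++) {
                    if (i < 0 || i >= w) continue;
                    sum += in[j * w + i];
                    cnt++;
                }
            }
            out[y * w + x] = sum / cnt;
        }
    }
}

/* Marks a pixel as dirt (mask = 1) when it is darker than its local mean
   by more than scale * (local mean of absolute deviations) + offset.
   Hypothetical sketch: threshold form and parameters are assumptions. */
void dirt_mask_meandev(const uint16_t *ir, uint8_t *mask, int w, int h,
                       int r, double scale, double offset)
{
    size_t n = (size_t)w * (size_t)h;
    double *v = malloc(n * sizeof *v);    /* pixel values as double   */
    double *m = malloc(n * sizeof *m);    /* local mean               */
    double *d = malloc(n * sizeof *d);    /* |value - local mean|     */
    double *md = malloc(n * sizeof *md);  /* local mean of deviations */
    assert(v != NULL && m != NULL && d != NULL && md != NULL);

    for (size_t i = 0; i < n; i++)
        v[i] = (double)ir[i];
    box_mean(v, m, w, h, r);
    for (size_t i = 0; i < n; i++)
        d[i] = fabs(v[i] - m[i]);
    box_mean(d, md, w, h, r);
    for (size_t i = 0; i < n; i++)
        mask[i] = (uint8_t)((m[i] - v[i]) > scale * md[i] + offset);

    free(v);
    free(m);
    free(d);
    free(md);
}
```

The naive window loops cost O(r^2) per pixel; running sums would make 
the mean filter independent of the window size, which is the main reason 
a mean is so much cheaper than a median here.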

Ad 3) To replace the dirt I dilate the clean image parts into the 
dirty ones. As I wish to do that in one sweep, several pixels deep, I 
first calculate the Manhattan distance of dirty pixels to their closest 
clean neighbors and keep an index of these clean ones. The result is ok 
in general but looks funny when the original dirt was overlapping a 
region of high colour changes. So I adapt the dilated pixels by a mean 
filter to their new surroundings and replace them again. Clean pixels 
remain unchanged by this procedure.
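
The distance step can be sketched with the usual two-pass scan for the 
Manhattan (city block) distance, carrying along the index of the clean 
pixel each distance came from. This is only an illustration under my own 
assumptions, not the code in ircleanest.c: the names nearest_clean and 
fill_from_clean are hypothetical, and the subsequent mean-filter 
adaptation of the filled pixels is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* For every pixel, computes the Manhattan distance to the closest clean
   pixel (dist) and the index of one such clean pixel (src).
   dirty[i] != 0 marks dirt. src stays -1 if the image has no clean pixel. */
void nearest_clean(const uint8_t *dirty, int w, int h, int *dist, int *src)
{
    const int inf = w + h;  /* larger than any possible distance */

    for (int p = 0; p < w * h; p++) {
        dist[p] = dirty[p] ? inf : 0;
        src[p] = dirty[p] ? -1 : p;
    }
    /* Forward pass: propagate from the left and upper neighbours. */
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int p = y * w + x;
            if (x > 0 && dist[p - 1] + 1 < dist[p]) {
                dist[p] = dist[p - 1] + 1;
                src[p] = src[p - 1];
            }
            if (y > 0 && dist[p - w] + 1 < dist[p]) {
                dist[p] = dist[p - w] + 1;
                src[p] = src[p - w];
            }
        }
    }
    /* Backward pass: propagate from the right and lower neighbours. */
    for (int y = h - 1; y >= 0; y--) {
        for (int x = w - 1; x >= 0; x--) {
            int p = y * w + x;
            if (x < w - 1 && dist[p + 1] + 1 < dist[p]) {
                dist[p] = dist[p + 1] + 1;
                src[p] = src[p + 1];
            }
            if (y < h - 1 && dist[p + w] + 1 < dist[p]) {
                dist[p] = dist[p + w] + 1;
                src[p] = src[p + w];
            }
        }
    }
}

/* Overwrites each dirty pixel of one colour plane with the value of its
   nearest clean pixel; clean pixels are left untouched. */
void fill_from_clean(uint16_t *plane, const uint8_t *dirty, int w, int h)
{
    size_t n = (size_t)w * (size_t)h;
    int *dist = malloc(n * sizeof *dist);
    int *src = malloc(n * sizeof *src);
    assert(dist != NULL && src != NULL);

    nearest_clean(dirty, w, h, dist, src);
    for (size_t p = 0; p < n; p++)
        if (dirty[p] && src[p] >= 0)
            plane[p] = plane[src[p]];

    free(dist);
    free(src);
}
```

Because src is computed from the mask alone, it could be calculated once 
and reused for the R, G and B planes instead of being recomputed per plane 
as in this sketch.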

You find examples of my dust removal at 
http://wwwuser.gwdg.de/~mrickma/sane-proscan-7200/status-110711/. The 
ircleanest.c is in the files.tar.gz. All you have to tell ircleanest is 
the resolution at which the scan was taken. An approximately 5 year old 
Pentium 4, 3.40GHz needs about 14 secs to clean a 4979 * 3330 image 
(slide scanned at 2700 dpi) though gprof reports only 5.09 secs. A two 
year old Phenom x4 64-bit needs 7 - 8 secs with gprof reporting about 
4.75 secs.

Would code for dirt removal be acceptable in a SANE backend or in 
