Local Standard Deviation of an Image

QUESTION: I need to apply a smoothing type kernel across an image, and calculate the standard deviation of the pixels masked by this kernel. How can I do this in IDL without using loops?

ANSWER: The definitive answer to this question, first posed in the IDL Newsgroup in November 2000, was provided by Martin Downing (m.downing@abdn.ac.uk). I extract parts of the discussion here. First, we hear from Martin:

MARTIN: I was wandering through new Craig's IDL archive site (which is brilliant by the way) and came across this question asking for an efficient way of calculating the local standard deviation in an array. It seemed to me that the thread had not reached a full solution so perhaps some of you might be interested in this method which is very fast. It is based on the crafty formula for variance:
   variance = (sum of the squares)/n - (square of the sums)/n*n
(I apologize if this is going over old ground!)

FUNCTION IMAGE_VARIANCE , image, halfWidth, MEAN_IM=av_im,  $
        NEIGHBOURHOOD=NEIGHBOURHOOD,$
        POPULATION_ESTIMATE=POPULATION_ESTIMATE
;+
; NAME:
;
; IMAGE_VARIANCE
;
; PURPOSE:
;
;   This function calculates the local-neighbourhood statistical variance.
;   I.e. for each array element a the variance of the neighbourhood of +- 
;   halfwidth is calculated. The routine avoids any loops and so is fast 
;   and "should" work for any dimension of array.
;
; CATEGORY:
;
; Image Processing
;
; CALLING SEQUENCE:
;
;    Result = IMAGE_VARIANCE(Image, HalfWidth)
;
; INPUTS:
;
;   Image: The array of which we calculate the variance. Can be any dimension.
;
;   HalfWidth: The half width of the NEIGHBOURHOOD, indicates we are
;     looking at a neigborhood +/- N from the pixel in each dimension.
;
; KEYWORD PARAMETERS:
;
;   NEIGHBOURHOOD: Calculate for the NEIGHBOURHOOD only, not the central pixel.
;
;   POPULATION_ESTIMATE: Returns the population estimate of variance, not the
;     sample variance.
;
; OUTPUT:
;
;    Returns an array of same dimensions as input array in which each pixel
;    represents the local variance centred at that position.
;
; OPTIONAL OUTPUTS:
;
;   MEAN_IM: Set to array of local area mean, same dimensionality as input.
;
; RESTRICTIONS:
;
;    Edges are dealt with by replicating border pixels this is likely to
;    give an underestimate of variance in these regions
;
; PROCEDURE:
;
;  Based on the formula for variance:
;
;     var = (sum of the squares)/n - (square of the sums)/n*n
;
; EXAMPLE:
;
; Example of simple statistical-based filter for removing spike-noise
;
;     var_im =  image_variance(image,  5, mean=mean_im, /neigh)
;     zim = (image-mim)/sqrt(var_im)
;     ids = where(zim gt 3, count)
;     if count gt 0 then image[ids] = mean_im[ids]
;
; MODIFICATION HISTORY:
;
;  Written by: Martin Downing, 30th September 2000 (m.downing@abdn.ac.uk).
;-

    ; full mask size as accepted by SMOOTH()
n = halfWidth*2+1

    ; this keyword to SMOOTH() is always set
EDGE_TRUNCATE= 1
 
    ; sample size
m = n^2
    
    ; temporary double image copy to prevent overflow
im = double(image)
 
    ; calc average
av_im = smooth(im, n, EDGE_TRUNCATE=EDGE_TRUNCATE)
 
    ; calc squares image
sq_im = temporary(im)^2
 
    ; average squares
asq_im = smooth(sq_im, n, EDGE_TRUNCATE=EDGE_TRUNCATE)

IF  keyword_set(NEIGHBOURHOOD) THEN BEGIN
        ; remove centre pixel from estimate
        ; calc neighbourhood average (removing centre pixel)
     av_im = (av_im*m - image)/(m-1)
        ; calc neighbourhood average of squares (removing centre pixel)
     asq_im = (asq_im*m - temporary(sq_im))/(m-1)
        ; adjust sample size
     m = m-1
ENDIF

var_im =  temporary(asq_im) - (av_im^2)
IF keyword_set(POPULATION_ESTIMATE) THEN BEGIN
     var_im = var_im *( double(m)/(m-1))
ENDIF

return, var_im

END

After looking over the code, JD Smith had this to add:

JD: Righto. I knew I was fishing for something like this. Except I think you mean:
   population_variance = (sum of the squares)/n - (square of the sums)/n*n
Sample variance (=population_variance*n/(n-1)) is, of course, the more common case in science (as opposed to gambling).

Martin replied:

MARTIN: Sigh - I hear what you are saying, but this was a misunderstanding. I tried to make its use unambiguous by making the default option the absolute variance of the array (n as the denominator), or when POPULATION_ESTIMATE is set, then calculate an estimate of the population from which this dataset is assumed to be a SAMPLE [giving (n-1) as the denominator]. Judging by your reply I failed dismally!

You are right - POPULATION_ESTIMATE is normally termed "sample stdev" and is the equivalent of IDL's variance(x) - but what they mean is that it is an estimator of the popn stdev! Still waiting to try it in the casinos :)

JD also mentioned that he got the original idea for writing code like this from the SIGMA_FILTER program in the NASA Goddard Space Flight Center IDL archive, first written back in 1991.

Martin replied.

Martin: Just checked the file SIGMA_FILTER at the NASA site. I really must spend more time browsing these great sites.

The code is similar, however it does not calculate the true variance under the mask. They calculate for a box width of n, (ignoring centre pixel removal):
   mean_im=(smooth(image, n) )
   dev_im = (image - mean_im)^2
   var_im = smooth(dev_im, n)/(n-1)
This is not the true variance of the pixels under the box mask, as each pixel in the mask is having a different mean subtracted. For example (read this as a formula if you can!):
   Pseudo_Variance = SUM ij ( ( I(x+i,y+j) - MEAN(x+i,y+j)^2) /(n-1)
instead of true variance:
   Variance = SUM ij ( ( I(x+i,y+j) - MEANxy)^2) /(n-1)
which can be reduced to:
   {(SUM ij ( ( I(x+i,y+j)^2 ) - (SUM ij I(x+i,y+j) ) ^2)/n }/(n-1)
hence the non-loop method we use below:
     ; calc box mean
   mean_im = smooth(image, n)
     ; calc box mean of squares
   msq_im = smooth(image^2, n)
     ; hence variance
    var_im = ( msq_im - mean_im^2) * (n/(n-1.0))