Counting Values in Small Regions

QUESTION: I have a 2D array, which contains data values ranging from 1 to 8. I would like to consider a 5-by-5 small region in the image. What I am interested in knowing is how many pixels in this 2D array have a specific value inside this region or window. I would like to perform this operation for each pixel in the 2D array, and I would like to replace the pixel value with the number of pixels in the window having a specific value (that is, a value of 1 through 8). I can think of several ways to do this, but all involve a lot of looping. Can you show me the IDL Way to do this?

ANSWER: OK, let me show you one way I would approach this problem. First, let's create a 2D data set with random pixels set to values 1 through 8. I'll make the data only 10-by-10 so it is easier to see the results.

   array = Fix(cgScaleVector(Randomu(-3L, 10, 10), 1.0, 8.999))

So we can see what we are working on, let's print the data.

   IDL> Print, array, FORMAT='(10I3)'      8  5  7  5  1  8  1  2  2  8
     8  8  7  2  2  7  1  6  1  1      6  1  2  1  3  3  6  6  3  4
     5  2  8  6  5  7  7  4  5  8      6  7  2  3  7  7  1  6  2  4
     4  4  8  4  6  5  3  5  2  2      7  8  4  3  2  5  7  4  1  2
     7  2  6  2  1  8  5  2  7  4      7  3  4  3  5  8  2  2  6  3
     5  4  5  6  8  3  1  5  1  5

Imagine that we now place a 5-by-5 box on top of the image, as in the figure below. The center pixel in the box has a value of 7. We wish to replace this value in the 2D array (actually in a copy of the 2D array) with the number of pixels in this box that have a value of, say, 3. In this case, the answer will be 5, since there are five pixels with the value of 3 inside the box. If we wanted to know how many pixels have the value of 5, the answer in this case would be 3, and so forth. Then we will slide the box one pixel to the right and do the calculation again, etc. We will do the calculation for each pixel in the 2D array.

A 5-by-5 box is placed on top of the 2D array.

This kind of sliding box or window operation is often performed in IDL with a double FOR loop, but, believe me, this is not the IDL Way! In fact, it is painfully slow for 2D arrays of any reasonable size. Rather, we want to use the CONVOL function in IDL to perform this kind of sliding window calculation.

In this kind of convolution, we supply a 5-by-5 kernel and the operation multiplies the kernel times the underlying values in the 2D array, and replaces the center pixel with the total of this multiplication. In other words, the center pixel is replaced by the addition of the 25 underlying multiplications.

This probably doesn't sound particularly helpful yet, but stay with me. Suppose we could create a mask array, such that when the original 2D array is multiplied by the mask array only pixels values with a specified value were retained. For example, if we were interested in the value 5, we could do this.

   IDL> Print, array * mask, FORMAT='(10I3)'      0  5  0  5  0  0  0  0  0  0
     0  0  0  0  0  0  0  0  0  0      0  0  0  0  0  0  0  0  0  0
     5  0  0  0  5  0  0  0  5  0      0  0  0  0  0  0  0  0  0  0
     0  0  0  0  0  5  0  5  0  0      0  0  0  0  0  5  0  0  0  0
     0  0  0  0  0  0  5  0  0  0      0  0  0  0  5  0  0  0  0  0
     5  0  5  0  0  0  0  5  0  5

We could create just such a mask for doing something like this by using the Where function like this.

   s = Size(array, /DIMENSIONS)    mask = BytArr(s[0], s[1])
   indices = Where(array EQ 5, count)    IF count GT 0 THEN mask[indices] = 1

In other words, find the indices in the 2D array where the value is 5. And set those indices in a mask array to the value 1. Leave all the other values in the mask array set to the value 0.

Now, if we create a 5-by-5 kernel, where the value of each element in the kernel is 1/5, then if we perform the multiplication of this 25-element array times the masked 2D array, and add the results (5 times 1/5 will be equal to 1, and 0 times 1/5 will be equal to 0), the total will be equal to the total number of pixels that have the value of 5. In the case of the box shown above, the answer is three.

So, with this general idea in mind, it is possible to write a general purpose function that returns the answer to us. Note that in the code below, we are going to use the EDGE_ZERO keyword to the CONVOL function. This will treat "missing" values in the 2D array as having a value of 0 in the calculations. (Such "missing" values will occur anytime we get the sliding window within less than two pixels from an edge of the 2D array.)

   FUNCTION Small_Region_Values, array, value, KERNEL=kernel 
      Compile_Opt idl2        On_Error, 2 ; Return to caller on error. 
      ; Check parameters.
      IF N_Elements(array) EQ 0 THEN Message, 'Must pass 2D array.'
      IF N_Elements(value) EQ 0 THEN value = 1
      IF N_Elements(kernel) EQ 0 THEN kernel = Replicate(1.0/value, 5, 5) 
      ; Create a mask array.       s = Size(array, /DIMENSIONS)
      mask = BytArr(s[0], s[1])       indices = Where(array EQ 5, count)
      IF count GT 0 THEN mask[indices] = 1 
      RETURN, Convol(Float(array) * mask, kernel, /EDGE_ZERO)     END

Let's try it.

   IDL> Print, Small_Region_Values(array, 5), FORMAT='(10I3)'
     1  2  2  2  1  1  0  0  0  0      2  3  4  3  2  2  2  1  1  1
     2  3  4  3  2  2  2  1  1  1      1  1  2  2  2  3  4  3  2  2
     1  1  2  3  3  4  5  4  2  2      1  1  2  3  4  5  6  5  3  2
     0  0  1  3  4  5  5  4  2  1      2  2  3  4  5  6  6  6  4  3
     2  2  3  3  4  4  4  4  3  2      2  2  3  2  3  3  3  3  3  2

This looks good. Let's see the results side by side with the original 2D array.


Results at right, next to original 2D array.

Return All Values Together

What if you wanted to return the result array for all eight of the possible values? In other words, what if you wanted to return a m-by-n-by-8 answer array, rather than a 2D answer array. In this case, the Where function might be the slow step in the process. It might be better to partition the 2D array with Value_Locate and the Histogram commands, using the Reverse_Indices keyword of Histogram to locate the pixels for the array mask.

Your code might look like this.

    FUNCTION Small_Values, array        Compile_Opt idl2 
      On_Error, 2 ; Return to caller on error.        ; Check parameters.
      IF N_Elements(array) EQ 0 THEN Message, 'Must pass 2D array.' 
      ; Create answer and mask arrays.       s = Size(array, /DIMENSIONS)
      answers = LonArr(s[0], s[1], 8)       mask = BytArr(s[0], s[1])       
      ; Partition the array values.       cutoff = [1,2,3,4,5,6,7,8]
      h = Histogram(Value_Locate(cutoff, array) + 2, REVERSE_INDICES=ri) 
      ; Find the answer arrays.       FOR j=0,7 DO BEGIN
           mask = mask * 0B            kernel = Replicate(1.0/(j+1), 5, 5)
           mask[ri[ri[j]:ri[j+1]-1]] = 1
           answers[0,0,j] = Convol(Float(array) * mask, kernel, /EDGE_ZERO)
      ENDFOR        RETURN, answers     END

Let's check to be sure we get the same answer. Print the answer array for the value 5.

    IDL> Print, (Small_Values(array))[*,*,4], FORMAT='(10I3)'
     1  2  2  2  1  1  0  0  0  0      2  3  4  3  2  2  2  1  1  1
     2  3  4  3  2  2  2  1  1  1      1  1  2  2  2  3  4  3  2  2
     1  1  2  3  3  4  5  4  2  2      1  1  2  3  4  5  6  5  3  2
     0  0  1  3  4  5  5  4  2  1      2  2  3  4  5  6  6  6  4  3
     2  2  3  3  4  4  4  4  3  2      2  2  3  2  3  3  3  3  3  2

Yep. Exactly the same.

Version of IDL used to prepare this article: IDL 7.0.3.