# Counting Values in Small Regions

**QUESTION:** I have a 2D array, which contains data values ranging from 1 to 8. I would
like to consider a 5-by-5 small region in the image. What I am interested in knowing is how many
pixels in this 2D array have a specific value inside this region or window. I would like to perform
this operation for each pixel in the 2D array, and I would like to replace the pixel value with the
number of pixels in the window having a specific value (that is, a value of 1 through 8). I can think
of several ways to do this, but all involve a *lot *of looping. Can you show me the
IDL Way to do this?

** ANSWER:** OK, let me show you one way I would approach this problem. First, let's create
a 2D data set with random pixels set to values 1 through 8. I'll make the data only 10-by-10 so it
is easier to see the results.

array = Fix(cgScaleVector(Randomu(-3L, 10, 10), 1.0, 8.999))

So we can see what we are working on, let's print the data.

IDL> Print, array, FORMAT='(10I3)' 8 5 7 5 1 8 1 2 2 8 8 8 7 2 2 7 1 6 1 1 6 1 2 1 3 3 6 6 3 4 5 2 8 6 5 7 7 4 5 8 6 7 2 3 7 7 1 6 2 4 4 4 8 4 6 5 3 5 2 2 7 8 4 3 2 5 7 4 1 2 7 2 6 2 1 8 5 2 7 4 7 3 4 3 5 8 2 2 6 3 5 4 5 6 8 3 1 5 1 5

Imagine that we now place a 5-by-5 box on top of the image, as in the figure below.
The center pixel in the box has a value of 7. We wish to replace this value in the 2D
array (actually in a *copy *of the 2D array) with the number of pixels in this
box that have a value of, say, 3. In this case, the answer will be 5, since there are
five pixels with the value of 3 inside the box. If we wanted to know how many pixels
have the value of 5, the answer in this case would be 3, and so forth.
Then we will slide the box one pixel to
the right and do the calculation again, etc. We will do the calculation for each pixel
in the 2D array.

A 5-by-5 box is placed on top of the 2D array. |

This kind of sliding box or window operation is often performed in IDL
with a double FOR loop, but, believe me, this is not the IDL Way! In fact,
it is painfully slow for 2D arrays of any reasonable size. Rather, we want to
use the **CONVOL **function in IDL to perform this kind of sliding window calculation.

In this kind of convolution, we supply a 5-by-5 kernel and the operation multiplies the kernel times the underlying values in the 2D array, and replaces the center pixel with the total of this multiplication. In other words, the center pixel is replaced by the addition of the 25 underlying multiplications.

This probably doesn't sound particularly helpful yet, but stay with me. Suppose we could create a mask array, such that when the original 2D array is multiplied by the mask array only pixels values with a specified value were retained. For example, if we were interested in the value 5, we could do this.

IDL> Print, array * mask, FORMAT='(10I3)' 0 5 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 5 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 5 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 5 0 0 0 0 0 5 0 5 0 0 0 0 5 0 5

We could create just such a mask for doing something like this
by using the **Where **function like this.

s = Size(array, /DIMENSIONS) mask = BytArr(s[0], s[1]) indices = Where(array EQ 5, count) IF count GT 0 THEN mask[indices] = 1

In other words, find the indices in the 2D array where the value is 5. And set those indices in a mask array to the value 1. Leave all the other values in the mask array set to the value 0.

Now, if we create a 5-by-5 kernel, where the value of each element in the kernel is 1/5, then if we perform the multiplication of this 25-element array times the masked 2D array, and add the results (5 times 1/5 will be equal to 1, and 0 times 1/5 will be equal to 0), the total will be equal to the total number of pixels that have the value of 5. In the case of the box shown above, the answer is three.

So, with this general idea in mind, it is possible to write a
general purpose
function that returns the answer to us. Note that in the code below, we are going
to use the **EDGE_ZERO **keyword to the **CONVOL **function. This will treat "missing"
values in the 2D array as having a value of 0 in the calculations. (Such "missing" values
will occur anytime we get the sliding window within less than two pixels from an edge
of the 2D array.)

FUNCTION Small_Region_Values, array, value, KERNEL=kernel Compile_Opt idl2 On_Error, 2 ; Return to caller on error. ; Check parameters. IF N_Elements(array) EQ 0 THEN Message, 'Must pass 2D array.' IF N_Elements(value) EQ 0 THEN value = 1 IF N_Elements(kernel) EQ 0 THEN kernel = Replicate(1.0/value, 5, 5) ; Create a mask array. s = Size(array, /DIMENSIONS) mask = BytArr(s[0], s[1]) indices = Where(array EQ 5, count) IF count GT 0 THEN mask[indices] = 1 RETURN, Convol(Float(array) * mask, kernel, /EDGE_ZERO) END

Let's try it.

IDL> Print, Small_Region_Values(array, 5), FORMAT='(10I3)' 1 2 2 2 1 1 0 0 0 0 2 3 4 3 2 2 2 1 1 1 2 3 4 3 2 2 2 1 1 1 1 1 2 2 2 3 4 3 2 2 1 1 2 3 3 4 5 4 2 2 1 1 2 3 4 5 6 5 3 2 0 0 1 3 4 5 5 4 2 1 2 2 3 4 5 6 6 6 4 3 2 2 3 3 4 4 4 4 3 2 2 2 3 2 3 3 3 3 3 2

This looks good. Let's see the results side by side with the original 2D array.

Results at right, next to original 2D array. |

## Return All Values Together

What if you wanted to return the result array for all eight of the possible values?
In other words, what if you wanted to return a m-by-n-by-8 answer array, rather than a 2D
answer array. In this case, the **Where **function might be the slow step in the process.
It might be better to partition the 2D array with
**Value_Locate **and the **Histogram **commands, using the **Reverse_Indices **
keyword of **Histogram **to locate the pixels for the array mask.

Your code might look like this.

FUNCTION Small_Values, array Compile_Opt idl2 On_Error, 2 ; Return to caller on error. ; Check parameters. IF N_Elements(array) EQ 0 THEN Message, 'Must pass 2D array.' ; Create answer and mask arrays. s = Size(array, /DIMENSIONS) answers = LonArr(s[0], s[1], 8) mask = BytArr(s[0], s[1]) ; Partition the array values. cutoff = [1,2,3,4,5,6,7,8] h = Histogram(Value_Locate(cutoff, array) + 2, REVERSE_INDICES=ri) ; Find the answer arrays. FOR j=0,7 DO BEGIN mask = mask * 0B kernel = Replicate(1.0/(j+1), 5, 5) mask[ri[ri[j]:ri[j+1]-1]] = 1 answers[0,0,j] = Convol(Float(array) * mask, kernel, /EDGE_ZERO) ENDFOR RETURN, answers END

Let's check to be sure we get the same answer. Print the answer array for the value 5.

IDL> Print, (Small_Values(array))[*,*,4], FORMAT='(10I3)' 1 2 2 2 1 1 0 0 0 0 2 3 4 3 2 2 2 1 1 1 2 3 4 3 2 2 2 1 1 1 1 1 2 2 2 3 4 3 2 2 1 1 2 3 3 4 5 4 2 2 1 1 2 3 4 5 6 5 3 2 0 0 1 3 4 5 5 4 2 1 2 2 3 4 5 6 6 6 4 3 2 2 3 3 4 4 4 4 3 2 2 2 3 2 3 3 3 3 3 2

Yep. Exactly the same.

*Version of IDL used to prepare this article: IDL 7.0.3.*

Copyright © 2009 David W. Fanning

Last Updated 15 August 2009