# Examples To Illuminate the IDL Way

## Contents

- Replicate two-element array 1000 times.
- Create upper triangular matrix.
- Change values in array according to appearance order.
- Average arrays with missing data.
- Create median image from three images.
- Pixel color transformation applied to entire image.
- Determining the duration of events in a time series.
- Divide array equally based on sums.
- Zeroing columns of an array.
- Multiply a Vector by a Matrix.
- Increment an Array Subset by Another Array Subset

## Replicate a Two-element Array 1000 Times

**QUESTION:** Suppose I have an array, say *a = [2,3]*,
and i wish to repeat these two elements say 1000
times, so my final array is 2000 elements long with alternating 2s and 3s. Is there a quick way
to do this in IDL? At the
moment I am using a **FOR **loop and it is *extremely *slow.

** ANSWER:** You can find the answer to this question in a careful
reading the the Dimensional Juggling Tutorial.

a = [2,3] b = Reform(Rebin(a, 2, 1000), 2000) Print, b 2 3 2 3 2 3 2 3 2 3 ....

## Create an Upper Triangular Matrix

**QUESTION:** Is there an easy way to make an upper triangular
matrix (i.e. make all entries below the diagonal) zero ?

** ANSWER:** You need to multiply your array times a mask of the same
size in which all the values on the diagonal on above it are ones and all
the values below the diagonal are zeros. Here is how you would make such a
mask in IDL. Consider a 5-by-5 array:

IDL> n = 5 IDL> i = REBIN(LINDGEN(n), n, n) IDL> j = REBIN(TRANSPOSE(LINDGEN(n)), n, n) IDL> Print, i 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 IDL> Print, j 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 IDL> mask = (i GE j) IDL> Print, mask 1 1 1 1 1 0 1 1 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1

## Change Values in Array According to Appearance Order

**QUESTION:** I would like to change the values in an array according to the order
in which they appear in the array. For example, if I had this array:

[3,6,2,1,2,8,1,1]

I would like to obtain an array like this:

[1,2,3,4,3,6,4,4]

Here the value 3 is the first value in the array and is mapped to 1. The value 6 is the second value in the array and is mapped to 2, and do on. Multiple values are mapped to the value in which they first appear.

I guess this process is similar to a lookup table. But, since I would like to process large arrays ( of over 60000 elements), I'm looking for an efficient algorithm. I don't really know what to look for in IDL to find something which could help.

** ANSWER:** In situations like this, you always look to the
Histogram function. Here is a solution
that uses the reverse indices of the **Histogram **command to set the proper values
of the array you are looking for.

FUNCTION OccuranceOrder, array ; Error handling. On_Error, 2 IF N_Elements(array) EQ 0 THEN Message, 'Must pass an array of integer values.' ; Take histogram of array. Use REVERSE_INDICES to get index values. h = Histogram(array, Reverse_Indices=ri, Min=0) ; Create index array and output array. b = Indgen(N_Elements(h) > N_Elements(array)) + 1 c = Intarr(N_Elements(h) > N_Elements(array)) ; Find values in index array. Set MIN value. FOR j=0,N_Elements(h)-1 DO BEGIN IF ri[j+1] NE ri[j] THEN $ c[ri[ri[j]:ri[j+1]-1]] = Min(b[ri[ri[j]:ri[j+1]-1]]) ENDFOR ; Trim the array and return it. RETURN, c[0:N_Elements(array)-1] END

Call the program like this:

IDL> a = [3,6,2,1,2,8,1,1] IDL> Print, OccuranceOrder(a) 1 2 3 4 3 6 4 4

This solution brought objections from some folks on the newsgroup, who argued that with some data sets "rankings" could be skipped. (This is a reminder to people to check their algorithms with more than one data set!) For example, what about this:

IDL> a = [1,3,3,1,2,3,2,2,1,2,3] IDL> Print, Test(a) 1 2 2 1 5 2 5 5 1 5 2

There is no "ranking" of 3 or 4 in this output. What if the person who posed the original question did not want to skip rankings like this? (He didn't, as it turned out.)

Paolo Grigis proposed an alternative solution, based on this one. Here is his explanation:

[My solution uses] a loop, but this should not be too bad as long as you don't have more than a few million elements, provided that we just do simple operations in the loop.

The idea here is to use

Histogramto scout for the first occurrence of each integer number and store it inlistlowestind. We then sort the list, such that we know which element occurs first, and then with a second loop we fill the output array with the occurrences of each element. Since we have sorted the array first, we know which rank to assign to each element. This should be moderately efficient. I assumed that the array a has integer elements starting from 1, if not the case, just sum1-min(a)to the array before performing the operation.

Paolo's solution looks like this:

FUNCTION OccuranceOrder, array ; Error handling. On_Error, 2 IF N_Elements(array) EQ 0 THEN Message, 'Must pass an array of integer values.' ; Set up output variable c = LonArr(N_Elements(array)) ; Take histogram of array. Use REVERSE_INDICES to get index values. h = Histogram(array, Reverse_Indices=ri, Min=1L) listlowestind = Replicate(-1L, N_Elements(h)) ; Find the lowest index for each bin element. FOR i=0L,n_elements(h)-1 DO BEGIN IF ri[i] NE ri[i+1] THEN listlowestind[i]=ri[ri[i]] ENDFOR ; Some indexes are not used. Screen those out. indok=where(listlowestind GT -1L,countindok) indoffset=n_elements(listlowestind)-countindok ; Sort the list of indices. indsorted=(sort(listlowestind)) ; Do the assignment. FOR i=0L,countindok-1 DO BEGIN this_element=array[listlowestind[indsorted[i+indoffset]]] c[ri[ri[this_element-1]:ri[this_element]-1]]=i+1 ENDFOR RETURN, c END

Running Paolo's program:

IDL> a = [1,3,3,1,2,3,2,2,1,2,3] IDL> Print, OccuranceOrder(a) 1 2 2 1 3 2 3 3 1 3 2

This turned out to be what the original requester wanted, and he reported it was fast enough for his 60K arrays.

## Average Arrays with Missing Data

**QUESTION:** I have two arrays. Each array has floating point values and missing
data (*value=0*). I want to create a third array that has the average of
the two arrays if there are two good values. Otherwise, I want the
third array to take the value of the array that has good data.

The basic idea is this:

If A = 0, C = B If B = 0, C = A If A and B EQ 0, C = 0 If A and B NE 0, C = (A+B)/2

** ANSWER:** I provided the obvious solution:

C = (A + B) / 2 indices = WHERE(A EQ 0 AND B NE 0, count) IF count GT 0 THEN C[indices] = B[indices] indices = WHERE(A NE 0 AND B EQ 0, count) IF count GT 0 THEN C[indices] = A[indices]

But Craig Markwardt provided the IDL solution.

Here's is a solution without **WHERE**'s:

MISSING = 0.0 C = (A+B)/((A NE MISSING) + (B NE MISSING))

You'll get *NaN *whenever both values are missing. You can
convert these *NaN *values to 0 like this:

indices = Where(Finite(C, /NAN), count) IF count GT 0 THEN C[indices] = 0.0

Or, James Kuyper points out, you can avoid the problem like this:

MISSING = 0.0 C = (A+B)/((A NE MISSING) + (B NE MISSING) > 1.0)

This is easily
extendible to the case where you have *N* arrays with *M* values each.
Just arrange them into an *MxN *array,

DATA = DBLARR(M,N) ... fill data values ... C = TOTAL(DATA,2)/(TOTAL(DATA NE MISSING,2) > 1.0)

## Create a Median Image from Three Original Images

**QUESTION:** I have three images. I would like to create a fourth image with the median value
of the three images. Is there a way to do this without using a loop?

**ANSWER: ** Yes. First construct a three-dimensional image array from your three images.
Consult the Array Concatenation Tutorial if you are
uncertain what all these brackets mean.

imgarray = [ [[img1]], [[img2]], [[img3]] ]

Next, total the images in the third dimension.

totalImage = Total(imgarray, 3)

The median value is just the total image value minus the minimum value and the maximum value for the three images. The rest of the code looks like this:

minValue = Min(imgarray, Dimension=3) maxValue = Max(imgarray, Dimension=3) medianImage = totalImage - minValue - maxValue

## Apply a Pixel Transformation to an Entire Image

**QUESTION:** I have a color transformation that I want to apply to an RGB pixel in
the following way.

m = Findgen(3, 3) rgb = Bindgen(3) result = Transpose(m ## rbg)

The result now contains the new values in the same RGB format. My question is this. How do I execute this efficiently on a pixel interleaved (3,col,row) RGB image?

**ANSWER: ** The answer is provided in an IDL newsgroup (9 January 2006) article by Dick Jackson.

To do this with consistency when there is more than one pixel, you might want to flip things around. Consider this.

IDL> m = Findgen(3,3) IDL> rgb = Bindgen(3) IDL> Print, Transpose(m ## rgb) 5.00000 14.0000 23.0000 IDL> Print, Transpose(rgb ## m) 5.00000 14.0000 23.0000

You have the same result, but this formulation can be extended to a (3,n) array, as follows.

col = 40 row = 50 rgb = Bindgen(3, col, row)

Reform the RGB array into a (3,n) array.

rgb = Reform(rgb, 3, Long(col)*row, /Overwrite)

Now apply the transformation.

result = rgb ## Transpose(m)

Now, reform the arrays back to (3,col,row).

rgb = Reform(rbg, 3, col, row, /Overwrite) result = Reform(result, 3, col, row, /Overwrite)

There you go!

## Determining the Duration of Events in a Time Series

**QUESTION:** A task that occurs quite regularly in hydrology is to determine the
duration of events. For example, I might want to know the longest contiguous duration of
a stream flow above or below a certain discharge. Assuming I have an equidistant time series
(e.g., one value a day), this basically reduces to the question of how can I transform an
array like this:

series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]

into an array like this:

duration = [2,1,3,2]

Basically, I want to be able to count all the contiguous fields of "1"s in my array.

The way I am doing this now (using **CONVOL **and **WHERE**) is exceedingly tedious and slow. Do you
have any ideas for a cool way of doing this faster in IDL?

**ANSWER: ** The answer is provided in an IDL newsgroup (20 January 2006) thread,
entitled "Cool Way to Detemine Durations in Time Series", by Ben Tupper and
Dick Jackson.

First, Ben Tupper.

I think I would use a combination of

LABEL_REGIONandHISTOGRAM. The trick is to first buffer your series data with “background” values. Otherwise,LABEL_REGIONwill have trouble at the edges of your array. The code will look like this.IDL> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1] IDL> nSeries = N_Elements(series) IDL> buffered = [0, series, 0] IDL> labelled = Fix(Label_Region(buffered)) IDL> labelled = (Temporary(labelled))[1:nSeries] IDL> Print, labelled 1 1 0 0 0 0 2 0 3 3 3 0 0 4 4Then, run the labelled array through the

HISTOGRAMfunction, like this.IDL> duration = Histogram(labelled, MIN=1) IDL> Print, duration 2 1 3 2

Ben's slick method inspired Dick Jackson to think of another way to accomplish the same thing.

Well, I have to say Ben's method looks pretty cool, but there is another way to approach this. Starting with Ben's

bufferedarray, we can find the transitions from 0 to 1, or from 1 to 0, like this. First, find out where the transitions occur.IDL> whereChange = Where(buffered[1:*] NE buffered, nChange)Then, reform the

whereChangeindex array into a two column array.IDL> whereChange = Reform(whereChange, 2, nChange/2, /Overwrite) IDL> Print, whereChange 0 2 6 7 8 11 13 15If we measure the difference between the numbers in the right column and the numbers in the left column, we have the

durationarray.IDL> duration = Reform(whereChange[1,*] - whereChange[0,*]) IDL> Print, duration 2 1 3 2

Dick then ran some test with large arrays of random data to see which method was faster. Here are his results:

Number of Elements | Time (Ben's Method) | Time (Dick's Method) |

100,000 | 0.0620 sec | 0.0310 sec |

1,000,000 | 0.5160 sec | 0.3900 sec |

## Divide Array Equally Based on Sums

**QUESTION:** I would like a way to divide an
array of values into two bins, such that the sum of each bin is roughly
equal. The values have no fixed distribution, so I expect the bin sizes to
be non-uniform.

**ANSWER:** Probably **Where **will serve you well here:

IDL> t = Total(array, /CUMULATIVE) IDL> bin1 = Where(t LT t[n_elements(array)-1]/2, COMPLEMENT=bin2)

## Zeroing Columns of an Array

**QUESTION:** I was hoping that there was a nice way to do the following. I have a
4D array and I want to check if the fourth dimension contains a 0 in any
of it's values for each value of the other three dimensions, if it does I
want that whole fourth dimensional column set to 0.

This is how I'm doing it with a loop.

FOR i=0, 1 DO BEGIN FOR k = 0, 359 DO BEGIN FOR j = 0, 5 DO BEGIN test = where(array[i,j,k,*] eq 0) IF max(test) gt -1 THEN array[i,j,k,*] = 0 ENDFOR ENDFOR ENDFOR

This method is obviously horrible and slow. Any help/advice would be greatly appreciated.

**ANSWER:** Heinz Stege provides the answer to this question.

There is no need to use a loop here. Consider the following data array, which has the same size as your data set.

IDL> array = (RandomU(-3L, 2, 6, 360, 42) - 0.1) > 0

Since the column you are interested in is the last column, you can simply
reform the rest of the data into a single column along side the column you
are interested in. The *Overwrite *keyword is used just to avoid creating
a temporary array that is set to zeros. It makes the operation faster.

IDL> array = Reform(array, N_Elements(array)/42, 42, /Overwrite) IDL> Help, array ARRAY FLOAT = Array[4320, 42]

Now you simply find the zeros in the second column of the array, and if you find any, set that column to zero. When you are finished, you reconsitute the original array.

IDL> zeroIndices = Where( Min(array, DIMENSION=2) EQ 0.0), count) IDL> IF count GT 0 THEN array[zeroIndices, *] = 0.0 IDL> array = Reform(array, 2, 6, 360, 42, /Overwrite)

## Multiply a Vector by a Matrix

**QUESTION:** Let's say we have a vector A and matrix
array B so that:

A = [1, 2, 5, 2, 3] B = [[2, 4], [1, 9]]

I want to multiply each element of the vector A with the matrix array B but WITHOUT using a FOR loop. Is that possible? I need to do so in order to save execution time.

The output result should be an array C with dimensions C[2, 2, 5, 1].

**ANSWER:** Alain le Cacheux provides the answer.

The following bit of code will do it. To optimize the operation, set the *Over*
keyword to 1, but be aware that this will change the dimensions of B. If you set
the *Over *keyword to 0, the dimensions of B will be unchanged, but a temporary
copy of B will be created.

C = reform(reform(B, 4, OVER=over) # A, 2, 2, 5)

## Increment an Array Subset by Another Array Subset

**QUESTION:** I have a sorted list of IDs and the subsets they belong to.

ids = [11, 21, 45, 50, 56, 416, 430] sub = [A, A, B, A, B, B, A]

I want to increment the IDs of A based on the number of elements in subset B that lie in front of it. So for example , the 4th element will get incremented by 1 (50+1), since 1 element of B is in front. The last element will get incremented by 3 (430+3), since 3 elements of B lie in front of it. The first 2 elements of A don't get incremented.

I feel like the answer lies somewhere with Where, Total, or Histogram, but the solution escapes me.

**ANSWER:** Jeremy Bailin provides the solution, provided without comment.

increment = Total(sub EQ B, /Cumulative, /Integer) * (sub EQ A) ids += increment

Copyright © 2006 David W. Fanning

Written: 22 January 2006

Last Updated: 12 December 2012