Examples To Illuminate the IDL Way
Contents
- Replicate two-element array 1000 times.
- Create upper triangular matrix.
- Change values in array according to appearance order.
- Average arrays with missing data.
- Create median image from three images.
- Pixel color transformation applied to entire image.
- Determining the duration of events in a time series.
- Divide array equally based on sums.
- Zeroing columns of an array.
- Multiply a Vector by a Matrix.
- Increment an Array Subset by Another Array Subset
Replicate a Two-element Array 1000 Times
QUESTION: Suppose I have an array, say a = [2,3], and i wish to repeat these two elements say 1000 times, so my final array is 2000 elements long with alternating 2s and 3s. Is there a quick way to do this in IDL? At the moment I am using a FOR loop and it is extremely slow.
ANSWER: You can find the answer to this question in a careful reading the the Dimensional Juggling Tutorial.
a = [2,3] b = Reform(Rebin(a, 2, 1000), 2000) Print, b 2 3 2 3 2 3 2 3 2 3 ....
Create an Upper Triangular Matrix
QUESTION: Is there an easy way to make an upper triangular matrix (i.e. make all entries below the diagonal) zero ?
ANSWER: You need to multiply your array times a mask of the same size in which all the values on the diagonal on above it are ones and all the values below the diagonal are zeros. Here is how you would make such a mask in IDL. Consider a 5-by-5 array:
IDL> n = 5 IDL> i = REBIN(LINDGEN(n), n, n) IDL> j = REBIN(TRANSPOSE(LINDGEN(n)), n, n) IDL> Print, i 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 IDL> Print, j 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 IDL> mask = (i GE j) IDL> Print, mask 1 1 1 1 1 0 1 1 1 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1
Change Values in Array According to Appearance Order
QUESTION: I would like to change the values in an array according to the order in which they appear in the array. For example, if I had this array:
[3,6,2,1,2,8,1,1]
I would like to obtain an array like this:
[1,2,3,4,3,6,4,4]
Here the value 3 is the first value in the array and is mapped to 1. The value 6 is the second value in the array and is mapped to 2, and do on. Multiple values are mapped to the value in which they first appear.
I guess this process is similar to a lookup table. But, since I would like to process large arrays ( of over 60000 elements), I'm looking for an efficient algorithm. I don't really know what to look for in IDL to find something which could help.
ANSWER: In situations like this, you always look to the Histogram function. Here is a solution that uses the reverse indices of the Histogram command to set the proper values of the array you are looking for.
FUNCTION OccuranceOrder, array ; Error handling. On_Error, 2 IF N_Elements(array) EQ 0 THEN Message, 'Must pass an array of integer values.' ; Take histogram of array. Use REVERSE_INDICES to get index values. h = Histogram(array, Reverse_Indices=ri, Min=0) ; Create index array and output array. b = Indgen(N_Elements(h) > N_Elements(array)) + 1 c = Intarr(N_Elements(h) > N_Elements(array)) ; Find values in index array. Set MIN value. FOR j=0,N_Elements(h)-1 DO BEGIN IF ri[j+1] NE ri[j] THEN $ c[ri[ri[j]:ri[j+1]-1]] = Min(b[ri[ri[j]:ri[j+1]-1]]) ENDFOR ; Trim the array and return it. RETURN, c[0:N_Elements(array)-1] END
Call the program like this:
IDL> a = [3,6,2,1,2,8,1,1] IDL> Print, OccuranceOrder(a) 1 2 3 4 3 6 4 4
This solution brought objections from some folks on the newsgroup, who argued that with some data sets "rankings" could be skipped. (This is a reminder to people to check their algorithms with more than one data set!) For example, what about this:
IDL> a = [1,3,3,1,2,3,2,2,1,2,3] IDL> Print, Test(a) 1 2 2 1 5 2 5 5 1 5 2
There is no "ranking" of 3 or 4 in this output. What if the person who posed the original question did not want to skip rankings like this? (He didn't, as it turned out.)
Paolo Grigis proposed an alternative solution, based on this one. Here is his explanation:
[My solution uses] a loop, but this should not be too bad as long as you don't have more than a few million elements, provided that we just do simple operations in the loop.
The idea here is to use Histogram to scout for the first occurrence of each integer number and store it in listlowestind. We then sort the list, such that we know which element occurs first, and then with a second loop we fill the output array with the occurrences of each element. Since we have sorted the array first, we know which rank to assign to each element. This should be moderately efficient. I assumed that the array a has integer elements starting from 1, if not the case, just sum 1-min(a) to the array before performing the operation.
Paolo's solution looks like this:
FUNCTION OccuranceOrder, array ; Error handling. On_Error, 2 IF N_Elements(array) EQ 0 THEN Message, 'Must pass an array of integer values.' ; Set up output variable c = LonArr(N_Elements(array)) ; Take histogram of array. Use REVERSE_INDICES to get index values. h = Histogram(array, Reverse_Indices=ri, Min=1L) listlowestind = Replicate(-1L, N_Elements(h)) ; Find the lowest index for each bin element. FOR i=0L,n_elements(h)-1 DO BEGIN IF ri[i] NE ri[i+1] THEN listlowestind[i]=ri[ri[i]] ENDFOR ; Some indexes are not used. Screen those out. indok=where(listlowestind GT -1L,countindok) indoffset=n_elements(listlowestind)-countindok ; Sort the list of indices. indsorted=(sort(listlowestind)) ; Do the assignment. FOR i=0L,countindok-1 DO BEGIN this_element=array[listlowestind[indsorted[i+indoffset]]] c[ri[ri[this_element-1]:ri[this_element]-1]]=i+1 ENDFOR RETURN, c END
Running Paolo's program:
IDL> a = [1,3,3,1,2,3,2,2,1,2,3] IDL> Print, OccuranceOrder(a) 1 2 2 1 3 2 3 3 1 3 2
This turned out to be what the original requester wanted, and he reported it was fast enough for his 60K arrays.
Average Arrays with Missing Data
QUESTION: I have two arrays. Each array has floating point values and missing data (value=0). I want to create a third array that has the average of the two arrays if there are two good values. Otherwise, I want the third array to take the value of the array that has good data.
The basic idea is this:
If A = 0, C = B If B = 0, C = A If A and B EQ 0, C = 0 If A and B NE 0, C = (A+B)/2
ANSWER: I provided the obvious solution:
C = (A + B) / 2 indices = WHERE(A EQ 0 AND B NE 0, count) IF count GT 0 THEN C[indices] = B[indices] indices = WHERE(A NE 0 AND B EQ 0, count) IF count GT 0 THEN C[indices] = A[indices]
But Craig Markwardt provided the IDL solution.
Here's is a solution without WHERE's:
MISSING = 0.0 C = (A+B)/((A NE MISSING) + (B NE MISSING))
You'll get NaN whenever both values are missing. You can convert these NaN values to 0 like this:
indices = Where(Finite(C, /NAN), count) IF count GT 0 THEN C[indices] = 0.0
Or, James Kuyper points out, you can avoid the problem like this:
MISSING = 0.0 C = (A+B)/((A NE MISSING) + (B NE MISSING) > 1.0)
This is easily extendible to the case where you have N arrays with M values each. Just arrange them into an MxN array,
DATA = DBLARR(M,N) ... fill data values ... C = TOTAL(DATA,2)/(TOTAL(DATA NE MISSING,2) > 1.0)
Create a Median Image from Three Original Images
QUESTION: I have three images. I would like to create a fourth image with the median value of the three images. Is there a way to do this without using a loop?
ANSWER: Yes. First construct a three-dimensional image array from your three images. Consult the Array Concatenation Tutorial if you are uncertain what all these brackets mean.
imgarray = [ [[img1]], [[img2]], [[img3]] ]
Next, total the images in the third dimension.
totalImage = Total(imgarray, 3)
The median value is just the total image value minus the minimum value and the maximum value for the three images. The rest of the code looks like this:
minValue = Min(imgarray, Dimension=3) maxValue = Max(imgarray, Dimension=3) medianImage = totalImage - minValue - maxValue
Apply a Pixel Transformation to an Entire Image
QUESTION: I have a color transformation that I want to apply to an RGB pixel in the following way.
m = Findgen(3, 3) rgb = Bindgen(3) result = Transpose(m ## rbg)
The result now contains the new values in the same RGB format. My question is this. How do I execute this efficiently on a pixel interleaved (3,col,row) RGB image?
ANSWER: The answer is provided in an IDL newsgroup (9 January 2006) article by Dick Jackson.
To do this with consistency when there is more than one pixel, you might want to flip things around. Consider this.
IDL> m = Findgen(3,3) IDL> rgb = Bindgen(3) IDL> Print, Transpose(m ## rgb) 5.00000 14.0000 23.0000 IDL> Print, Transpose(rgb ## m) 5.00000 14.0000 23.0000
You have the same result, but this formulation can be extended to a (3,n) array, as follows.
col = 40 row = 50 rgb = Bindgen(3, col, row)
Reform the RGB array into a (3,n) array.
rgb = Reform(rgb, 3, Long(col)*row, /Overwrite)
Now apply the transformation.
result = rgb ## Transpose(m)
Now, reform the arrays back to (3,col,row).
rgb = Reform(rbg, 3, col, row, /Overwrite) result = Reform(result, 3, col, row, /Overwrite)
There you go!
Determining the Duration of Events in a Time Series
QUESTION: A task that occurs quite regularly in hydrology is to determine the duration of events. For example, I might want to know the longest contiguous duration of a stream flow above or below a certain discharge. Assuming I have an equidistant time series (e.g., one value a day), this basically reduces to the question of how can I transform an array like this:
series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
into an array like this:
duration = [2,1,3,2]
Basically, I want to be able to count all the contiguous fields of "1"s in my array.
The way I am doing this now (using CONVOL and WHERE) is exceedingly tedious and slow. Do you have any ideas for a cool way of doing this faster in IDL?
ANSWER: The answer is provided in an IDL newsgroup (20 January 2006) thread, entitled "Cool Way to Detemine Durations in Time Series", by Ben Tupper and Dick Jackson.
First, Ben Tupper.
I think I would use a combination of LABEL_REGION and HISTOGRAM. The trick is to first buffer your series data with “background” values. Otherwise, LABEL_REGION will have trouble at the edges of your array. The code will look like this.
IDL> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1] IDL> nSeries = N_Elements(series) IDL> buffered = [0, series, 0] IDL> labelled = Fix(Label_Region(buffered)) IDL> labelled = (Temporary(labelled))[1:nSeries] IDL> Print, labelled 1 1 0 0 0 0 2 0 3 3 3 0 0 4 4Then, run the labelled array through the HISTOGRAM function, like this.
IDL> duration = Histogram(labelled, MIN=1) IDL> Print, duration 2 1 3 2
Ben's slick method inspired Dick Jackson to think of another way to accomplish the same thing.
Well, I have to say Ben's method looks pretty cool, but there is another way to approach this. Starting with Ben's buffered array, we can find the transitions from 0 to 1, or from 1 to 0, like this. First, find out where the transitions occur.
IDL> whereChange = Where(buffered[1:*] NE buffered, nChange)Then, reform the whereChange index array into a two column array.
IDL> whereChange = Reform(whereChange, 2, nChange/2, /Overwrite) IDL> Print, whereChange 0 2 6 7 8 11 13 15If we measure the difference between the numbers in the right column and the numbers in the left column, we have the duration array.
IDL> duration = Reform(whereChange[1,*] - whereChange[0,*]) IDL> Print, duration 2 1 3 2
Dick then ran some test with large arrays of random data to see which method was faster. Here are his results:
Number of Elements | Time (Ben's Method) | Time (Dick's Method) |
100,000 | 0.0620 sec | 0.0310 sec | 1,000,000 | 0.5160 sec | 0.3900 sec |
Divide Array Equally Based on Sums
QUESTION: I would like a way to divide an array of values into two bins, such that the sum of each bin is roughly equal. The values have no fixed distribution, so I expect the bin sizes to be non-uniform.
ANSWER: Probably Where will serve you well here:
IDL> t = Total(array, /CUMULATIVE) IDL> bin1 = Where(t LT t[n_elements(array)-1]/2, COMPLEMENT=bin2)
Zeroing Columns of an Array
QUESTION: I was hoping that there was a nice way to do the following. I have a 4D array and I want to check if the fourth dimension contains a 0 in any of it's values for each value of the other three dimensions, if it does I want that whole fourth dimensional column set to 0.
This is how I'm doing it with a loop.
FOR i=0, 1 DO BEGIN FOR k = 0, 359 DO BEGIN FOR j = 0, 5 DO BEGIN test = where(array[i,j,k,*] eq 0) IF max(test) gt -1 THEN array[i,j,k,*] = 0 ENDFOR ENDFOR ENDFOR
This method is obviously horrible and slow. Any help/advice would be greatly appreciated.
ANSWER: Heinz Stege provides the answer to this question.
There is no need to use a loop here. Consider the following data array, which has the same size as your data set.
IDL> array = (RandomU(-3L, 2, 6, 360, 42) - 0.1) > 0
Since the column you are interested in is the last column, you can simply reform the rest of the data into a single column along side the column you are interested in. The Overwrite keyword is used just to avoid creating a temporary array that is set to zeros. It makes the operation faster.
IDL> array = Reform(array, N_Elements(array)/42, 42, /Overwrite) IDL> Help, array ARRAY FLOAT = Array[4320, 42]
Now you simply find the zeros in the second column of the array, and if you find any, set that column to zero. When you are finished, you reconsitute the original array.
IDL> zeroIndices = Where( Min(array, DIMENSION=2) EQ 0.0), count) IDL> IF count GT 0 THEN array[zeroIndices, *] = 0.0 IDL> array = Reform(array, 2, 6, 360, 42, /Overwrite)
Multiply a Vector by a Matrix
QUESTION: Let's say we have a vector A and matrix array B so that:
A = [1, 2, 5, 2, 3] B = [[2, 4], [1, 9]]
I want to multiply each element of the vector A with the matrix array B but WITHOUT using a FOR loop. Is that possible? I need to do so in order to save execution time.
The output result should be an array C with dimensions C[2, 2, 5, 1].
ANSWER: Alain le Cacheux provides the answer.
The following bit of code will do it. To optimize the operation, set the Over keyword to 1, but be aware that this will change the dimensions of B. If you set the Over keyword to 0, the dimensions of B will be unchanged, but a temporary copy of B will be created.
C = reform(reform(B, 4, OVER=over) # A, 2, 2, 5)
Increment an Array Subset by Another Array Subset
QUESTION: I have a sorted list of IDs and the subsets they belong to.
ids = [11, 21, 45, 50, 56, 416, 430] sub = [A, A, B, A, B, B, A]
I want to increment the IDs of A based on the number of elements in subset B that lie in front of it. So for example , the 4th element will get incremented by 1 (50+1), since 1 element of B is in front. The last element will get incremented by 3 (430+3), since 3 elements of B lie in front of it. The first 2 elements of A don't get incremented.
I feel like the answer lies somewhere with Where, Total, or Histogram, but the solution escapes me.
ANSWER: Jeremy Bailin provides the solution, provided without comment.
increment = Total(sub EQ B, /Cumulative, /Integer) * (sub EQ A) ids += increment
Copyright © 2006 David W. Fanning
Written: 22 January 2006
Last Updated: 12 December 2012