Simple question in IDL, looking for solution, thank you [message #81769] |
Mon, 22 October 2012 00:07  |
Danxia
Messages: 10 Registered: September 2012
|
Junior Member |
|
|
Dear all, please see my question below and let me know if you have a solution.
For example, my array [arr] =
5 2 3
1 8 3
1 2 3
There is another array [bcg] of the same shape, which holds an occurrence count for the corresponding element of [arr]:
[bcg] =
1 2 3
2 1 4
2 3 5
How can I get the total occurrence count for each value in [arr], in sorted order, by summing the matching entries of [bcg]? For example: (2+2) (2+3) (3+4+5) (0) (1) (0) (0) (1)
which equals 4 5 12 0 1 0 0 1, meaning 4 occurrences of 1, 5 of 2, 12 of 3, 0 of 4, 1 of 5, 0 of 6, 0 of 7 and 1 of 8.
I appreciate any replies. Thanks.
Danxia
|
|
|
Re: Simple question in IDL, looking for solution, thank you [message #81842 is a reply to message #81769] |
Tue, 23 October 2012 15:21  |
Heinz Stege
Messages: 189 Registered: January 2003
|
Senior Member |
|
|
On Tue, 23 Oct 2012 15:57:20 -0400, Jeremy Bailin wrote:
> A couple of notes:
>
> JBIU has a weighted histogram function:
> http://astroconst.org/jbiu/jbiu-doc/math/histogram_weight.html
>
This is really great. I have learned something new again. Thank you,
Jeremy.
For the record: Jeremy's "chunk indexing" approach goes as
follows:
h=histogram(arr,min=0,reverse_indices=ri)
sum=lonarr(size(h,/dimensions))
for i=0l,n_elements(h)-1 do $
if h[i] gt 0 then sum[i]=total(bcg[ri[ri[i]:ri[i+1]-1]],/integer)
print,sum[1:*]
Small code, very fast, and low memory consumption. This is perfect.
Cheers, Heinz
> Regarding reverse_indices using lots of memory on sparse histograms: use
> VALUE_LOCATE!
> http://www.idlcoyote.com/code_tips/valuelocate.html
>
> -Jeremy.
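To make the role of reverse_indices in the snippet above more concrete, here is a minimal sketch using the flattened example data from the original question. It assumes the usual layout of the ri vector (first n_elements(h)+1 entries are offsets into ri itself, the rest are indices into arr grouped by bin). It is meant to be saved as a main-level program and started with .run; it only illustrates the idea and is not a replacement for the code above.

```idl
; Example data from the original question, flattened to 1-D
arr = [5,2,3,1,8,3,1,2,3]
bcg = [1,2,3,2,1,4,2,3,5]

; One bin per integer value 0..max(arr); ri receives the reverse indices
h = histogram(arr, min=0, reverse_indices=ri)

; ri[i] and ri[i+1] delimit the slice of ri that lists the positions
; in arr falling into bin i. An empty bin has ri[i] eq ri[i+1].
for i = 0L, n_elements(h)-1 do begin
  if ri[i+1] gt ri[i] then begin
    idx = ri[ri[i]:ri[i+1]-1]          ; positions in arr whose value is i
    print, i, total(bcg[idx], /integer), format='(I3, I6)'
  endif
endfor
end
```

For this data only the values 1, 2, 3, 5 and 8 occur, so five lines are printed, pairing them with the sums 4, 5, 12, 1 and 1 from the original question.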
|
|
|
Re: Simple question in IDL, looking for solution, thank you [message #81843 is a reply to message #81769] |
Tue, 23 October 2012 15:16  |
Heinz Stege
Messages: 189 Registered: January 2003
|
Senior Member |
|
|
On Tue, 23 Oct 2012 20:07:48 +0200, I wrote:
[...]
> print,histogram(arr,weight=bcg,/integer,min=1)
>
> This would be nice. By the way, when I type the line above, IDL
> (Version 8.0.1) says:
>
> % Keyword INTEGER not allowed in call to: HISTOGRAM
> % Error occurred at: $MAIN$
> % Execution halted at: $MAIN$
>
> No integer keyword allowed in the histogram function? Strange! ;-)
>
What I wrote above was absolute nonsense. (I am happy not to be an
operator in a nuclear power station.) Please read:
print,histogram(arr,weight=bcg,min=1)
Period. (Note that such a function has already been written in the IDL
language as part of JBIU; see the previous posting.)
Heinz
|
|
|
Re: Simple question in IDL, looking for solution, thank you [message #81846 is a reply to message #81769] |
Tue, 23 October 2012 12:57  |
Jeremy Bailin
Messages: 618 Registered: April 2008
|
Senior Member |
|
|
On 10/23/12 2:07 PM, Heinz Stege wrote:
> On Mon, 22 Oct 2012 14:10:18 -0400, Jeremy Bailin wrote:
>
>> On 10/22/12 7:55 AM, Heinz Stege wrote:
>>> Hi Danxia,
>>>
>>> you didn't ask for a solution without a loop. So here is my simple
>>> answer:
>>>
>>> arr=[5,2,3,1,8,3,1,2,3]
>>> bcg=[1,2,3,2,1,4,2,3,5]
>>> sum=intarr(max(arr)+1)
>>> for i=0,n_elements(bcg)-1 do sum[arr[i]]+=bcg[i]
>>> print,sum[1:*]
>>>
>>> Cheers, Heinz
>>>
>>
>> And of course, if you need a very efficient implementation of this (i.e.
>> if your arrays have millions of elements), then read the "chunk
>> indexing" section of JD's HISTOGRAM tutorial
>> http://www.idlcoyote.com/tips/histogram_tutorial.html (you HAVE read
>> JD's HISTOGRAM tutorial, right???)
>>
>> -Jeremy.
>
>
> Hi Jeremy,
>
> I suppose you mean something like the following:
>
> h=histogram(total(bcg,/cumulative,/integer)-1,/binsize,min=0,reverse_indices=ri)
> i=ri[0:n_elements(h)-1]-ri[0]
> print,histogram(arr[i],min=1)
>
> The histogram methods in general are very smart. The above code is
> significantly faster than mine, which contains the loop. However, from
> my point of view, this is not a good solution.
>
> If arr (and bcg) contain very many elements and/or bcg contains large
> numbers, the reverse indices array ri gets very large. The size of
> ri is always greater than total(bcg). IDL may run out of memory.
>
> So I would say, the loop may compete with the reverse indices.
>
> When I wrote "simple answer", I had in mind that there must be another
> solution. One without a loop. It is more the "IDL-style". But it is a
> little bit more complex:
>
> ii=sort(arr)
> sarr=arr[ii]
> tot=total(bcg[ii],/cumulative,/integer)
> ;
> ii=where(sarr ne shift(sarr,-1),count)
> if count eq 0 then ii=[n_elements(sarr)-1]
> tot=tot[ii]
> if count ge 2 then tot[1:*]-=tot
> ;
> sum=lonarr(sarr[n_elements(sarr)-1]+1)
> sum[sarr[ii]]=tot
> ;
> print,sum[1:*]
>
> This code has a moderate memory consumption and seems to be a real
> alternative to both the loop method and the reverse-indices method.
>
> A word to the developers of IDL: What about a WEIGHT keyword in the
> histogram function?
>
> print,histogram(arr,weight=bcg,/integer,min=1)
>
> This would be nice. By the way, when I type the line above, IDL
> (Version 8.0.1) says:
>
> % Keyword INTEGER not allowed in call to: HISTOGRAM
> % Error occurred at: $MAIN$
> % Execution halted at: $MAIN$
>
> No integer keyword allowed in the histogram function? Strange! ;-)
>
> Cheers, Heinz
>
A couple of notes:
JBIU has a weighted histogram function:
http://astroconst.org/jbiu/jbiu-doc/math/histogram_weight.html
Regarding reverse_indices using lots of memory on sparse histograms: use
VALUE_LOCATE!
http://www.idlcoyote.com/code_tips/valuelocate.html
-Jeremy.
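A rough sketch of how the VALUE_LOCATE hint might be applied to the example from this thread. The idea, as I read it, is to map each value in arr to a dense index into the sorted list of distinct values, so the histogram gets one bin per value that actually occurs rather than one per integer between min and max. The variable names vals, bin and sum are my own, and this is only one possible reading of the tip, not necessarily what the linked page does.

```idl
; Example data from the original question, flattened to 1-D
arr = [5,2,3,1,8,3,1,2,3]
bcg = [1,2,3,2,1,4,2,3,5]

; Sorted list of the distinct values actually present in arr
vals = arr[uniq(arr, sort(arr))]

; Map every element of arr to its position in vals (a dense bin index)
bin = value_locate(vals, arr)

; Histogram over the dense indices: only n_elements(vals) bins, none empty
h = histogram(bin, min=0, max=n_elements(vals)-1, reverse_indices=ri)

; Sum the weights per bin using the reverse indices
sum = lonarr(n_elements(vals))
for i = 0L, n_elements(vals)-1 do $
  sum[i] = total(bcg[ri[ri[i]:ri[i+1]-1]], /integer)

print, vals
print, sum
```

For this data vals comes out as 1 2 3 5 8 and sum as 4 5 12 1 1. Values that never occur (4, 6, 7) simply have no entry, which is where the memory saving comes from when the values are widely spread.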
|
|
|