Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66280] |
Wed, 06 May 2009 04:52 |
vino
Hello Jeremy,
Thanks for your idea here. Just when I was wondering how to do this,
I found this post.
Thanks to the OP as well!
Regards,
Vino
On May 4, 7:15 pm, Jeremy Bailin <astroco...@gmail.com> wrote:
> On May 4, 11:12 am, "chenb...@gmail.com" <chenb...@gmail.com> wrote:
>
>>> Jeremy,
>
>>> Thanks for your kind and prompt help!
>>> It took my own routine 18 hours to do the job. I have just plugged the
>>> code you kindly offered into my own; I'll let you know how
>>> efficient your routine is. Thanks!
>
>>> Bo
>
>> Hi Jeremy,
>
>> Your code saved me 7 hours! That's a lot. Thanks!
>
>> Bo
>
> No problem! Glad it helped.
>
> -Jeremy.
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66298 is a reply to message #66280] |
Mon, 04 May 2009 11:15  |
Jeremy Bailin
On May 4, 11:12 am, "chenb...@gmail.com" <chenb...@gmail.com> wrote:
>> Jeremy,
>
>> Thanks for your kind and prompt help!
>> It took my own routine 18 hours to do the job. I have just plugged the
>> code you kindly offered into my own; I'll let you know how
>> efficient your routine is. Thanks!
>
>> Bo
>
> Hi Jeremy,
>
> Your code saved me 7 hours! That's a lot. Thanks!
>
> Bo
No problem! Glad it helped.
-Jeremy.
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66300 is a reply to message #66298] |
Mon, 04 May 2009 08:12  |
chenbo09@gmail.com
On May 3, 12:47 pm, "chenb...@gmail.com" <chenb...@gmail.com> wrote:
> Jeremy,
>
> Thanks for your kind and prompt help!
> It took my own routine 18 hours to do the job. I have just plugged the
> code you kindly offered into my own; I'll let you know how
> efficient your routine is. Thanks!
>
> Bo
Hi Jeremy,
Your code saved me 7 hours! That's a lot. Thanks!
Bo
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66311 is a reply to message #66300] |
Sun, 03 May 2009 10:54  |
chenbo09@gmail.com
On May 2, 7:47 pm, guillermo.castilla.castell...@gmail.com wrote:
> If you don't have that ORD function Jeremy pointed out handy (I didn't
> know of it), and assuming your array is of byte type, you can do the
> following:
>
> input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
> [3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]
>
> keep = Where(Histogram(1000L*input[1,*]+input[2,*], rev=r) GT 0)
> keep = r[r[keep]]
> print, input[*,keep[sort(keep)]]
> 1 10 9 100 200
> 2 11 8 101 201
> 4 7 12 99 199
>
> Cheers
>
> Guillermo
Hi Guillermo,
Thanks for your kind suggestion! Have a nice weekend!
Bo
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66312 is a reply to message #66311] |
Sun, 03 May 2009 10:50  |
chenbo09@gmail.com
On May 2, 7:47 pm, guillermo.castilla.castell...@gmail.com wrote:
> If you don't have that ORD function Jeremy pointed out handy (I didn't
> know of it), and assuming your array is of byte type, you can do the
> following:
>
> input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
> [3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]
>
> keep = Where(Histogram(1000L*input[1,*]+input[2,*], rev=r) GT 0)
> keep = r[r[keep]]
> print, input[*,keep[sort(keep)]]
> 1 10 9 100 200
> 2 11 8 101 201
> 4 7 12 99 199
>
> Cheers
>
> Guillermo
Hi Guillermo,
Thanks for your suggestion! Have a nice weekend!
Bo
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66313 is a reply to message #66312] |
Sun, 03 May 2009 10:47  |
chenbo09@gmail.com
On May 2, 7:57 pm, Jeremy Bailin <astroco...@gmail.com> wrote:
> You can find ord at:
>
> http://web.astroconst.org/jbiu/jbiu-doc/math/ord.html
>
> -Jeremy.
Jeremy,
Thanks for your kind and prompt help!
It took my own routine 18 hours to do the job. I have just plugged the
code you kindly offered into my own; I'll let you know how
efficient your routine is. Thanks!
Bo
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66318 is a reply to message #66313] |
Sat, 02 May 2009 17:57  |
Jeremy Bailin
On May 2, 8:47 pm, guillermo.castilla.castell...@gmail.com wrote:
> If you don't have that ORD function Jeremy pointed out handy (I didn't
> know of it), and assuming your array is of byte type, you can do the
> following:
>
> input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
> [3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]
>
> keep = Where(Histogram(1000L*input[1,*]+input[2,*], rev=r) GT 0)
> keep = r[r[keep]]
> print, input[*,keep[sort(keep)]]
> 1 10 9 100 200
> 2 11 8 101 201
> 4 7 12 99 199
>
> Cheers
>
> Guillermo
You can find ord at:
http://web.astroconst.org/jbiu/jbiu-doc/math/ord.html
-Jeremy.
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66319 is a reply to message #66318] |
Sat, 02 May 2009 17:47  |
guillermo.castilla.ca
On May 1, 12:36 pm, Jeremy Bailin <astroco...@gmail.com> wrote:
> On May 1, 1:47 pm, Jeremy Bailin <astroco...@gmail.com> wrote:
>
>
>
>> On May 1, 12:13 pm, "chenb...@gmail.com" <chenb...@gmail.com> wrote:
>
>>> Hello, everyone!
>
>>> Does anyone know of an IDL routine that can efficiently remove
>>> duplicate elements from a multi-dimensional array? I'm
>>> now working with huge arrays, and I have written one myself; it
>>> works, but it is inefficient.
>
>>> example of my problem:
>>> the input array:
>>> 1,10,9,100,200
>>> 2,11,8,101,201
>>> 2,11,8,101,201
>>> 3,10,9,100,200
>>> 4,7,12,99,199
>>> 2,11,8,101,201
>
>>> goal:
>>> remove the rows that repeat the second- and third-column values of an
>>> earlier row.
>
>>> expected output:
>>> 1,10,9,100,200
>>> 2,11,8,101,201
>>> 4,7,12,99,199
>
>>> Thanks for your help!
>
>>> Bo
>
If you don't have that ORD function Jeremy pointed out handy (I didn't
know of it), and assuming your array is of byte type (so that every
value is below 1000 and 1000L*column2 + column3 is unique for each
pair), you can do the following:

input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
[3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]

keep = Where(Histogram(1000L*input[1,*]+input[2,*], rev=r) GT 0)
keep = r[r[keep]]
print, input[*,keep[sort(keep)]]
1 10 9 100 200
2 11 8 101 201
4 7 12 99 199
Cheers
Guillermo
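
The fixed multiplier 1000L in the post above keeps the two columns apart only while the third-column values stay below 1000. A minimal sketch of the same idea with the multiplier taken from the data follows. It assumes both key columns hold non-negative integers and that the combined key fits in a 32-bit long; the names c2, c3, key and firstrow are illustrative and not from the thread.

```idl
input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
         [3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]

c2 = long(reform(input[1,*]))           ; second column as a vector
c3 = long(reform(input[2,*]))           ; third column as a vector
key = c2*(max(c3) + 1L) + c3            ; distinct (c2,c3) pairs give distinct keys

h = histogram(key, reverse_indices=r)   ; unit-width bins from min(key) to max(key)
firstrow = r[r[where(h gt 0)]]          ; first row index in each occupied bin
print, input[*, firstrow[sort(firstrow)]]   ; restore the original row order
```

Because the histogram spans every integer between min(key) and max(key), widely spread keys cost memory even when few rows exist, which is the situation the ORD-based mapping is meant to avoid.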
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66329 is a reply to message #66319] |
Fri, 01 May 2009 11:36  |
Jeremy Bailin
On May 1, 1:47 pm, Jeremy Bailin <astroco...@gmail.com> wrote:
> How's this:
>
> input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
> [3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]
>
> ; Step 1: Map your columns 2 and 3 into a single unique index
> ; (requires ORD from JBIU):
> col2ord = ord(input[1,*])
> col3ord = ord(input[2,*])
> index = col2ord + (max(col2ord)+1)*col3ord
>
> ; Step 2: Use histogram to find which ones have the same unique index
> h = histogram(index, reverse_indices=ri)
>
> ; Step 3: Get the first one in each bin, and put back in sorted order
> keep = ri[ri[where(h gt 0)]]
> keep = keep[sort(keep)]
>
> ; Step 4: Print them out:
> print, input[*,keep]
>
> 1 10 9 100 200
> 2 11 8 101 201
> 4 7 12 99 199
>
> -Jeremy.
Incidentally, if you're dealing with huge arrays and run into memory
problems with the histogram, you can replace:

index = col2ord + (max(col2ord)+1)*col3ord

with

index = ord(col2ord + (max(col2ord)+1)*col3ord)

which will make the histogram as compact as possible.
-Jeremy.
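
The saving can be seen by counting bins before and after compaction. The sketch below is only an illustration: the index values are what Step 1 would give for the example input if ORD returns zero-based ranks among the distinct values (an assumption about JBIU, not confirmed in this thread), and the compaction is done with the built-ins UNIQ and VALUE_LOCATE rather than with ord itself.

```idl
; Illustrative index values for the six example rows (see assumption above)
index = [4L, 2L, 2L, 4L, 6L, 2L]

; HISTOGRAM with the default unit bin size allocates one bin per integer
; between min(index) and max(index)
print, max(index) - min(index) + 1

; Compaction with built-ins: replace each value by its rank among the
; distinct values, so that every bin is occupied
compact = value_locate(index[uniq(index, sort(index))], index)
print, max(compact) - min(compact) + 1
```

For these values the first count is 5 and the second is 3, one bin per distinct key. On real data with sparse keys the difference can be far larger.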
|
|
|
Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66330 is a reply to message #66329] |
Fri, 01 May 2009 10:47  |
Jeremy Bailin
On May 1, 12:13 pm, "chenb...@gmail.com" <chenb...@gmail.com> wrote:
> Hello, everyone!
>
> Does anyone know of an IDL routine that can efficiently remove
> duplicate elements from a multi-dimensional array? I'm
> now working with huge arrays, and I have written one myself; it
> works, but it is inefficient.
>
> example of my problem:
> the input array:
> 1,10,9,100,200
> 2,11,8,101,201
> 2,11,8,101,201
> 3,10,9,100,200
> 4,7,12,99,199
> 2,11,8,101,201
>
> goal:
> remove the rows that repeat the second- and third-column values of an
> earlier row.
>
> expected output:
> 1,10,9,100,200
> 2,11,8,101,201
> 4,7,12,99,199
>
> Thanks for your help!
>
> Bo
How's this:

input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
[3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]

; Step 1: Map your columns 2 and 3 into a single unique index
; (requires ORD from JBIU):
col2ord = ord(input[1,*])
col3ord = ord(input[2,*])
index = col2ord + (max(col2ord)+1)*col3ord

; Step 2: Use histogram to find which ones have the same unique index
h = histogram(index, reverse_indices=ri)

; Step 3: Get the first one in each bin, and put back in sorted order
keep = ri[ri[where(h gt 0)]]
keep = keep[sort(keep)]

; Step 4: Print them out:
print, input[*,keep]

1 10 9 100 200
2 11 8 101 201
4 7 12 99 199
-Jeremy.
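
For readers who cannot install JBIU, Step 1 above can be approximated with built-in routines. This is a minimal sketch, not the JBIU implementation: ord_sketch is a hypothetical name, and it assumes ORD returns each element's zero-based rank among the distinct values of the array.

```idl
; Hypothetical stand-in for JBIU's ORD (assumption: ORD returns, for each
; element, its zero-based rank among the distinct values of the array).
function ord_sketch, x
  compile_opt idl2
  u = x[uniq(x, sort(x))]      ; distinct values in ascending order
  return, value_locate(u, x)   ; index in u of each element of x
end
```

With the example array, ord_sketch(input[1,*]) should map 10,11,11,10,7,11 to 1,2,2,1,0,2, so under the assumption above it could take the place of ord in Step 1. Behaviour for a column with a single distinct value is worth checking before relying on it.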
|
|
|