comp.lang.idl-pvwave archive: archive » Re: remove duplicate elements from a multi-dimensional array efficiently in IDL

Re: remove duplicate elements from a multi-dimensional array efficiently in IDL [message #66329 is a reply to message #66319]

Fri, 01 May 2009 11:36

Jeremy Bailin

On May 1, 1:47 pm, Jeremy Bailin <astroco...@gmail.com> wrote:
> On May 1, 12:13 pm, "chenb...@gmail.com" <chenb...@gmail.com> wrote:
>
>> Hello, everyone!
>
>> Does anyone know of an IDL routine that can efficiently remove
>> duplicate elements from a multi-dimensional array? I'm
>> now working with huge arrays, and I have written one myself; it
>> works, but it is inefficient.
>
>> example of my problem:
>> the input array:
>> 1,10,9,100,200
>> 2,11,8,101,201
>> 2,11,8,101,201
>> 3,10,9,100,200
>> 4,7,12,99,199
>> 2,11,8,101,201
>
>> goal:
>> remove the rows that duplicate an earlier row's values in the second
>> and third columns.
>
>> expected output:
>> 1,10,9,100,200
>> 2,11,8,101,201
>> 4,7,12,99,199
>
>> Thanks for your help!
>
>> Bo
>
> How's this:
>
> input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
> [3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]
>
> ; Step 1: Map your columns 2 and 3 into a single unique index
> ; (requires ORD from JBIU):
> col2ord = ord(input[1,*])
> col3ord = ord(input[2,*])
> index = col2ord + (max(col2ord)+1)*col3ord
>
> ; Step 2: Use histogram to find which ones have the same unique index
> h = histogram(index, reverse_indices=ri)
>
> ; Step 3: Get the first one in each bin, and put back in sorted order
> keep = ri[ri[where(h gt 0)]]
> keep = keep[sort(keep)]
>
> ; Step 4: Print them out:
> print, input[*,keep]
>
> 1 10 9 100 200
> 2 11 8 101 201
> 4 7 12 99 199
>
> -Jeremy.
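
For anyone puzzled by the ri[ri[...]] idiom in steps 2 and 3, here is a small sketch of how REVERSE_INDICES is laid out. The index values are hypothetical: they are what the example input would give if ORD ranks the distinct values starting from zero, which is an assumption about ORD and not something shown in the post.

```idl
; Hypothetical combined index for the six example rows
; (assumes ORD ranks distinct values as 0, 1, 2, ...).
index = [4, 2, 2, 4, 6, 2]

; With default settings the bins have width 1 and start at min(index).
h = histogram(index, reverse_indices=ri)

; The first n_elements(h)+1 entries of ri are offsets into ri itself.
; ri[ri[b] : ri[b+1]-1] lists the elements of index that fell in bin b,
; so ri[ri[b]] is the first such element. It is only valid when h[b] gt 0.
for b = 0, n_elements(h)-1 do begin
  if h[b] gt 0 then $
    print, 'bin ', b, ' holds elements ', ri[ri[b]:ri[b+1]-1]
endfor

; First member of every non-empty bin, restored to row order.
keep = ri[ri[where(h gt 0)]]
keep = keep[sort(keep)]
print, 'rows kept: ', keep
end
```

Save it as a main-level program and run it with .run. Empty bins are skipped by the "h gt 0" test, which is why the same test appears in step 3.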

Incidentally, if you're dealing with huge arrays and run into memory
problems with the histogram, you can replace:

index = col2ord + (max(col2ord)+1)*col3ord

with

index = ord(col2ord + (max(col2ord)+1)*col3ord)

which will make the histogram as compact as possible.
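
If you don't have JBIU installed, the following is a minimal sketch of the same pipeline using only built-in routines. RANK_OF_DISTINCT is a hypothetical stand-in for ORD, written on the assumption that ORD maps each value to its rank among the distinct values; it is not the JBIU source.

```idl
; Hypothetical stand-in for ORD (assumed behaviour): map every value
; to its rank among the distinct values, i.e. 0, 1, 2, ...
function rank_of_distinct, x
  compile_opt idl2
  n = n_elements(x)
  if n eq 1 then return, [0L]
  s = sort(x)                      ; indices that put x in ascending order
  xs = x[s]
  ; 1 wherever a sorted value differs from the one before it
  step = [0L, long(xs[1:*] ne xs[0:n-2])]
  r = total(step, /cumulative, /integer)
  result = lonarr(n)
  result[s] = r                    ; scatter ranks back to original order
  return, result
end

; Main-level program: same example as in the post.
input = [[1,10,9,100,200],[2,11,8,101,201],[2,11,8,101,201],$
         [3,10,9,100,200],[4,7,12,99,199],[2,11,8,101,201]]

c2 = rank_of_distinct(input[1,*])
c3 = rank_of_distinct(input[2,*])

; Re-ranking the combined index leaves no empty bins in the histogram.
index = rank_of_distinct(c2 + (max(c2)+1)*c3)

h = histogram(index, reverse_indices=ri)
keep = ri[ri[where(h gt 0)]]
keep = keep[sort(keep)]
print, input[*,keep]
end
```

The price of the extra ranking step is one more sort over the full array, so it is probably only worth it when the uncompacted histogram would not fit in memory.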

-Jeremy.
