comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Re: Finding strings values common to two (large!) arrays [message #92085 is a reply to message #92081] Wed, 07 October 2015 12:57
rjp23
CMSET_OP looks to be working but I'm not 100% sure due to this comment in the header:

; INDEX - if set, then return a list of indices instead of the array
; values themselves. The "slower" set operations are always
; performed in this case.
;
; The indices refer to the *combined* array [A,B]. To
; clarify, in the following call: I = CMSET_OP(..., /INDEX);
; returned values from 0 to NA-1 refer to A[I], and values
; from NA to NA+NB-1 refer to B[I-NA].


When I use the code like this, it returns an array of indices that seem to relate only to the first array.

e.g. (massively simplified) A has 10 elements, B has 20, and the returned indices are an array of 7 values such as [0,1,2,5,7,8,9]

Going by the statement in the header, shouldn't indices for the elements in the second array (10-29) be returned as well?
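
To make the header's convention concrete, here is a minimal sketch of how a returned index array could be split into its A part and its B part. It uses only the built-in WHERE function; the sizes and the index values are the simplified ones from the example above, and the variable names (na, nb, i, idx_a, idx_b) are just illustrative assumptions, not anything taken from CMSET_OP itself.

```idl
; Hypothetical sizes and result, taken from the simplified example above
na = 10                         ; number of elements in A
nb = 20                         ; number of elements in B
i  = [0, 1, 2, 5, 7, 8, 9]      ; stand-in for the indices returned with /INDEX

; Per the header, values 0..NA-1 refer to A[I]
in_a = where(i lt na, n_in_a)
if n_in_a gt 0 then idx_a = i[in_a]

; Per the header, values NA..NA+NB-1 refer to B[I-NA]
in_b = where(i ge na, n_in_b)
if n_in_b gt 0 then idx_b = i[in_b] - na

print, 'Entries pointing into A: ', n_in_a
print, 'Entries pointing into B: ', n_in_b
```

With the example values every entry is below NA, so all seven land in the A group and none in the B group, which is exactly the behaviour I am asking about.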



On Wednesday, October 7, 2015 at 4:21:31 PM UTC+1, rj...@le.ac.uk wrote:
> The IDs are of the form: 2009042300230430180019
>
> I think that's too long to convert into a number (at least when I try to turn it into a long it ends up very different!)
>
> CMSET_OP looks like it's what I need. Thanks both :-)
>
>
> On Wednesday, October 7, 2015 at 4:09:29 PM UTC+1, wlandsman wrote:
>> I second Helder's suggestions but have two additional points to consider:
>>
>> 1. Does your array A have duplicate values? If so, do you want to find the indices of all the values, even if they are repeated? Then I would suggest using
>>
>> http://idlastro.gsfc.nasa.gov/ftp/pro/misc/match2.pro
>>
>> which will return every matching index, even for duplicate values.
>>
>> 2. You say the arrays are "numerical IDs in string format". Are you able to convert these strings into numerical values? If so, the matching algorithms work faster for numerical arrays (especially integers) than for strings. I do suspect the speed difference is not important unless you have to do the matching many times.
>>
>> --Wayne
>>
>> On Wednesday, October 7, 2015 at 10:45:55 AM UTC-4, Helder wrote:
>>> On Wednesday, October 7, 2015 at 4:13:59 PM UTC+2, rj...@le.ac.uk wrote:
>>>> I have arrays of numerical IDs in string format.
>>>>
>>>> I want to find all of the indices in Array A that contain a value that is present anywhere in Array B.
>>>>
>>>> The arrays are both quite large (>1 million values) so a loop is out of the question and them being strings complicates it as well.
>>>>
>>>> Any IDL Way tips?
>>>
>>> Interesting... I guess that a set operation will do; in other words, you want to find (A) AND (B).
>>> Did you look at David's page:
>>> https://www.idlcoyote.com/tips/set_operations.html
>>>
>>> There are some good tips, among which is Craig's CMSET_OP, which also works on strings (but does not return indices...).
>>>
>>> I hope it helps.
>>>
>>> Cheers,
>>> Helder