comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Home » Public Forums » archive » a serving of value_locate with a side of histogram
Show: Today's Messages :: Show Polls :: Message Navigator
E-mail to friend 
Switch to threaded view of this topic Create a new topic Submit Reply
a serving of value_locate with a side of histogram [message #67115] Sun, 21 June 2009 21:06
Jeremy Bailin is currently offline  Jeremy Bailin
Messages: 618
Registered: April 2008
Senior Member
On Jun 21, 3:46 pm, Michael Galloy <mgal...@gmail.com> wrote:
> David Fanning wrote:
>> Jeremy Bailin writes:
>
>>> Yeah, value_locate is very handy for problems like this! I
>>> particularly like using it as a precursor to histogram - i.e. if you
>>> want to do something fancy using reverse_indices but don't have
>>> uniformly-spaced bins, first use value_locate to get integer indices
>>> and then use histogram to do the heavy lifting.
>
>> All right, I'll bite. Let's see an example of this.
>> Maybe you can write an article and become the JD Smith
>> of Value_Locate. :-)
>
> No article, but I think this is what Jeremy is talking about:
>
> IDL> ; get some random data
> IDL> d = randomu(12345678L, 20)
> IDL> print, d
>       0.765989    0.0234537     0.589727     0.535102     0.982231
> 0.693016     0.328147
>       0.295642     0.849918     0.592262     0.558133     0.534926
> 0.541119     0.594831
>       0.410172     0.928598     0.161021     0.928724     0.952072
> 0.522173
>
> IDL> ; specify cutoffs
> IDL> cutoffs = [0.3, 0.4, 0.8]
>
> IDL> ; compute index of "bin" to put each value into
> IDL> bins = value_locate(cutoffs, d) + 1L
> IDL> print, ind
>             2           0           2           2           3
> 2           1           0
>             3           2           2           2           2
> 2           2           3
>             0           3           3           2
>
> IDL> ; compute histogram of bins
> IDL> h = histogram(bins, reverse_indices=r)
> IDL> print, h
>             3           1          11           5
>
> IDL> ; values less than 0.3
> IDL> print, d[r[r[0]:r[1] - 1]]
>      0.0234537     0.295642     0.161021
>
> IDL> ; values between 0.3 and 0.4
> IDL> print, d[r[r[1]:r[2] - 1]]
>       0.328147
>
> IDL> ; values between 0.4 and 0.8
> IDL> print, d[r[r[2]:r[3] - 1]]
>       0.765989     0.589727     0.535102     0.693016     0.592262
> 0.558133     0.534926
>       0.541119     0.594831     0.410172     0.522173
>
> IDL> ; values greater than 0.8
> IDL> print, d[r[r[3]:r[4] - 1]]
>       0.982231     0.849918     0.928598     0.928724     0.952072
>
> Mike
> --www.michaelgalloy.com
> Associate Research Scientist
> Tech-X Corporation

Yes, that's exactly the sort of thing. Here's another good example:

Let's say you have 100,000 3D data points with each coordinate lying
between 0 and 1, and you want to divide up the space into a grid
10,000 x 10,000 x 10,000 and determine which grid cells contain more
than one point. It sounds like a good job for HIST_ND:

h = hist_nd([[x],[y],[z]], min=0., max=1., nbins=10000)
print, array_indices(h, where(h gt 1))

The only problem is that it would require an array of one trillion
elements that takes up 4TB of memory!

But in reality, at least 99.99% of those cells must be empty. First,
let's get a list of which cells contain any points:

xbin = floor(x / 1e-4)
ybin = floor(y / 1e-4)
zbin = floor(z / 1e-4)
bin = x + 10000ULL*(y + 10000ULL*z)
sortedbin = bin[sort(bin)]
uniquebins = sortedbin[uniq(sortedbin)]

Now we can use uniquebins as a mapping function. The values of "bin"
can range from 0 to 999999999999, but each one of those appears in
"uniquebins", which has at most 100000 elements. So we replace each
element in bin with *its location within uniquebins*:

mappedbin = value_locate(uniquebins, bin)

All elements of mappedbin are integers between 0 and 10000, which
makes a perfectly reasonable histogram:

h = histogram(mappedbin, min=0)

The grid locations with multiple points are then:

multipleindex = where(h gt 1, nmulti)
if nmulti gt 0 then begin
xmulti = multipleindex mod 10000ull
ymulti = (multipleindex mod 100000000ull) / 10000ull
zmulti = multipleindex / 100000000ull
endif

...and, of course, if you want to do anything with the data points
that fall in those bins, you can do everything you'd normally do using
reverse_indices.

This is exactly the technique I used in WITHINSPHRAD_VEC3D from this
discussion:
http://groups.google.com/group/comp.lang.idl-pvwave/browse_t hread/thread/8be6763806211e75/e38f27055c7fa6e0?lnk=gst&q =withinsphrad_vec3d#e38f27055c7fa6e0

I'll try to put together a bigger value_locate article later this
week. :-)=

-Jeremy.
  Switch to threaded view of this topic Create a new topic Submit Reply
Previous Topic: base widget resize event
Next Topic: Re: a serving of value_locate with a side of histogram

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ] [ PDF ]

Current Time: Fri Oct 10 09:11:19 PDT 2025

Total time taken to generate the page: 0.79925 seconds