comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Home » Public Forums » archive » Chunk Array Decimation
Show: Today's Messages :: Show Polls :: Message Navigator
E-mail to friend 
Return to the default flat view Create a new topic Submit Reply
Re: Chunk Array Decimation [message #32464 is a reply to message #32377] Mon, 07 October 2002 08:51 Go to previous messageGo to previous message
Jaco van Gorkom is currently offline  Jaco van Gorkom
Messages: 97
Registered: November 2000
Member
"JD Smith" <jdsmith@as.arizona.edu> wrote in message
news:pan.2002.10.04.22.07.45.664757.24772@as.arizona.edu...
> ... Here's a test run with 1,000,000
> elements, each index repeated 20 times on average:
>
> Literal Accumulate Loop: 1.2411
> Reverse Indices Loop: 0.7217
> Loop-Free with Sparse Arrays: 1.1401
> FDDRIZZLE Loop: 0.7815
> Dual Histogram Loop: 0.5490
> Thinned WHERE Histogram Loop: 0.8422
> Literal Accumulate: Compiled DLM : 0.0288

JD, we mortals need a tutorial or two in order to even begin
to understand that Sparse Array trick you pulled there.
My humble contribution to this thread is to propose an
additional and hopefully simpler (or, more intuitive)
algorithm: let us first sort the data array based on the index
array, and then use a cumulative total to do the chunking.
In pseudo-code:
sortInds = SORT(inds)
totData = TOTAL(data[sortInds], /CUMULATIVE)
uniqInds = UNIQ(inds[sortInds])
subTotData = [0., totData[uniqInds]]
vec7 = subTotData[1:*] - subTotData[0:*-1]
leaving some minor issues like non-occurring indices
unresolved.

The above algorithm is by far not fast enough because of the
very slow SORT() operation. If this is replaced by a sorting
routine which is more optimised for the problem at hand, such
as, well, I'll let you guess;) , then things get much better:
h = HISTOGRAM(inds, REVERSE_INDICES=ri)
nh = N_ELEMENTS(h)
sortData = data[ inds[ri[nh+1:*]] ]
totSortData = [0., TOTAL(sortData, /CUMULATIVE)]
vec8 = totSortData[ri[1:nh]-nh-1] - $
totSortData[ri[0:nh-1]-nh-1]

On my machine this seems to be always slightly faster than the
double histogram loop.

Cheers,
Jaco


P.S. Some timings:

For 1,000,000 elements, each index repeated 20 times on
average, on a single PIII 1GHz, W2K:

Literal Accumulate Loop: 1.2778

Reverse Indices Loop: 0.9794

Loop-Free with Sparse Arrays: 1.2589

FDDRIZZLE Loop: 0.9543

Dual Histogram Loop: 0.6329

Thinned WHERE Histogram Loop: 1.0506

Simple sorting plus cumulative total: 2.6347

Histogram plus cumulative total: 0.6119



And for each index repeated only once on average:



Literal Accumulate Loop: 1.3920

Reverse Indices Loop: 8.1998 (?)

Loop-Free with Sparse Arrays: 1.8647

FDDRIZZLE Loop: 1.7135

Dual Histogram Loop: 1.3129

Thinned WHERE Histogram Loop: 1.7996

Simple sorting plus cumulative total: 2.8701

Histogram plus cumulative total: 1.2488
[Message index]
 
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Previous Topic: ASTER Data
Next Topic: reading slices of FITS files

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ] [ PDF ]

Current Time: Wed Oct 08 17:45:08 PDT 2025

Total time taken to generate the page: 0.00489 seconds