comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Home » Public Forums » archive » Re: cool way to determine durations in time series
Show: Today's Messages :: Show Polls :: Message Navigator
E-mail to friend 
Switch to threaded view of this topic Create a new topic Submit Reply
Re: cool way to determine durations in time series [message #47058] Mon, 23 January 2006 02:22
Thomas Pfaff is currently offline  Thomas Pfaff
Messages: 15
Registered: April 2005
Junior Member
Hi all,

Thanks for this most instructive help.
Good to have you people around.


Thomas

Dick Jackson schrieb:
> Hi all,
>
> "Ben Tupper" <btupper@bigelow.org> wrote in message
> news:43d48fF1n6d7qU1@individual.net...
>
>> Thomas Pfaff wrote:
>
>
>>> [...] how can I transform an array
>>> like this
>>>
>>> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
>>>
>>> into something like this
>>>
>>> durations = [2,1,3,2]
>>>
>>> which is I want to count all contiguous fields of '1's in an array.
>
>
>> I think I would use a combination of LABEL_REGION and HISTOGRAM.
>>
>>
>> ****START
>> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
>>
>> nSeries= n_elements(Series)
>>
>> buffered = [0,series,0]
>> dummy = FIX(LABEL_REGION(buffered))
>> label = dummy[1:nSeries]
>>
>> H = HISTOGRAM(label, MIN = 1S)
>>
>> print, series
>> print, label
>> print, H
>> *****END
>>
>>
>> Note the you must pad series with "background" values at the endpoints.
>
>
> I have to say, that looks pretty cool, as requested! I've approached this
> another way, playing from Ben's 'buffered' array, finding where we have
> transitions from 0->1 or 1->0:
>
> ; Find where all transitions occur
> whereChange = Where(buffered[1:*] NE buffered, nChange)
>
> ; Change array to [2, m]
> whereChange = Reform(whereChange, 2, nChange/2, /Overwrite)
>
> ; Measure distance from odd transitions to even transitions
> durations = Reform(whereChange[1, *] - whereChange[0, *])
>
> In case anyone would want to know the time or memory efficiency of these (I
> was curious), I tried to optimize them as much as possible and put them
> through their paces with long series:
>
> =====
>
> PRO TimeSeriesDurations, n
>
> ;series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
>
> IF N_Elements(n) EQ 0 THEN n = 1E6
> series = RandomU(seed, n) GT 0.5
>
> nSeries= n_elements(Series)
>
> buffered = [0,series,0]
>
> ; DJ method:
>
> m0 = Memory(/Current)
> t0 = SysTime(/Seconds)
>
> whereChange = Where(buffered[1:*] NE buffered, nChange)
> IF nChange EQ 0 THEN Return ; No 1's in series
> whereChange = Reform(whereChange, 2, nChange/2, /Overwrite)
>
> durations = Reform(whereChange[1, *] - whereChange[0, *])
>
> Print, 'DJ time: ', SysTime(/Seconds)-t0
> Print, 'DJ memory: ', Memory(/Current)-m0
>
> ; BT method:
>
> m0 = Memory(/Current)
> t0 = SysTime(/Seconds)
>
> ; Need to use ULong for longer series:
> ;dummy = LABEL_REGION(buffered, /ULong)
> ;label = dummy[1:nSeries]
> ; Compressed to this for efficiency:
> label = (LABEL_REGION(buffered, /ULong))[1:nSeries]
>
> H = HISTOGRAM(label, MIN = 1S)
>
> Print, 'BT time: ', SysTime(/Seconds)-t0
> Print, 'BT memory: ', Memory(/Current)-m0
>
> Print, 'Differing results: ', Total(durations NE H)
>
> END
>
> =====
>
> Running this gives:
>
> IDL> timeseriesdurations,1E6
> DJ time: 0.030999899
> DJ memory: 3003060
> BT time: 0.062000036
> BT memory: 5001132
> Differing results: 0.000000
>
> IDL> TimeSeriesDurations,1E7
> DJ time: 0.39000010
> DJ memory: 30011760
> BT time: 0.51600003
> BT memory: 50004032
> Differing results: 0.000000
>
> Any other methods out there? Hope this helps!
>
Re: cool way to determine durations in time series [message #47075 is a reply to message #47058] Fri, 20 January 2006 21:11 Go to previous message
Dick Jackson is currently offline  Dick Jackson
Messages: 347
Registered: August 1998
Senior Member
Hi all,

"Ben Tupper" <btupper@bigelow.org> wrote in message
news:43d48fF1n6d7qU1@individual.net...
> Thomas Pfaff wrote:

>> [...] how can I transform an array
>> like this
>>
>> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
>>
>> into something like this
>>
>> durations = [2,1,3,2]
>>
>> which is I want to count all contiguous fields of '1's in an array.

>
> I think I would use a combination of LABEL_REGION and HISTOGRAM.
>
>
> ****START
> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
>
> nSeries= n_elements(Series)
>
> buffered = [0,series,0]
> dummy = FIX(LABEL_REGION(buffered))
> label = dummy[1:nSeries]
>
> H = HISTOGRAM(label, MIN = 1S)
>
> print, series
> print, label
> print, H
> *****END
>
>
> Note the you must pad series with "background" values at the endpoints.

I have to say, that looks pretty cool, as requested! I've approached this
another way, playing from Ben's 'buffered' array, finding where we have
transitions from 0->1 or 1->0:

; Find where all transitions occur
whereChange = Where(buffered[1:*] NE buffered, nChange)

; Change array to [2, m]
whereChange = Reform(whereChange, 2, nChange/2, /Overwrite)

; Measure distance from odd transitions to even transitions
durations = Reform(whereChange[1, *] - whereChange[0, *])

In case anyone would want to know the time or memory efficiency of these (I
was curious), I tried to optimize them as much as possible and put them
through their paces with long series:

=====

PRO TimeSeriesDurations, n

;series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]

IF N_Elements(n) EQ 0 THEN n = 1E6
series = RandomU(seed, n) GT 0.5

nSeries= n_elements(Series)

buffered = [0,series,0]

; DJ method:

m0 = Memory(/Current)
t0 = SysTime(/Seconds)

whereChange = Where(buffered[1:*] NE buffered, nChange)
IF nChange EQ 0 THEN Return ; No 1's in series
whereChange = Reform(whereChange, 2, nChange/2, /Overwrite)

durations = Reform(whereChange[1, *] - whereChange[0, *])

Print, 'DJ time: ', SysTime(/Seconds)-t0
Print, 'DJ memory: ', Memory(/Current)-m0

; BT method:

m0 = Memory(/Current)
t0 = SysTime(/Seconds)

; Need to use ULong for longer series:
;dummy = LABEL_REGION(buffered, /ULong)
;label = dummy[1:nSeries]
; Compressed to this for efficiency:
label = (LABEL_REGION(buffered, /ULong))[1:nSeries]

H = HISTOGRAM(label, MIN = 1S)

Print, 'BT time: ', SysTime(/Seconds)-t0
Print, 'BT memory: ', Memory(/Current)-m0

Print, 'Differing results: ', Total(durations NE H)

END

=====

Running this gives:

IDL> timeseriesdurations,1E6
DJ time: 0.030999899
DJ memory: 3003060
BT time: 0.062000036
BT memory: 5001132
Differing results: 0.000000

IDL> TimeSeriesDurations,1E7
DJ time: 0.39000010
DJ memory: 30011760
BT time: 0.51600003
BT memory: 50004032
Differing results: 0.000000

Any other methods out there? Hope this helps!

--
Cheers,
--
-Dick

Dick Jackson / dick@d-jackson.com
D-Jackson Software Consulting / http://www.d-jackson.com
Calgary, Alberta, Canada / +1-403-242-7398 / Fax: 241-7392
Re: cool way to determine durations in time series [message #47079 is a reply to message #47075] Fri, 20 January 2006 13:50 Go to previous message
btt is currently offline  btt
Messages: 345
Registered: December 2000
Senior Member
Thomas Pfaff wrote:
> Hello everyone,
>
> I'm doing some IDL-abuse in hydrology, so my question might seem a bit
> odd, maybe.
>
> A task that is occurring quite regularly is to determine the duration of
> certain events. For example "What is the longest contiguous duration of
> stream flow below/above a certain discharge"
>
> Assuming I have an equidistant time series (e.g. one value each day)
> this basically reduces to the question of how can I transform an array
> like this
>
> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
>
> into something like this
>
> durations = [2,1,3,2]
>
> which is I want to count all contiguous fields of '1's in an array.
>
> Somehow my brain wants to use HISTOGRAM for this, but I just can't see
> how to do it.
> At the moment I'm helping myself by using CONVOL(to highlight the edges)
> and WHERE(to get the differences between two adjacent edge indices) but
> as the data gets more, this becomes extremely tedious as well as memory
> consuming (see the example below). Besides, CONVOL wouldn't work if a
> series started or ended with '1's as it can't correctly apply the kernel
> to those elements.
>

I think I would use a combination of LABEL_REGION and HISTOGRAM.


****START
series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]

nSeries= n_elements(Series)

buffered = [0,series,0]
dummy = FIX(LABEL_REGION(buffered))
label = dummy[1:nSeries]

H = HISTOGRAM(label, MIN = 1S)

print, series
print, label
print, H
*****END


Note the you must pad series with "background" values at the endpoints.

Cheers,
ben
Re: cool way to determine durations in time series [message #47083 is a reply to message #47079] Fri, 20 January 2006 11:01 Go to previous message
news.qwest.net is currently offline  news.qwest.net
Messages: 137
Registered: September 2005
Senior Member
"Thomas Pfaff" <yael@gmx.net> wrote in message
news:43com4F1mh6c1U1@individual.net...
> Hello everyone,
>
> I'm doing some IDL-abuse in hydrology, so my question might seem a bit
> odd, maybe.
>
> A task that is occurring quite regularly is to determine the duration of
> certain events. For example "What is the longest contiguous duration of
> stream flow below/above a certain discharge"
>
> Assuming I have an equidistant time series (e.g. one value each day)
> this basically reduces to the question of how can I transform an array
> like this
>
> series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
>
> into something like this
>
> durations = [2,1,3,2]
>
> which is I want to count all contiguous fields of '1's in an array.
>
> Somehow my brain wants to use HISTOGRAM for this, but I just can't see
> how to do it.
> At the moment I'm helping myself by using CONVOL(to highlight the edges)
> and WHERE(to get the differences between two adjacent edge indices) but
> as the data gets more, this becomes extremely tedious as well as memory
> consuming (see the example below). Besides, CONVOL wouldn't work if a
> series started or ended with '1's as it can't correctly apply the kernel
> to those elements.
>
> Any ideas? Anyone who has seen this problem in one of his/her maths
> textbooks? Hints to literature are also highly appreciated.
>
> Thanks in advance,
>
>
> Thomas


Hi Thomas,
here is a starting point for you.
Assuming series contains integers, this piece of code gives you the
durations of
each repeated value EXCEPT for the very end of the series. Should be easy
enough to
patcht that up. Also, should be easy enough to remove the series of zeros,
and explicitly
make it do a specific number (like 1 in your example).
Also, one can apply this to your "above/below" by first applying "series =
data > threshold"
which is what I am assuming you are doing above to get your "series".

Anyways, here it is (and it is just a starting point suggestion, you will
have to
modify it a bit)

series = [1,1,0,0,0,0,1,0,1,1,1,0,0,1,1]
d = series - shift(series,1)
w =where(d ne 0)
newdur = w - shift(w,1)
newdur[0] = w[0]
print,newdur

> newdur = 2 4 1 1 3 2

Note, it is missing that last "2" points, so you have to patch that.

Cheers,
bob
  Switch to threaded view of this topic Create a new topic Submit Reply
Previous Topic: Re: Map_Image()
Next Topic: iTools Data Manager

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ] [ PDF ]

Current Time: Wed Oct 08 15:36:33 PDT 2025

Total time taken to generate the page: 0.00607 seconds