comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Autocorrelation with (LOTS) of missing data. [message #59140] Tue, 11 March 2008 10:27
jameskuyper
I've got a time series 807793 bins long, with missing data in all but
48945 of those bins. Only 7392 of those bins have a non-zero event
count. Those bins have a total count of about 1 million events, which
tells you that events are highly clustered, at least at the time scale
of the bin size (5 minutes).

I want to use autocorrelation analysis to investigate the clustering
of these events on longer time scales. The large amount of missing
data makes such analysis difficult, but the non-missing data is
clustered on time spans of 9 bins or so. Therefore, it seems to me
that with the right algorithm, it should be possible to estimate the
autocorrelation at lags of less than 9 bins. Does anyone know what
the right algorithm would be?
Re: Autocorrelation with (LOTS) of missing data. [message #59236 is a reply to message #59140] Fri, 14 March 2008 08:59
jameskuyper
Brian Larsen wrote:
> On Mar 11, 1:27 pm, jameskuy...@verizon.net wrote:
>> I've got a time series 807793 bins long, with missing data in all but
>> 48945 of those bins. Only 7392 of those bins have a non-zero event
>> count. Those bins have a total count of about 1 million events, which
>> tells you that events are highly clustered, at least at the time scale
>> of the bin size (5 minutes).
>>
>> I want to use autocorrelation analysis to investigate the clustering
>> of these events on longer time scales. The large amount of missing
>> data makes such analysis difficult, but the non-missing data is
>> clustered on time spans of 9 bins or so. Therefore, it seems to me
>> that with the right algorithm, it should be possible to estimate the
>> autocorrelation at lags of less than 9 bins. Does anyone know what
>> the right algorithm would be?
>
> Seems to me that this is an issue, I would use normal techniques on
> subsets of the data. There might be other ways but clusters of
> missing data are kinda like small data sets.

The individual clusters are too small to calculate meaningful
autocorrelation values; I would need to know an appropriate way to
combine autocorrelation functions calculated from different sets of
varying lengths.

I've found an article
<http://sankhya.isical.ac.in/search/61a2/61a27036.pdf>
which describes three estimators that can be used
for this purpose. I was hoping I could use code that had already been
written, but it should be pretty straightforward to write a program to
calculate those estimators.
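
For readers following the thread, here is a minimal sketch of one common
way to combine information across many short clusters: at each lag, use
only the pairs of bins where both values were observed, and pool those
pairs over the whole series. It is written in Python rather than IDL, and
it is not claimed to be any of the three estimators in the article above.
The function name masked_autocorr, the use of None to mark a missing bin,
and the choice to normalise by the mean and variance of all observed bins
are assumptions made for illustration.

```python
def masked_autocorr(x, max_lag):
    """Autocorrelation estimate from pairs where both bins are observed.

    x is a list of numbers, with None marking a missing bin (an assumed
    convention). Returns (r, n_pairs): r[k] is the estimate at lag k, or
    None when no usable pairs exist, and n_pairs[k] is the number of
    pairs that contributed to it.
    """
    observed = [v for v in x if v is not None]
    mean = sum(observed) / len(observed)
    var = sum((v - mean) ** 2 for v in observed) / len(observed)

    r, n_pairs = [], []
    for k in range(max_lag + 1):
        total, n = 0.0, 0
        # Pair each bin with the bin k steps later; skip incomplete pairs.
        for a, b in zip(x, x[k:]):
            if a is not None and b is not None:
                total += (a - mean) * (b - mean)
                n += 1
        n_pairs.append(n)
        r.append(total / n / var if n > 0 and var > 0 else None)
    return r, n_pairs


# Toy series: three short observed runs separated by gaps.
series = [1.0, -1.0, None, 1.0, -1.0, None, None, 1.0, -1.0]
r, n_pairs = masked_autocorr(series, 3)
```

With observed runs only about 9 bins long, the pair count drops quickly as
the lag approaches 9, so each r[k] is best read alongside n_pairs[k].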
Re: Autocorrelation with (LOTS) of missing data. [message #59266 is a reply to message #59140] Wed, 12 March 2008 12:48
Brian Larsen
On Mar 11, 1:27 pm, jameskuy...@verizon.net wrote:
> I've got a time series 807793 bins long, with missing data in all but
> 48945 of those bins. Only 7392 of those bins have a non-zero event
> count. Those bins have a total count of about 1 million events, which
> tells you that events are highly clustered, at least at the time scale
> of the bin size (5 minutes).
>
> I want to use autocorrelation analysis to investigate the clustering
> of these events on longer time scales. The large amount of missing
> data makes such analysis difficult, but the non-missing data is
> clustered on time spans of 9 bins or so. Therefore, it seems to me
> that with the right algorithm, it should be possible to estimate the
> autocorrelation at lags of less than 9 bins. Does anyone know what
> the right algorithm would be?

Seems to me that this is an issue, I would use normal techniques on
subsets of the data. There might be other ways but clusters of
missing data are kinda like small data sets.


Cheers,

Brian

--------------------------------------------------------------------------
Brian Larsen
Boston University
Center for Space Physics