comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

How to ignore NaNs in the data array with function cgPercentile.pro or Percentile.pro? [message #90946] Mon, 18 May 2015 06:18
atmospheric physics
Hello,

I have two data arrays one with some missing data and the other without missing data as below:

data = [Randomu(3L, 100) * 100, !Values.F_NAN, !Values.F_NaN]

data1 = [Randomu(3L, 100) * 100]

When I use the cgPercentiles, I don't want missing data to be included.

Print, cgPercentiles(data, Percentiles=[0.25, 0.5, 0.75])
21.8058 52.4532 77.3341

Print, cgPercentiles(data1, Percentiles=[0.25, 0.5, 0.75])
21.8058 51.3569 76.6930

Does this make sense to ignore missing data from the data array while obtaining the percentiles?

Thanks in advance,
Regards,
Madhavan
Re: How to ignore NaNs in the data array with function cgPercentile.pro or Percentile.pro? [message #90947 is a reply to message #90946] Mon, 18 May 2015 06:28
David Fanning
Madhavan Bomidi writes:

> I have two data arrays one with some missing data and the other
> without missing data as below:
>
> data = [Randomu(3L, 100) * 100, !Values.F_NAN, !Values.F_NaN]
>
> data1 = [Randomu(3L, 100) * 100]
>
> When I use the cgPercentiles, I don't want missing data to be included.
>
> Print, cgPercentiles(data, Percentiles=[0.25, 0.5, 0.75])
> 21.8058 52.4532 77.3341
>
> Print, cgPercentiles(data1, Percentiles=[0.25, 0.5, 0.75])
> 21.8058 51.3569 76.6930
>
> Does this make sense to ignore missing data from the data array while obtaining the percentiles?

I don't see that you have a choice. You can't do math with cow pies (or
anything else that's not a number).

Cheers,

David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.idlcoyote.com/
Sepore ma de ni thue. ("Perhaps thou speakest truth.")
Re: How to ignore NaNs in the data array with function cgPercentile.pro or Percentile.pro? [message #90948 is a reply to message #90946] Mon, 18 May 2015 06:36
Helder Marchetto
On Monday, May 18, 2015 at 3:18:53 PM UTC+2, Madhavan Bomidi wrote:
> Hello,
>
> I have two data arrays one with some missing data and the other without missing data as below:
>
> data = [Randomu(3L, 100) * 100, !Values.F_NAN, !Values.F_NaN]
>
> data1 = [Randomu(3L, 100) * 100]
>
> When I use the cgPercentiles, I don't want missing data to be included.

To avoid including NaNs, you could use the finite() function:
data = data[where(finite(data))]
and then calculate the percentiles.
Here is an example:
data = Randomu(3L, 100) * 100
data1 = [data, !Values.F_NAN, !Values.F_NaN]
data2 = data1[where(finite(data1))]
Print, cgPercentiles(data, Percentiles=[0.25, 0.5, 0.75])
Print, cgPercentiles(data1, Percentiles=[0.25, 0.5, 0.75])
Print, cgPercentiles(data2, Percentiles=[0.25, 0.5, 0.75])
In my case, I got:
27.4920 45.3172 69.3138
27.4920 45.4608 69.4824
27.4920 45.3172 69.3138
The first and last lines are the same, and that's what you want.
As David said, if you don't have a strategy to substitute NaNs with numbers, you can't deal with them.

Cheers,
Helder

ps: the above code will not work (or will give you trouble) if your data is made up of *only* NaNs. When using the where() function you should check how many finite numbers were found...
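
The check mentioned in the ps could be sketched as follows. This is only an illustration, assuming the same test array as above and that cgPercentiles is on your IDL path; the variable names good and count are made up for the example. The IF block needs to live in a procedure or main-level program file, not be typed line by line at the prompt.

```idl
; Sketch: drop NaNs before computing percentiles, guarding the all-NaN case.
data1 = [Randomu(3L, 100) * 100, !Values.F_NaN, !Values.F_NaN]

; The second argument of Where() returns the number of matching elements.
good = Where(Finite(data1), count)

IF count GT 0 THEN BEGIN
   ; At least one finite value: safe to index with the Where() result.
   Print, cgPercentiles(data1[good], Percentiles=[0.25, 0.5, 0.75])
ENDIF ELSE BEGIN
   ; Where() returned -1 here, so data1[good] would not be a valid subset.
   Print, 'No finite values found; percentiles are undefined.'
ENDELSE
```

With an all-NaN array the ELSE branch runs and the array is never indexed with -1.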
Re: How to ignore NaNs in the data array with function cgPercentile.pro or Percentile.pro? [message #90949 is a reply to message #90946] Mon, 18 May 2015 06:44
Matthew Argall
> Does this make sense to ignore missing data from the data array while obtaining the percentiles?

Perhaps the question is: "Will excluding missing data skew the percentiles, since the total number of points will be different?" The answer is yes, but whether or not that matters depends on how you interpret the data.
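
One way to judge whether the change in sample size matters is to report how many points were dropped alongside the percentiles. A minimal sketch, assuming the test array from the original post (102 elements, two of them NaN by construction); nTotal and nGood are names invented for the example:

```idl
; Sketch: count how many points are excluded when NaNs are ignored.
data = [Randomu(3L, 100) * 100, !Values.F_NaN, !Values.F_NaN]

nTotal = N_Elements(data)            ; all points, including NaNs
good = Where(Finite(data), nGood)    ; nGood = number of finite points

Print, 'Finite points: ', nGood, ' of ', nTotal
Print, 'Fraction missing: ', Float(nTotal - nGood) / nTotal
```

A small missing fraction, as here, will usually shift the percentiles only slightly; whether that is acceptable still depends on how you interpret the data.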