How to ignore NaNs in the data array with function cgPercentile.pro or Percentile.pro? [message #90946] |
Mon, 18 May 2015 06:18  |
atmospheric physics
Hello,
I have two data arrays, one with some missing data and the other without, as below:
data = [Randomu(3L, 100) * 100, !Values.F_NAN, !Values.F_NaN]
data1 = [Randomu(3L, 100) * 100]
When I use cgPercentiles, I don't want the missing data to be included.
Print, cgPercentiles(data, Percentiles=[0.25, 0.5, 0.75])
21.8058 52.4532 77.3341
Print, cgPercentiles(data1, Percentiles=[0.25, 0.5, 0.75])
21.8058 51.3569 76.6930
Does it make sense to ignore missing data in the data array when computing the percentiles?
Thanks in advance,
Regards,
Madhavan
Re: How to ignore NaNs in the data array with function cgPercentile.pro or Percentile.pro? [message #90947 is a reply to message #90946] |
Mon, 18 May 2015 06:28   |
David Fanning
Madhavan Bomidi writes:
> I have two data arrays one with some missing data and the other without missing data as below:
>
> data = [Randomu(3L, 100) * 100, !Values.F_NAN, !Values.F_NaN]
>
> data1 = [Randomu(3L, 100) * 100]
>
> When I use the cgPercentiles, I don't want missing data to be included.
>
> Print, cgPercentiles(data, Percentiles=[0.25, 0.5, 0.75])
> 21.8058 52.4532 77.3341
>
> Print, cgPercentiles(data1, Percentiles=[0.25, 0.5, 0.75])
> 21.8058 51.3569 76.6930
>
> Does this make sense to ignore missing data from the data array while obtaining the percentiles?
I don't see that you have a choice. You can't do math with cow pies (or
anything else that's not a number).
Cheers,
David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.idlcoyote.com/
Sepore ma de ni thue. ("Perhaps thou speakest truth.")
Re: How to ignore NaNs in the data array with function cgPercentile.pro or Percentile.pro? [message #90948 is a reply to message #90946] |
Mon, 18 May 2015 06:36   |
Helder Marchetto
On Monday, May 18, 2015 at 3:18:53 PM UTC+2, Madhavan Bomidi wrote:
> Hello,
>
> I have two data arrays one with some missing data and the other without missing data as below:
>
> data = [Randomu(3L, 100) * 100, !Values.F_NAN, !Values.F_NaN]
>
> data1 = [Randomu(3L, 100) * 100]
>
> When I use the cgPercentiles, I don't want missing data to be included.
To avoid including NaNs, you could use the finite() function:
data = data[where(finite(data))]
and then calculate the percentiles.
Here is an example:
data = Randomu(3L, 100) * 100
data1 = [data, !Values.F_NAN, !Values.F_NaN]
data2 = data1[where(finite(data1))]
Print, cgPercentiles(data, Percentiles=[0.25, 0.5, 0.75])
Print, cgPercentiles(data1, Percentiles=[0.25, 0.5, 0.75])
Print, cgPercentiles(data2, Percentiles=[0.25, 0.5, 0.75])
In my case, I got:
27.4920 45.3172 69.3138
27.4920 45.4608 69.4824
27.4920 45.3172 69.3138
The first and last lines are the same, and that's what you want.
As David said, if you don't have a strategy to substitute NaNs with numbers, you can't deal with them.
Cheers,
Helder
PS: the above code will not work (or will give you trouble) if your data consists of *only* NaNs. When using the where() function, you should check how many finite numbers were found.
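A minimal sketch of that check, assuming the Coyote Library's cgPercentiles is on your path. The function name PercentilesIgnoringNaN is hypothetical (it is not part of any library), and returning NaNs for an all-NaN input is just one possible convention. Where() returns -1 when nothing matches, so the count argument is tested before the index vector is used:

```idl
; Hypothetical helper, not part of the Coyote Library.
; Computes percentiles over the finite values of DATA only.
FUNCTION PercentilesIgnoringNaN, data, PERCENTILES=percentiles

   ; Default to the quartiles used in this thread. A local copy is used
   ; so the caller's keyword variable is never modified.
   IF N_Elements(percentiles) EQ 0 THEN p = [0.25, 0.5, 0.75] $
      ELSE p = percentiles

   ; COUNT receives the number of finite elements found by Where().
   good = Where(Finite(data), count)

   ; Assumed convention: if there are no finite values, return one NaN
   ; per requested percentile instead of indexing with -1.
   IF count EQ 0 THEN RETURN, Replicate(!Values.F_NaN, N_Elements(p))

   RETURN, cgPercentiles(data[good], Percentiles=p)
END
```

Saved as percentilesignoringnan.pro, it could be called like this:

Print, PercentilesIgnoringNaN(data1)

With the data1 array from the example above, this should print the same values as the cgPercentiles call on data2.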