comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Re: Philosophical Question about NAN [message #63672] Wed, 19 November 2008 08:09
Foldy Lajos
On Wed, 19 Nov 2008, R.G. Stockwell wrote:

> Also, it is not clear how to handle a NAN; it is ambiguous.

Even IDL is confused. For a single element input, /cumul in TOTAL should
be a no-op. Let's try it:

IDL> print, !version
{ x86_64 linux unix linux 7.0 Oct 25 2007 64 64}
IDL> a=complex(1.0, !values.f_nan)
IDL> print, total(a, /nan)
( 0.00000, 0.00000)
IDL> print, total(a, /nan, /cumul)
( 1.00000, 0.00000)


regards,
lajos
Re: Philosophical Question about NAN [message #63680 is a reply to message #63672] Wed, 19 November 2008 06:32
R.G. Stockwell
"David Fanning" <news@dfanning.com> wrote in message
news:MPG.238b3491ef337cc798a534@news.giganews.com...
> Folks,
>
> I've had a couple of run-ins lately with NANs and I wonder
> why routines like TOTAL and MEAN don't have the NAN keyword
> set to 1 by default. Why does the user have to set it?


My two cents. In spectral analysis, a 'missing point' or NAN has
a profound effect. The FFT _must_ have the behaviour it currently
has (it returns all NaNs if there are any NaNs in the input).

The position/time of each measurement is critical in an FFT, and you
can't just throw a number away and shift all the other points.

Also, it is not clear how to handle a NAN; it is ambiguous.
Does one throw the point away (as in a MEAN() calculation), or
does one interpolate the data (as in an FFT() application)?
Automated interpolation can be a very bad idea (and what happens
if the NAN results in an extrapolation, because there are no surrounding
points?). IMHO, the programmer has to make the decision about how to
handle NANs.

Also, where data is missing can be important even in functions like MEAN().
Suppose you have a year of temperature data taken every hour, but in winter
you only have daytime data (i.e. NANs at night).
If you take monthly means with the NANs skipped, the winter months are
biased toward daytime temperatures. You will get a very wrong result for the
annual variation of monthly means; it will be artificial, and you won't
know about it.

cheers,
bob
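
A minimal sketch of one way to make that kind of gap visible, assuming a plain 1-D array of samples for one month. The function name MEAN_WITH_COVERAGE, the MIN_FRACTION argument and the FRACTION keyword are hypothetical, chosen only for illustration. It reports how much of the data was usable and refuses to return a mean when coverage is too low. It does not correct a day/night sampling bias; it only stops the problem from passing silently.

```idl
; Hypothetical helper, not part of IDL: returns the mean of the finite
; samples, or NaN when fewer than MIN_FRACTION of the samples are usable.
function mean_with_coverage, x, min_fraction, fraction=fraction
  compile_opt idl2
  ; Indices of finite samples (neither NaN nor Inf) and their count
  good = where(finite(x), count)
  ; Share of usable samples, passed back through the FRACTION keyword
  fraction = float(count) / n_elements(x)
  ; Too little valid data: return NaN so the gap stays visible downstream
  if (count eq 0) || (fraction lt min_fraction) then return, !values.f_nan
  return, total(x[good]) / count
end
```

A call such as m = mean_with_coverage(january_temps, 0.9, fraction=f) would then give NaN for a month in which more than 10% of the hourly values are missing, with the actual coverage available in f (january_temps is an assumed variable name).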
Re: Philosophical Question about NAN [message #63698 is a reply to message #63680] Tue, 18 November 2008 12:55
R.Bauer
Paolo wrote:
>
> Reimar Bauer wrote:
>> Paolo wrote:
>>> On the other hand,
>>> NAN works much better than fixed values for
>>> plots! (for instance, if nan=!values.f_nan
>>> a=[1.0,2,nan,4,2]
>>> will give a much better plot than if nan=-999,
>>> even if one has a good yrange).
>>>
>>> Ciao,
>>> Paolo
>>
>> the same is true for Inf values
>
> Well, when I need to plot data with missing
> values, I put NaNs in my array. If I didn't,
> I would have to loop over the valid data chunks
> to do a nice plot... now, we don't want to do that,
> do we? So I hold on to my point...
>
> Ciao,
> Paolo
>

I do understand your point;
we have tons of FINITE-based routines in our library,

e.g.
http://www.fz-juelich.de/icg/icg-1/idl_icglib/idl_source/idl_work/fh_lib/f_eq.pro

My point is that some routines need a keyword and others do not.
However, the defaults should not be mixed up.

And plot should not behave exactly the same for Inf and NaN data. But it
does. So the result is not well defined.

Reimar


Re: Philosophical Question about NAN [message #63705 is a reply to message #63698] Tue, 18 November 2008 08:42
pgrigis
Reimar Bauer wrote:
> Paolo wrote:
>> On the other hand,
>> NAN works much better than fixed values for
>> plots! (for instance, if nan=!values.f_nan
>> a=[1.0,2,nan,4,2]
>> will give a much better plot than if nan=-999,
>> even if one has a good yrange).
>>
>> Ciao,
>> Paolo
>
>
> the same is true for Inf values

Well, when I need to plot data with missing
values, I put NaNs in my array. If I didn't,
I would have to loop over the valid data chunks
to do a nice plot... now, we don't want to do that,
do we? So I hold on to my point...

Ciao,
Paolo


Re: Philosophical Question about NAN [message #63718 is a reply to message #63705] Tue, 18 November 2008 05:05
Jeremy Bailin
On Nov 17, 9:58 am, David Fanning <n...@dfanning.com> wrote:
> Folks,
>
> I've had a couple of run-ins lately with NANs and I wonder
> why routines like TOTAL and MEAN don't have the NAN keyword
> set to 1 by default. Why does the user have to set it?
>
> I understand the argument that the NAN capability was
> added as an afterthought (or more likely when someone
> standardized the NAN bit pattern), and so the functionality
> was added as an optional addition that enhanced the function
> rather than changed it. But really...is there a reason
> why it is not the default now?
>
> One could argue, I suppose, that having a program stumble
> over a NAN alerts you to its presence in your data. That
> is useful, certainly. But, typically, once I add a NAN
> keyword to my code, I don't know (nor do I care) if the
> argument has NANs. Is this lazy programming on my part?
>
> I am just wondering whether not setting the default value
> of the NAN keyword to 1 on routines like TOTAL, MEAN,
> et al. is the functional equivalent of not setting the
> default values of the COLOR and BITS_PER_PIXEL keywords
> to the PostScript device to something useful by default.
> That is, an act of negligence on the part of the
> manufacturer.
>
> What say you?
>
> Cheers,
>
> David
> --
> David Fanning, Ph.D.
> Fanning Software Consulting, Inc.
> Coyote's Guide to IDL Programming:http://www.dfanning.com/
> Sepore ma de ni thui. ("Perhaps thou speakest truth.")

My 2 cents... is that about 75% of the time that my data ends up
having NaNs in it, it's not intentional and is a sign of something
screwy. So by not enabling /NAN by default, debugging becomes much
simpler - it's immediately obvious if the result is NaN that
something's gone wrong, while it's not obvious if it gives me some
real but incorrect number.

-Jeremy.
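
The fail-loudly approach described above can also be made explicit instead of relying on a NaN result turning up later. This is only a sketch: the procedure name CHECK_NO_NAN and its NAME argument are hypothetical, and it assumes that halting with an error is the behaviour you want while debugging.

```idl
; Hypothetical debugging aid, not part of IDL: stop with an error message
; as soon as an array that is supposed to be clean contains NaN values.
pro check_no_nan, x, name
  compile_opt idl2
  ; FINITE(x, /NAN) is 1 where an element is NaN and 0 elsewhere
  bad = where(finite(x, /nan), nbad)
  if nbad gt 0 then $
    message, string(nbad, name, format='(I0, " NaN value(s) found in ", A)')
end
```

Calling check_no_nan, data, 'data' just before total(data) gives an error at the point where the unexpected NaNs first matter (data is an assumed variable name).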
Re: Philosophical Question about NAN [message #63720 is a reply to message #63718] Tue, 18 November 2008 00:18
R.Bauer
Paolo wrote:
> On the other hand,
> NAN works much better than fixed values for
> plots! (for instance, if nan=!values.f_nan
> a=[1.0,2,nan,4,2]
> will give a much better plot than if nan=-999,
> even if one has a good yrange).
>
> Ciao,
> Paolo


the same is true for Inf values

inf = 1.0 / 0
a = [1.0, 2, inf, 4,2]
plot, a

print, finite(a)
1 1 0 1 1

Just because something is possible does not automatically make it a great
solution.


Reimar


Re: Philosophical Question about NAN [message #63728 is a reply to message #63720] Mon, 17 November 2008 17:07
pgrigis
On the other hand,
NAN works much better than fixed values for
plots! (for instance, if nan=!values.f_nan
a=[1.0,2,nan,4,2]
will give a much better plot than if nan=-999,
even if one has a good yrange).

Ciao,
Paolo
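
As a sketch of the plotting point, data that arrives with a fixed missing value can be converted to NaN before plotting. The sentinel -999 is taken from the example above; the variable names are only illustrative.

```idl
; Replace a fixed missing-value sentinel with NaN so that PLOT leaves a gap
; instead of drawing a line down to -999.
y = [1.0, 2, -999, 4, 2]
bad = where(y eq -999, nbad)            ; indices of the sentinel values
if nbad gt 0 then y[bad] = !values.f_nan
plot, y                                 ; the NaN point is treated as missing data
```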

Re: Philosophical Question about NAN [message #63730 is a reply to message #63728] Mon, 17 November 2008 16:38
R.Bauer
Sometimes I wish people would use a defined missing value instead of
NaN. NaN is only defined for float and double.
If a NaN value is in your data, everything can become difficult.

IDL> a=[!values.f_nan,0,3,5]
IDL> print,max(a)
NaN
IDL> print,min(a)
NaN
IDL> if a[0] gt 1 then print, 'yes' else print, 'no'
no
IDL> if a[0] lt 1 then print, 'yes' else print, 'no'
no
IDL> if a[0] eq 1 then print, 'yes' else print, 'no'
no

If you have read this far, you may wonder about this:
IDL> if !values.f_nan eq !values.f_nan then print,'yes' else print, 'no'
no

IDL says "no"!!

For functions we can easily set a keyword so that NaN values are handled
differently, but if the default were to search for NaN, a lot of other
places would need a lot of changes.

cheers

Reimar
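
Since every comparison against NaN is false, including NaN eq NaN, the usual way to test for it is FINITE with the NAN keyword rather than a relational operator. A short sketch using the same array as above:

```idl
a = [!values.f_nan, 0, 3, 5]
is_nan = finite(a, /nan)            ; 1 where the element is NaN, 0 elsewhere
print, is_nan                       ; only the first element is flagged
print, max(a, /nan), min(a, /nan)   ; extremes of the non-NaN elements
if finite(a[0], /nan) then print, 'a[0] is NaN'
```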


Re: Philosophical Question about NAN [message #63742 is a reply to message #63730] Mon, 17 November 2008 07:32
Kenneth P. Bowman
In article <MPG.238b3491ef337cc798a534@news.giganews.com>,
David Fanning <news@dfanning.com> wrote:

> Folks,
>
> I've had a couple of run-ins lately with NANs and I wonder
> why routines like TOTAL and MEAN don't have the NAN keyword
> set to 1 by default. Why does the user have to set it?
>
> I understand the argument that the NAN capability was
> added as an afterthought (or more likely when someone
> standardized the NAN bit pattern), and so the functionality
> was added as an optional addition that enhanced the function
> rather than changed it. But really...is there a reason
> why it is not the default now?
>
> One could argue, I suppose, that having a program stumble
> over a NAN alerts you to its presence in your data. That
> is useful, certainly. But, typically, once I add a NAN
> keyword to my code, I don't know (nor do I care) if the
> argument has NANs. Is this lazy programming on my part?
>
> I am just wondering whether not setting the default value
> of the NAN keyword to 1 on routines like TOTAL, MEAN,
> et al. is the functional equivalent of not setting the
> default values of the COLOR and BITS_PER_PIXEL keywords
> to the PostScript device to something useful by default.
> That is, an act of negligence on the part of the
> manufacturer.
>
> What say you?
>
> Cheers,
>
> David

Hi David,

I think they chose correctly and erred on the side of safety.

If I know there are NaNs in my data, I'll take care of them.

If there are NaNs in the data that I don't expect, I don't want to
have to set a keyword somewhere to find that out. That is, I don't
want IDL to automatically skip those NaNs.

OTOH, I still find this to be frustrating and dangerous:

IDL> PRINT, TOTAL(REPLICATE(!VALUES.F_NAN, 5), /NAN)
0.00000

There are no valid numbers in the input vector, but TOTAL
returns a valid FLOAT. This makes the NAN keyword useless
in many situations.

Ken
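
One possible workaround for the all-NaN case, as a sketch only: count the NaNs first and return NaN when nothing valid is left. The function name TOTAL_OR_NAN is hypothetical, not an IDL routine.

```idl
; Hypothetical wrapper, not part of IDL: like TOTAL(x, /NAN), but returns
; NaN instead of 0 when every element of x is NaN.
function total_or_nan, x
  compile_opt idl2
  ; Count the NaN elements; nan_idx itself is not needed afterwards
  nan_idx = where(finite(x, /nan), n_nan)
  if n_nan eq n_elements(x) then return, !values.f_nan
  return, total(x, /nan)
end
```

With this, total_or_nan(replicate(!values.f_nan, 5)) gives NaN rather than 0.00000, at the cost of one extra pass over the data.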
Re: Philosophical Question about NAN [message #63751 is a reply to message #63742] Mon, 17 November 2008 08:28
wlandsman
On Nov 17, 11:02 am, David Fanning <n...@dfanning.com> wrote:
> And I am not arguing for the elimination of the keyword, only
> that the default value could be changed. Thus, if I *was*
> experiencing a performance penalty, and I was certain I
> had good numbers, I could always set the NAN keyword to 0.

Yes, I agree with this. I have gotten in the habit of always
writing TOTAL(/NAN), but it would be nice if this were the
default.

A vaguely related wish of mine is that compile_opt idl2 finally be
made the default. I appreciate that ITTVIS wants to have backward
compatibility, but square brackets were introduced 11 years ago, and
do we still need default 16 bit integers? If there is anyone who
still wants their 11 year old software packages to run without
modification, ITTVIS could add a compile_opt idl1 (or compile_opt
idl_ancient) command.

--Wayne
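
For anyone unsure what compile_opt idl2 changes, here is a small sketch (the procedure name is only illustrative). It sets 32-bit integers as the default for integer constants and reserves parentheses for function calls, so arrays must be indexed with square brackets.

```idl
pro demo_compile_opt
  compile_opt idl2
  ; Integer constants default to 32-bit here; with the old 16-bit default
  ; this sum would overflow.
  n = 30000 + 30000
  help, n
  ; Arrays are indexed with square brackets; x(2) would be parsed as a
  ; call to a function named x.
  x = findgen(5)
  print, x[2]
end
```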
Re: Philosophical Question about NAN [message #63755 is a reply to message #63742] Mon, 17 November 2008 08:02
David Fanning
wlandsman writes:

> I agree with the sentiment but also note that always setting /NAN
> incurs a non-trivial performance penalty, e.g.
>
> IDL> a = randomn(seed,10000,2000)
> IDL> t = systime(1) & b = total(a) & print,systime(1)-t
> 0.25451803
> IDL> t = systime(1) & b = total(a,/nan) & print,systime(1)-t
> 0.35278893
>
> I've thought at times that arrays should carry a hidden bit saying
> whether or not they include NaN values, but this introduces other
> overhead problems.

I guess I would argue that in the overwhelming number of
cases in my experience, the performance penalty is trivial.
I'm calling these routines a couple of times at most. And
I am not arguing for the elimination of the keyword, only
that the default value could be changed. Thus, if I *was*
experiencing a performance penalty, and I was certain I
had good numbers, I could always set the NAN keyword to 0.

Cheers,

David

--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")
Re: Philosophical Question about NAN [message #63756 is a reply to message #63742] Mon, 17 November 2008 08:01
Rainer
I wondered about that myself and speculated that checking for NaNs
might decrease performance a little. Never checked this hypothesis,
though.
I often set values deliberately to NaN, so I also need the /NAN
keyword most of the time.

Cheers,
Rainer
Re: Philosophical Question about NAN [message #63759 is a reply to message #63742] Mon, 17 November 2008 07:54
wlandsman
On Nov 17, 9:58 am, David Fanning <n...@dfanning.com> wrote:
> Folks,
>
> I've had a couple of run-ins lately with NANs and I wonder
> why routines like TOTAL and MEAN don't have the NAN keyword
> set to 1 by default. Why does the user have to set it?

I agree with the sentiment but also note that always setting /NAN
incurs a non-trivial performance penalty, e.g.

IDL> a = randomn(seed,10000,2000)
IDL> t = systime(1) & b = total(a) & print,systime(1)-t
0.25451803
IDL> t = systime(1) & b = total(a,/nan) & print,systime(1)-t
0.35278893

I've thought at times that arrays should carry a hidden bit saying
whether or not they include NaN values, but this introduces other
overhead problems.

--Wayne
Re: Philosophical Question about NAN [message #63796 is a reply to message #63672] Wed, 19 November 2008 16:24
Mark[1]
I think backward compatibility should be the overriding principle
here. So the NAN keyword defaults to unset and COMPILE_OPT IDL2
remains necessary for the new syntax. There's simply too much old code
around that shouldn't be broken unless absolutely necessary.

(Mind you, I pay very little attention to that with my own code, but
it's my code so I can do what I want!)