comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

NaN Magic or Why Me?! [message #56761] Tue, 13 November 2007 09:42
Mort Canty
Hi. Here's a trick with IDL: I'm running a neural network program (IDL
6.3, Windows XP) which is printing out a cost function as it goes:

1.41914e+006 332780. 328493.
363745. 327434. 333099.
334286. 320390. 371921.
349013. 304647. 323247.
314879. 312924. 386168.
362931. 346616. 306625.
340137. 309213. 302789.
302587. 313369. 310620.
301379. 307119. 302866.
307930. 295240. 304881.
301841. 317709. 292358.
321169. 297112. 391706.
297928. 305506. 299264.
315475. 288123. 292904.
308228. 294505. 288769.
305784. 299221. 359080.
295104. 297367. 302781.
290047. 277969. 285225.
279158. 283448. 306711.
282240. 283696. 283543.
294153. -NaN 280851.
281700. 303358.

Did you see the NaN? Now watch: I restart the program under identical
conditions:

1.41914e+006 332780. 328493.
363745. 327434. 333099.
334286. 320390. 371921.
349013. 304647. 323247.
314879. 312924. 386168.
362931. 346616. 306625.
340137. 309213. 302789.
302587. 313369. 310620.
301379. 307119. 302866.
307930. 295240. 304881.
301841. 317709. 292358.
321169. 297112. 391706.
297928. 305506. NaN
315475. 288123. 292904.
308228. 294505. 288769
...

Cool huh? Sequence is identical but the NaN has moved (and changed sign
to boot). That's not all. Now I logout and logon under a different user
account. Same program, same IDL, same conditions:

1.41914e+006 332780. 328493.
363745. 327434. 333099.
334286. 320390. 371921.
349013. 304647. 323247.
314879. 312924. 386168.
362931. 346616. 306625.
340137. 309213. 302789.
302587. 313369. 310620.
301379. 307119. 302866.
307930. 295240. 304881.
301841. 317709. 292358.
321169. 297112. 391706.
297928. 305506. 299264.
315475. 288123. 292904.
308228. 294505. 288769.
305784. 299221. 359080.
295104. 297367. 302781.
290047. 277969. 285225.
279158. 283448. 306711.
282240. 283696. 283543.
294153. 446049. 280851.
281700. 303358. 355329.
288240. 297747. 315557.
287779. 314566. 272712.
290404. 278610. 281469.
297105.

Where'd it go? Now the program runs forever without any NaNs. I'd like
to post the code so you, too, can amaze your friends, but unfortunately
the trick only works on my bloody computer.

Seriously, can someone, anyone, tell me what might be going on?

Cheers

Mort

PS It's not a memory error. I replaced the DIMMs completely.
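
For a sporadic NaN like the one in the listings above, it can help to stop at the first non-finite cost value and note the iteration, rather than spotting it in the printout afterwards. A minimal Python sketch, not the IDL program from this thread: `first_nonfinite` is a hypothetical helper and the sample values are copied from the first listing.

```python
import math

def first_nonfinite(values):
    """Return (index, value) of the first NaN or infinity, or None if all are finite."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            return i, v
    return None

# A few cost values from the first run, with the NaN where it appeared.
costs = [294153.0, float("nan"), 280851.0, 281700.0]
hit = first_nonfinite(costs)
if hit is not None:
    print(f"first non-finite cost at iteration {hit[0]}: {hit[1]}")
```

If the reported iteration changes between runs that are otherwise identical, as it does here, that suggests the cause lies outside the algorithm itself.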
Re: NaN Magic or Why Me?! [message #56815 is a reply to message #56761] Wed, 14 November 2007 23:34
Mort Canty
Peter Mason wrote:
> I'll add four cents to the kitty.
>
> One of the people I work with had frequent but unpredictable crashes with
> iTools on his Dell laptop. (Utter crashes right out of IDL, as with
> invalid memory access.) We couldn't repeat these crashes on other PCs
> around here. Coyote-like bug, aside from the lack of subtlety. Turns out
> he needed a video driver update. That fixed it.
>
> I have noticed a rare floating-point bug with CISCO VPN Client version 4.x.
> With this client running and connected, I get the occasional floating-point
> corruption in my program. (A regular app, not some kind of network thing.)
> I'm pretty sure it's caused by VPN client although I don't know exactly how.
> My program is all in C - no IDL - but I would expect the same thing applies
> to any code that gets stuck into a serious amount of FP calculations. The
> problem seems fixed in CISCO VPN client version 5.
>
> Peter
>
>
Thanks very much, Peter. As it happens I do run CISCO VPN 4.8 on the
computer that's giving the problem. I'll check that and also the video
driver. I'm now sure it's not an IDL programming error. I can generate
the NaNs in any intensive FP calculation with large arrays.
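
A repeatability check along these lines can be sketched as follows: run a deterministic floating-point workload several times and compare the results bit for bit. This is a minimal Python sketch under stated assumptions, not the code used in this thread: `workload`, `bits`, and `find_mismatches` are hypothetical names, and the arithmetic inside `workload` is arbitrary.

```python
import math
import struct

def workload(n):
    """Deterministic floating-point workload: same input, same operations every time."""
    total = 0.0
    for i in range(1, n + 1):
        x = i * 0.001
        total += math.sin(x) * math.exp(-x) / math.sqrt(x)
    return total

def bits(x):
    """Raw 64-bit pattern of a double, so NaNs and sign bits compare exactly."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def find_mismatches(runs, n):
    """Repeat the workload; return (run, value) pairs that differ from the first run."""
    reference = bits(workload(n))
    bad = []
    for r in range(1, runs):
        value = workload(n)
        if bits(value) != reference:
            bad.append((r, value))
    return bad

# Run counts and sizes are arbitrary; larger values stress the machine longer.
mismatches = find_mismatches(runs=20, n=20000)
print("mismatching runs:", mismatches)
```

Comparing bit patterns rather than values matters because NaN never compares equal to itself. On a machine that is behaving, the list should stay empty; any entry would point at the environment (hardware, drivers, other software) rather than at the program.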

And thanks to everyone else who passed on their good advice (including
David's plug for the Mac - one never knows ...)

Mort Canty
Re: NaN Magic or Why Me?! [message #56819 is a reply to message #56761] Wed, 14 November 2007 16:47
Peter Mason
I'll add four cents to the kitty.

One of the people I work with had frequent but unpredictable crashes with
iTools on his Dell laptop. (Utter crashes right out of IDL, as with
invalid memory access.) We couldn't repeat these crashes on other PCs
around here. Coyote-like bug, aside from the lack of subtlety. Turns out
he needed a video driver update. That fixed it.

I have noticed a rare floating-point bug with CISCO VPN Client version 4.x.
With this client running and connected, I get the occasional floating-point
corruption in my program. (A regular app, not some kind of network thing.)
I'm pretty sure it's caused by VPN client although I don't know exactly how.
My program is all in C - no IDL - but I would expect the same thing applies
to any code that gets stuck into a serious amount of FP calculations. The
problem seems fixed in CISCO VPN client version 5.

Peter
Re: NaN Magic or Why Me?! [message #56823 is a reply to message #56761] Wed, 14 November 2007 10:06
Mort Canty
Rick Towler wrote:
>>
>> The errors occur on an ACER Veriton 7800 with intel P4 650, and
>> Lakeport-G i945G chipset running XP Professional SP2. So I guess it
>> has nothing to do with the MS update. I get no errors on my ASUS
>> core-duo laptop with the same OS (it's not here just now, so I don't
>> know the details). Ditto on an older P4 no-name with same software.
>> I'll try a few more, though, thanks.
>
> Since your code runs fine on 2 out of 3 computers the problem most likely
> is hardware based. Since you've swapped your SO-DIMM, and this is a
> laptop, you don't have many options left.
>
> RAM->BIOS->Heat->Power
>
> Even late in a processor's life cycle, there are a number of "errata"
> that exist. Generally workarounds are implemented either in the BIOS or
> in the OS. I would get the latest BIOS available for your laptop. Also,
> try disabling hyperthreading in the BIOS. You never know...
>
> If your processor has active cooling ensure that the fan and heatsink
> are clean and that the fan spins freely. If this is an actual P4 650
> "Prescott", aka Pres-Hot, heat is a real issue. Stick it in the
> refrigerator while you test it.
>
> Power issues are tricky to diagnose in laptops but if the above steps
> don't help, and you tend to get errors while the system is loaded, there
> could be components in your AC/DC adapter or the regulators on the
> mainboard that are marginal. All you really can do is look for obvious
> defects like bulging or leaking capacitors or swap the adapter if you can.


Thanks, Rick. I'll take it to heart. Actually it's the other way round.
I said that the Veriton 7800 (a desktop) is causing the problems, not
the ASUS laptop. So the refrigerator option will be a bit sticky :-)

Mort
Re: NaN Magic or Why Me?! [message #56824 is a reply to message #56761] Wed, 14 November 2007 09:56
David Fanning
Rick Towler writes:

> Since your code runs fine on 2 out of 3 computers the problem most likely
> is hardware based. Since you've swapped your SO-DIMM, and this is a
> laptop, you don't have many options left.
>
> RAM->BIOS->Heat->Power
>
> Even late in a processor's life cycle, there are a number of "errata"
> that exist. Generally workarounds are implemented either in the BIOS or
> in the OS. I would get the latest BIOS available for your laptop.
> Also, try disabling hyperthreading in the BIOS. You never know...
>
> If your processor has active cooling ensure that the fan and heatsink
> are clean and that the fan spins freely. If this is an actual P4 650
> "Prescott", aka Pres-Hot, heat is a real issue. Stick it in the
> refrigerator while you test it.
>
> Power issues are tricky to diagnose in laptops but if the above steps
> don't help, and you tend to get errors while the system is loaded, there
> could be components in your AC/DC adapter or the regulators on the
> mainboard that are marginal. All you really can do is look for obvious
> defects like bulging or leaking capacitors or swap the adapter if you can.

I wonder if this is the appropriate time to mention that those
new Macs look pretty darn good! :-)

Cheers,

David
--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Sepore ma de ni thui. ("Perhaps thou speakest truth.")