comp.lang.idl-pvwave archive: archive » strange behaviour of bytscl by large arrays


Re: strange behaviour of bytscl by large arrays [message #80151 is a reply to message #80015]

Thu, 26 April 2012 09:29

Karl[1]

On Wednesday, April 25, 2012 1:24:15 PM UTC-6, alx wrote:
> On 25 Apr, 19:53, Karl <Karl.W.Schu...@gmail.com> wrote:
>> On Tuesday, April 24, 2012 9:40:14 AM UTC-6, Chris Torrence wrote:
>>> On Tuesday, April 24, 2012 8:50:46 AM UTC-6, alx wrote:
>>>> On 23 Apr, 22:22, Chris Torrence <gorth...@gmail.com> wrote:
>>>> > On Monday, April 23, 2012 10:14:21 AM UTC-6, fawltyl...@gmail.com wrote:
>>
>>>> > > I think IDL's FINDGEN() implementation is wrong: it uses a float counter instead of an integer one. The following test shows the difference:
>>
>>>> > > pro test
>>>> > > cpu, tpool_nthreads=1
>>>> > > n=10l^8
>>>> > > nn=n-1
>>>> > > a1=findgen(n) ; real FINDGEN()
>>>> > > a2=fltarr(n)
>>>> > > count=0.0
>>>> > > for j=0l, nn do a2[j]=count++ ; IDL's implementation
>>>> > > a3=fltarr(n)
>>>> > > count=0ll
>>>> > > for j=0l, nn do a3[j]=count++ ; better implementation
>>>> > > print, a1[nn], a2[nn], a3[nn], format='(3F15.3)'
>>>> > > end
>>
>>>> > > (Multithreading must be disabled because the starting values for the threads are calculated as an integer. So the result of FINDGEN() depends on the number of your CPU cores, too :-)
>>
>>>> > > regards,
>>>> > > Lajos
>>
>>>> > Well, wrong is perhaps too strong of a word. The real word is "fast". I just did a test where I changed the internal implementation of FINDGEN to use an integer counter. The "float" counter is 4 times faster than using an integer counter and converting it to floats.
>>
>>>> > However, perhaps we could look at the size of the input array, and switch to using the slower integer counter if it was absolutely necessary. I'll give it a thought.
>>
>>>> > Thanks for reporting this!
>>
>>>> > Cheers,
>>>> > Chris
>>>> > Exelis VIS
>>
>>>> It is risky to write a statement like "findgen(n)" while n is larger
>>>> than the inverse of the floating point precision (given in IDL by
>>>> long(1/machar().eps)). This is true in any programming language. It is
>>>> mathematically incorrect to assume that such a "findgen" will behave
>>>> as a "lindgen".
>>>> IDL is not "wrong" here, but rather clever. Isn't it?
>>>> alx.
>>
>>> Okay, alx has convinced me to not change anything. Try the following:
>>
>>> IDL> print, 16777216 + findgen(10), format='(f25.0)'
>>> 16777216.
>>> 16777216.
>>> 16777218.
>>> 16777220.
>>> 16777220.
>>> 16777220.
>>> 16777222.
>>> 16777224.
>>> 16777224.
>>> 16777224.
>>
>>> So even if you did the computation using long64's, as soon as you convert them back to floats, you are going to get "jumps" in the findgen because of the loss of precision. I suppose you could argue that this might be better than having the findgen get "stuck" on the number 16777216, but I think the speed of findgen is more important.
>>
>>> Thanks.
>>
>>> -Chris
>>> Exelis VIS
>>
>> Hi Chris,
>>
>> Interesting problem. FINDGEN is probably one of the oldest functions in IDL and it is hard to imagine that it can still need some attention.
>>
>> I'd argue that skipping is better than getting stuck. Apps that use FINDGEN up in this range are going to have to be aware of the precision issues. Those that do properly take this into account would expect the skips and shouldn't be penalized by the "stuck" behavior.
>>
>> If you stay with the "stuck" implementation, then you'd have to document that the behavior is undefined for n > 1/eps.
>>
>> Implementation-wise, couldn't you keep the performance by using the float for the first part of the fill, and then switch to an integer for the rest? This would retain the performance for the more common use cases.
>>
>> Karl
>>
>>
>
> No, the performance penalty - as far as I understand it - would not be
> due to the counting itself, in float, integer or whatever else, but to
> the integer to float conversions needed to get a floating vector. The
> solution is the responsibility of the user, who should know that
> "findgen" can only be used up to about 16 x 10^6, and that "dindgen"
> must be used beyond that. It would certainly be useful to have this
> reminder in the documentation.
> alain.

Right, if the user does not want any skips past the 1/eps mark, they should use dindgen.
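
As a quick check of why dindgen avoids the problem, here is a small C sketch (not IDL internals). It assumes IEEE 754 float and double with the default round-to-nearest mode, and the helper names are made up for illustration. A counter stalls when adding 1 no longer changes the value, which happens at 2^24 for single precision and only at 2^53 for double precision.

```c
/* Hypothetical helper: returns 1 if adding 1 to x leaves it unchanged
   in single precision (the element type FINDGEN produces). */
int float_counter_stalls(float x)
{
    float next = x + 1.0f;   /* the sum is rounded to float on assignment */
    return next == x;
}

/* Same test in double precision (the element type DINDGEN produces). */
int double_counter_stalls(double x)
{
    double next = x + 1.0;
    return next == x;
}
```

Under those assumptions, float_counter_stalls(16777216.0f) returns 1 while double_counter_stalls(16777216.0) returns 0. For integer-valued x, the double version does not return 1 until x reaches 9007199254740992.0 (2^53).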

And I agree that findgen output with the expected skips past the 1/eps mark is of dubious use. One example that comes to mind is a dense graph or chart in that range, where the skips would be hard to notice. The argument is that skips are better than values "stuck" at 1/eps, which in my example would flatten that part of the graph to a constant. I admit this is all pretty weak; I just think that getting the function's result as close to the right answer as machine precision allows is slightly better.
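
To make the "stuck" versus "skipping" distinction concrete, here is a minimal C sketch. It assumes IEEE 754 single precision with round-to-nearest-even and no excess intermediate precision (the usual case on x86-64). The function names and the start parameter are hypothetical; they only mimic the two counting strategies discussed in this thread and are not taken from IDL's source.

```c
#include <stddef.h>

/* "Stuck" variant: a float counter incremented by 1.0f.
   Once the counter reaches 2^24 = 16777216, counter + 1 rounds back
   to the same value, so every later element repeats it. */
void fill_float_counter(float *out, size_t n, float start)
{
    float counter = start;
    for (size_t i = 0; i < n; i++) {
        out[i] = counter;
        counter += 1.0f;
    }
}

/* "Skipping" variant: an integer counter converted element by element.
   Each value is rounded to the nearest float on its own, so the
   sequence keeps advancing instead of stalling. */
void fill_int_counter(float *out, size_t n, long long start)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (float)(start + (long long)i);
    }
}
```

Starting both at 16777210 and filling 10 elements, the float counter ends with 16777216 repeated four times, while the integer version ends with 16777216, 16777216, 16777218, 16777220, the same pattern as in Chris's output quoted above.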

Yes, the performance problem is in the conversion. I was suggesting:

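/* n is the element count; pFltVector points at the output buffer */
long nFast = (n < 16777216) ? n : 16777216;
float f = 0.0f;
for (long i = 0; i < nFast; i++) {
    *pFltVector++ = f;          /* float counter: exact below 2^24 */
    f += 1.0f;
}

/* past 2^24 the float counter would stall, so convert an integer instead */
for (long i = 16777216; i < n; i++) {
    *pFltVector++ = (float)i;
}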

which preserves the performance in the most common cases.

Karl


