comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Home » Public Forums » archive » Compiling file with many functions: huge performance difference between IDL and IDLDE
Show: Today's Messages :: Show Polls :: Message Navigator
E-mail to friend 
Switch to threaded view of this topic Create a new topic Submit Reply
Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38531] Wed, 17 March 2004 01:47 Go to next message
Sidney Cadot is currently offline  Sidney Cadot
Messages: 7
Registered: October 2002
Junior Member
Hi all,

For a system we're making, a rather big IDL file is generated containing
well over 12,000 function definitions, accompanied by a selector
function (see below for a rationale).

What we're seeing is that in command-line IDL, this works like a charm:
compilation of the file takes about 4--5 seconds on a reasonably fast
machine, which is acceptable.

However, when this file is compiled from within IDLDE, this takes well
over three minutes-- rougly a factor 60 increase(!)

Does anybody know what causes this, and perhaps a solution?

We tried pre-compiling the functions using a SAV file; this yields a
significant increase both in IDL (cmd line version): 3 sec, and IDLDE
(used time down to 87 seconds), but the relative difference is still
quite puzzling.

Best regards,

Sidney Cadot
Science and Technology Corp., The Netherlands




P.S. the reason we're doing this is that we need to implement a
string-based map with optiomal performance, like this:

FUNCTION f_tom
RETURN, 123
END

FUNCTION f_dick
RETURN, 456
END

FUNCTION f_harry
RETURN, 789
END

FUNCTION f, name
CATCH, error_status
IF error_status EQ 0 THEN RETURN, -1
RETURN, call_function("f_" + name)
END
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38621 is a reply to message #38531] Fri, 19 March 2004 09:26 Go to previous messageGo to next message
Craig Markwardt is currently offline  Craig Markwardt
Messages: 1869
Registered: November 1996
Senior Member
justspam03@yahoo.de (Oliver Thilmann) writes:
...
> How would you calculate a hash value from a string? In C I would
> base it on the ASCII value of the chars, but in IDL? Above mentioned
> implementation converts the string via byte() and then loops over
> the resulting array. Is there a faster way (loops always take so long)?

As JD says, I don't have my own hashtable class on my web page, but I
can send it upon request for now. As Vogel suggest, I avoid looping
by using TOTAL and a precalculated array of hash values. I modeled
some of the interface after the container class.

Craig

--
------------------------------------------------------------ --------------
Craig B. Markwardt, Ph.D. EMAIL: craigmnet@REMOVEcow.physics.wisc.edu
Astrophysics, IDL, Finance, Derivatives | Remove "net" for better response
------------------------------------------------------------ --------------
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38632 is a reply to message #38531] Fri, 19 March 2004 03:30 Go to previous messageGo to next message
mwvogel is currently offline  mwvogel
Messages: 9
Registered: February 2003
Junior Member
> You can find a hash_table implementation on the RSI user contribution
> site. Quick performance test for 12000 sets/gets:
> #hashes set/get per entry (ms)
> 13 6
> 101 0.8
> 1001 0.15
> 12000 0.08/0.05
>
> Talking about it:
> How would you calculate a hash value from a string? In C I would
> base it on the ASCII value of the chars, but in IDL? Above mentioned
> implementation converts the string via byte() and then loops over
> the resulting array. Is there a faster way (loops always take so long)?

For real short strings (up to approx 11 chars) one could replace

ascii = ulong(byte(key))
total = 0UL
for i = 0, n_elements(ascii) - 1 do begin
total = total * 37UL + ascii[i]
; 37UL is a magic number suggested by the literature
endfor
return, total
with
ascii = ULONG(BYTE(key))
total = ULONG(TOTAL(ascii *
ULONG(37D^(N_ELEMENTS(ascii)-FINDGEN(N_ELEMENTS(ascii)) -1)), /DOUBLE))
return, total

However, TOTAL() produces a double, and is therefore prone to roundoff
errors, possibly reducing the hashing efficiency. Also, I am not so sure
that the for loop is that much slower for small arrays. At least the
original code is easier to read :-)
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38634 is a reply to message #38531] Fri, 19 March 2004 02:06 Go to previous messageGo to next message
justspam03 is currently offline  justspam03
Messages: 36
Registered: October 2003
Member
JD Smith <jdsmith@as.arizona.edu> wrote in message news:<pan.2004.03.19.01.11.11.181250@as.arizona.edu>...

>>>
>>> Setting 12000 random values (key: string, value: integer):
>>> mean 0.15 ms per entry (total ~2 seconds)
>>> Random access of 12000 values from this set:
>>> mean 0.3 ms per access, (total ~3.5 s)
>>>
>
> I think he means just using linear search, ala WHERE. This
> technically is a form of hashing: it just happens to utilize just one
> hash bucket (alright, a useless form ;).
>
> Anyway, that's some clever use of function name searching for free
> hashing. IDL does not expose any internal hashing functionality, but
> Craig wrote a hash object which works reasonably well. I don't find
> it on his site, but perhaps he'd be willing to share.
>

JD is right of course in that this not a hash, but an associative
array and the linear execution time ('set' has to check for existing
keys) is due to the use of 'where'.

Performance was good enough for me, though (with only a handful of
keys per array), so I didn't care any further about using hashes.

You can find a hash_table implementation on the RSI user contribution
site. Quick performance test for 12000 sets/gets:
#hashes set/get per entry (ms)
13 6
101 0.8
1001 0.15
12000 0.08/0.05

Talking about it:
How would you calculate a hash value from a string? In C I would
base it on the ASCII value of the chars, but in IDL? Above mentioned
implementation converts the string via byte() and then loops over
the resulting array. Is there a faster way (loops always take so long)?
Cheers
Oliver
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38638 is a reply to message #38531] Thu, 18 March 2004 17:11 Go to previous messageGo to next message
JD Smith is currently offline  JD Smith
Messages: 850
Registered: December 1999
Senior Member
On Thu, 18 Mar 2004 21:40:38 +0100, Sidney Cadot wrote:

> Oliver Thilmann wrote:
>
>> Hi,
>>
>> your example is just generic for the kind of problem you want
>> to solve, I assume. Otherwise why not use a hash? A very
>> simple implementation (unsorted arrays) on a Pentium IV,
>> 2.6 GHz, IDL 6.0 yields
>>
>> Setting 12000 random values (key: string, value: integer):
>> mean 0.15 ms per entry (total ~2 seconds)
>> Random access of 12000 values from this set:
>> mean 0.3 ms per access, (total ~3.5 s)
>>
>> Is the access via call_function much faster?
>
> I'm afraid to sound terribly stupid here, but is there support for
> hashing in IDL? I haven't been able to find it.
>
> My laborious trick is nothing more than circumventing the lack of
> hashing as a standard feature in IDL (by piggybacking on the internal
> hashing IDL uses for function names). If you know a better way, I would
> be very much interested!

I think he means just using linear search, ala WHERE. This
technically is a form of hashing: it just happens to utilize just one
hash bucket (alright, a useless form ;).

Anyway, that's some clever use of function name searching for free
hashing. IDL does not expose any internal hashing functionality, but
Craig wrote a hash object which works reasonably well. I don't find
it on his site, but perhaps he'd be willing to share.

What your method fails to offer that a real hash would is the ability
to create new hash entries at run-time: all of your hash strings are
fixed at runtime, which means you could transform them beforehand to
integers (e.g. just use the sort index), and index a large static array
instead, which would be orders of magnitude faster. Strings of course
may be more convenient, but in the fixed string-space case they aren't
technically necessary. Another variant on this would be akin to a C
macro: just create a batch file for input using named variables like:

;; map_include.pro
dick=0
frank=1
harry=2
tom=3
...
zappa=11999

map=[456,222,789,123,...,777]


;; my map-using routine
pro mymaproutine
@map_include
print,map[zappa], ' is not as good as ',map[frank]
end

That will take a bit to compile, but once it compiles it should fly,
and of course allows arrays, etc. If you insist on having a string
mapper function, you can cheat:

function f, name
@map_include
ind=routine_names(name,FETCH=0)
return,map[ind]
end

This doesn't really solve the real problem, which would include the
ability to add more string keys at runtime, but it may be as much as
you need. In the meantime, write your friendly IDL support techs and
request a decent internal hash type, or at least a front end to a hash
interface!

JD
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38641 is a reply to message #38531] Thu, 18 March 2004 14:48 Go to previous messageGo to next message
Sidney Cadot is currently offline  Sidney Cadot
Messages: 7
Registered: October 2002
Junior Member
Ben Tupper wrote:

> Sidney Cadot wrote:
>
>> Mirko Vukovic wrote:
>>
>>>> P.S. the reason we're doing this is that we need to implement a
>>>> string-based map with optiomal performance, like this:
>>>>
>>>> FUNCTION f_tom
>>>> RETURN, 123
>>>> END
>>>>
>>>> FUNCTION f_dick
>>>> RETURN, 456
>>>> END
>>>>
>>>> FUNCTION f_harry
>>>> RETURN, 789
>>>> END
>>>>
>>>> FUNCTION f, name
>>>> CATCH, error_status
>>>> IF error_status EQ 0 THEN RETURN, -1
>>>> RETURN, call_function("f_" + name)
>>>> END
>>>
>>>
>>>
>>>
>>> Out of curiosity, would a structure work here:
>>> a={f_tom:123,f_dick:456,f_harry:789...} ?
>>>
>>> It could be created using create_struct.
>>>
>>> Retrieve info using
>>> a=str.f_dick
>>>
>>> Curious minds want to know :-)
>>
>>
>>
>> Your idea is sound, but I am not aware of a way to retrieve the index
>> of a tag-name based on its name.
>>
>> You assume that "f_dick" is available at compile time, whereas I need
>> to resolve the string at runtime. Something like this would work:
>>
>> i = TAG_INDEX("f_dick", str)
>> value = str.(i)
>>
>> ... But only if functionality to get a tag index can be retrieved from
>> a struct (anyone knows how to do this?) and if its fast, i.e. if IDL
>> implements it via a hash table or similar.
>>
>
> How about this ? [[code snipped]]

Thanks for the effort, but this sort of defeats the purpose of the whole
exercise, which is to have a fast mapping function. Your solution is
linear search (the TAG_NAMES and WHERE functions), which is too slow for
out application.

Best regards,

Sidney
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38645 is a reply to message #38531] Thu, 18 March 2004 12:50 Go to previous messageGo to next message
btt is currently offline  btt
Messages: 345
Registered: December 2000
Senior Member
Sidney Cadot wrote:
> Mirko Vukovic wrote:
>
>>> P.S. the reason we're doing this is that we need to implement a
>>> string-based map with optiomal performance, like this:
>>>
>>> FUNCTION f_tom
>>> RETURN, 123
>>> END
>>>
>>> FUNCTION f_dick
>>> RETURN, 456
>>> END
>>>
>>> FUNCTION f_harry
>>> RETURN, 789
>>> END
>>>
>>> FUNCTION f, name
>>> CATCH, error_status
>>> IF error_status EQ 0 THEN RETURN, -1
>>> RETURN, call_function("f_" + name)
>>> END
>>
>>
>>
>> Out of curiosity, would a structure work here:
>> a={f_tom:123,f_dick:456,f_harry:789...} ?
>>
>> It could be created using create_struct.
>>
>> Retrieve info using
>> a=str.f_dick
>>
>> Curious minds want to know :-)
>
>
> Your idea is sound, but I am not aware of a way to retrieve the index of
> a tag-name based on its name.
>
> You assume that "f_dick" is available at compile time, whereas I need to
> resolve the string at runtime. Something like this would work:
>
> i = TAG_INDEX("f_dick", str)
> value = str.(i)
>
> ... But only if functionality to get a tag index can be retrieved from a
> struct (anyone knows how to do this?) and if its fast, i.e. if IDL
> implements it via a hash table or similar.
>

How about this ?

IDL> print, tag_index(['color', 'clip', 'banana'], !P, count = count) &
print, count
4 3 -1
2




;-----start
FUNCTION TAG_INDEX, tagName, str, count = count

tags = TAG_NAMES(str)

n = n_elements(tagName)

tag_index = LonArr(n, /noZero)

Count = 0L
For i = 0L, n-1 Do begin
tag_index[i] = (WHERE(tags EQ StrUpCase(tagName[i]), cnt))[0]
Count = Count + (cnt GT 0)
EndFor

Return, tag_index
END
;-----finish

Ben
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38646 is a reply to message #38531] Thu, 18 March 2004 12:40 Go to previous messageGo to next message
Sidney Cadot is currently offline  Sidney Cadot
Messages: 7
Registered: October 2002
Junior Member
Oliver Thilmann wrote:

> Hi,
>
> your example is just generic for the kind of problem you want
> to solve, I assume. Otherwise why not use a hash? A very
> simple implementation (unsorted arrays) on a Pentium IV,
> 2.6 GHz, IDL 6.0 yields
>
> Setting 12000 random values (key: string, value: integer):
> mean 0.15 ms per entry (total ~2 seconds)
> Random access of 12000 values from this set:
> mean 0.3 ms per access, (total ~3.5 s)
>
> Is the access via call_function much faster?

I'm afraid to sound terribly stupid here, but is there support for
hashing in IDL? I haven't been able to find it.

My laborious trick is nothing more than circumventing the lack of
hashing as a standard feature in IDL (by piggybacking on the internal
hashing IDL uses for function names). If you know a better way, I would
be very much interested!

Best regards, Sidney
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38647 is a reply to message #38531] Thu, 18 March 2004 12:36 Go to previous messageGo to next message
Sidney Cadot is currently offline  Sidney Cadot
Messages: 7
Registered: October 2002
Junior Member
Andrew Cool wrote:

> OK, me dumb bunny - me no know what a string based map is.
> But based on your example above, how about this?
>
> map_array = Strarr(12000,2)
>
> map_array(0,1) = string(indgen(12000),form='(i5.5)')
> map_array(5000,0) ='dick'
>
> t = Systime(1)
> found_index = Where(map_array(*,0) EQ 'dick')
> print,'Time taken = ',Systime(1) - t,' seconds'
> print,'Found Index = ',found_index
> ret_value = map_array(found_index,1)
> print,'Returned Value = ',ret_value

This uses a linear search. This is a lot slower than IDL's internal
hashing, which is basically O(1).

> Now on my PC, Time taken = 0.00000000 seconds,
> which I'd call pretty close to "optiomal".

Try calling it a million times and compare it to my solution; you will
see a significant difference.

> Do you really need 12000 function definitions?

I have yet to see a better solution.

Best regards,

Sidney
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38648 is a reply to message #38531] Thu, 18 March 2004 12:32 Go to previous messageGo to next message
Sidney Cadot is currently offline  Sidney Cadot
Messages: 7
Registered: October 2002
Junior Member
Mirko Vukovic wrote:
>> P.S. the reason we're doing this is that we need to implement a
>> string-based map with optiomal performance, like this:
>>
>> FUNCTION f_tom
>> RETURN, 123
>> END
>>
>> FUNCTION f_dick
>> RETURN, 456
>> END
>>
>> FUNCTION f_harry
>> RETURN, 789
>> END
>>
>> FUNCTION f, name
>> CATCH, error_status
>> IF error_status EQ 0 THEN RETURN, -1
>> RETURN, call_function("f_" + name)
>> END
>
>
> Out of curiosity, would a structure work here:
> a={f_tom:123,f_dick:456,f_harry:789...} ?
>
> It could be created using create_struct.
>
> Retrieve info using
>
> a=str.f_dick
>
> Curious minds want to know :-)

Your idea is sound, but I am not aware of a way to retrieve the index of
a tag-name based on its name.

You assume that "f_dick" is available at compile time, whereas I need to
resolve the string at runtime. Something like this would work:

i = TAG_INDEX("f_dick", str)
value = str.(i)

... But only if functionality to get a tag index can be retrieved from a
struct (anyone knows how to do this?) and if its fast, i.e. if IDL
implements it via a hash table or similar.

Best regards,

Sidney
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38656 is a reply to message #38531] Thu, 18 March 2004 01:13 Go to previous messageGo to next message
justspam03 is currently offline  justspam03
Messages: 36
Registered: October 2003
Member
Hi,

your example is just generic for the kind of problem you want
to solve, I assume. Otherwise why not use a hash? A very
simple implementation (unsorted arrays) on a Pentium IV,
2.6 GHz, IDL 6.0 yields

Setting 12000 random values (key: string, value: integer):
mean 0.15 ms per entry (total ~2 seconds)
Random access of 12000 values from this set:
mean 0.3 ms per access, (total ~3.5 s)

Is the access via call_function much faster?
Cheers
Oliver
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38665 is a reply to message #38531] Wed, 17 March 2004 14:02 Go to previous messageGo to next message
andrew.cool is currently offline  andrew.cool
Messages: 47
Registered: July 2003
Member
Sidney Cadot <sidney@jigsaw.nl> wrote in message news:<1079516867.600179@euler.servers.luna.net>...
> Hi all,
>
> For a system we're making, a rather big IDL file is generated containing
> well over 12,000 function definitions, accompanied by a selector
> function (see below for a rationale).
>
> What we're seeing is that in command-line IDL, this works like a charm:
> compilation of the file takes about 4--5 seconds on a reasonably fast
> machine, which is acceptable.
>
> However, when this file is compiled from within IDLDE, this takes well
> over three minutes-- rougly a factor 60 increase(!)
>
> Does anybody know what causes this, and perhaps a solution?
>
> We tried pre-compiling the functions using a SAV file; this yields a
> significant increase both in IDL (cmd line version): 3 sec, and IDLDE
> (used time down to 87 seconds), but the relative difference is still
> quite puzzling.
>
> Best regards,
>
> Sidney Cadot
> Science and Technology Corp., The Netherlands
>
>
>
>
> P.S. the reason we're doing this is that we need to implement a
> string-based map with optiomal performance, like this:
>
> FUNCTION f_tom
> RETURN, 123
> END
>
> FUNCTION f_dick
> RETURN, 456
> END
>
> FUNCTION f_harry
> RETURN, 789
> END
>
> FUNCTION f, name
> CATCH, error_status
> IF error_status EQ 0 THEN RETURN, -1
> RETURN, call_function("f_" + name)
> END

OK, me dumb bunny - me no know what a string based map is.
But based on your example above, how about this?


map_array = Strarr(12000,2)

map_array(0,1) = string(indgen(12000),form='(i5.5)')
map_array(5000,0) ='dick'

t = Systime(1)
found_index = Where(map_array(*,0) EQ 'dick')
print,'Time taken = ',Systime(1) - t,' seconds'
print,'Found Index = ',found_index
ret_value = map_array(found_index,1)
print,'Returned Value = ',ret_value


Now on my PC, Time taken = 0.00000000 seconds,
which I'd call pretty close to "optiomal".
Do you really need 12000 function definitions?

Andrew Cool
DSTO, Adelaide, South Australia
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38668 is a reply to message #38531] Wed, 17 March 2004 12:16 Go to previous messageGo to next message
mvukovic is currently offline  mvukovic
Messages: 63
Registered: July 1998
Member
Sidney Cadot <sidney@jigsaw.nl> wrote in message news:<1079516867.600179@euler.servers.luna.net>...
> Hi all,
>
> For a system we're making, a rather big IDL file is generated containing
> well over 12,000 function definitions, accompanied by a selector
> function (see below for a rationale).
>
> What we're seeing is that in command-line IDL, this works like a charm:
> compilation of the file takes about 4--5 seconds on a reasonably fast
> machine, which is acceptable.
>
> However, when this file is compiled from within IDLDE, this takes well
> over three minutes-- rougly a factor 60 increase(!)
>
> Does anybody know what causes this, and perhaps a solution?
>
> We tried pre-compiling the functions using a SAV file; this yields a
> significant increase both in IDL (cmd line version): 3 sec, and IDLDE
> (used time down to 87 seconds), but the relative difference is still
> quite puzzling.
>
> Best regards,
>
> Sidney Cadot
> Science and Technology Corp., The Netherlands
>
>
>
>
> P.S. the reason we're doing this is that we need to implement a
> string-based map with optiomal performance, like this:
>
> FUNCTION f_tom
> RETURN, 123
> END
>
> FUNCTION f_dick
> RETURN, 456
> END
>
> FUNCTION f_harry
> RETURN, 789
> END
>
> FUNCTION f, name
> CATCH, error_status
> IF error_status EQ 0 THEN RETURN, -1
> RETURN, call_function("f_" + name)
> END

Out of curiosity, would a structure work here:
a={f_tom:123,f_dick:456,f_harry:789...} ?

It could be created using create_struct.

Retrieve info using

a=str.f_dick

Curious minds want to know :-)

(And never mind about ``curiosity kills the cat'')

Mirko
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38698 is a reply to message #38531] Mon, 22 March 2004 15:45 Go to previous message
Mark Hadfield is currently offline  Mark Hadfield
Messages: 783
Registered: May 1995
Senior Member
RM wrote:
> So why is IDLDE so
> rubbish? Don't know!

Because it won't let you compile 3000 functions per second?

I think the OP's idea is very clever, but it's asking the system to do
something that was never anticipated by the designers. (Except, perhaps,
in a nightmare.)

I must admit I'm curious as to why IDLDE addds this overhead, and
whether it does it on all platforms. I'm sorry but I don't consider
"it's so rubbish" to be an explanation.

--
Mark Hadfield "Ka puwaha te tai nei, Hoea tatou"
m.hadfield@niwa.co.nz
National Institute for Water and Atmospheric Research (NIWA)
Re: Compiling file with many functions: huge performance difference between IDL and IDLDE [message #38702 is a reply to message #38531] Mon, 22 March 2004 05:49 Go to previous message
richard.martin is currently offline  richard.martin
Messages: 3
Registered: February 2004
Junior Member
The threads seem to have got off topic again. So why is IDLDE so
rubbish? Don't know! But then I guess like all the sain people round I
use emacs-IDLWAVE precisely because IDLDE is so bad.

As for your programing task, "Searching in string spaces", it looks
very similar to the WHERE_AINB problem, which we all know should be
done by the
king of IDL routines VALUE_LOCATE (Controversial! Heresy even!) as in

ord=VALUE_LOCATE(nameindex, name)
result=(nameindex(ord) ne name) ? -1 : namevalue(ord)

The {nameindex} must be sorted.

Your "Multi-Function" method is still quicker than this (but only by a
factor of two), but the above can be vectorised very easily.

And finaly, for all of you out there that leave the complete history
of the thread in your replies, the <DELETE>-key can normaly be found
just above the
<CR>-key, top right-hand corner of the QWERTY bit, end of the row with
all the numbers on it ;-).

Rich.
  Switch to threaded view of this topic Create a new topic Submit Reply
Previous Topic: image registration, robust
Next Topic: compiling itools and utilities

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ] [ PDF ]

Current Time: Wed Oct 08 15:10:54 PDT 2025

Total time taken to generate the page: 0.00753 seconds