Using the Profiler in IDL
QUESTION: My IDL program seems to run slowly, but I have no idea why it runs slowly. Is there any way I can determine which is the slowest part, so I can work on speeding that up?
ANSWER: There are all kinds of reasons why an IDL program might run slowly. But if you are going to fix the problem, it will help to know exactly what part of the program needs fixing. JD Smith, in an IDL newsgroup article, outlines one approach to discovering what is really slow in your program by introducing the Profiler command in IDL.
When it comes to understanding why a given bit of IDL code is so slow, there really is no need to speculate, given how easy using the IDL profiling tool is. Just make sure your routines of interest are compiled (typically just by running your program once), and then type commands like this.
IDL> Profiler, /SYSTEM & Profiler IDL> run_your_code_here IDL> Profiler, /REPORT
You will get instant access to a nice table summarizing the number of times each routine (compiled user routine or system routine) is called, how long each routine took per call and in total, etc. Then work a bit on the slowest function or functions, re-compile, and reset the profiler for the next run by typing this.
IDL> Profiler, /RESET
Run the slow operation again, get another report, and check to see whether your changes improved things.
Here is an example in which I am interested in learning how the TVImage command works.
IDL> .compile tvimage Compiled module: CMCONGRID. Compiled module: TVIMAGE_ERROR. Compiled module: TVIMAGE. IDL> Profiler, /SYSTEM & Profiler IDL> TVImage, Loaddata(7) Compiled module: LOADDATA. Compiled module: FILEPATH. Compiled module: PATH_SEP. IDL> Profiler, /REPORT Module Type Count Only(s) Avg.(s) Time(s) Avg.(s) BYTARR (S) 1 0.000147 0.000147 0.000147 0.000147 CATCH (S) 1 0.000002 0.000002 0.000002 0.000002 CEIL (S) 2 0.000011 0.000005 0.000011 0.000005 CMCONGRID (U) 1 0.000032 0.000032 0.031311 0.031311 DEVICE (S) 4 0.000043 0.000011 0.000043 0.000011 FINDGEN (S) 2 0.000015 0.000007 0.000015 0.000007 FLOAT (S) 5 0.000020 0.000004 0.000020 0.000004 FREE_LUN (S) 1 0.000115 0.000115 0.000115 0.000115 INTERPOLATE (S) 1 0.031256 0.031256 0.031256 0.031256 KEYWORD_SET (S) 22 0.000019 0.000001 0.000019 0.000001 N_ELEMENTS (S) 11 0.000012 0.000001 0.000012 0.000001 N_PARAMS (S) 1 0.000002 0.000002 0.000002 0.000002 ON_ERROR (S) 3 0.000014 0.000005 0.000014 0.000005 OPENR (S) 1 0.041714 0.041714 0.041714 0.041714 PROFILER (S) 1 0.000007 0.000007 0.000007 0.000007 READU (S) 1 0.016212 0.016212 0.016212 0.016212 ROUND (S) 2 0.000008 0.000004 0.000008 0.000004 SIZE (S) 4 0.000026 0.000006 0.000026 0.000006 STRLEN (S) 1 0.000006 0.000006 0.000006 0.000006 STRMID (S) 1 0.000015 0.000015 0.000015 0.000015 STRUPCASE (S) 2 0.000009 0.000005 0.000009 0.000005 TOTAL (S) 1 0.000019 0.000019 0.000019 0.000019 TV (S) 1 0.024151 0.024151 0.024151 0.024151 TVIMAGE (U) 1 0.000097 0.000097 0.055688 0.055688 WHERE (S) 1 0.000011 0.000011 0.000011 0.000011
Note that only the TVImage and CMCongrid user routines are profiled (type U). (All other routines profiled were system routines, type S.) The TVImage_Error routine was compiled, but wasn't called, and the LoadData, Filepath, and Path_Sep routines were not compiled at the time when Profiler is called, so they are not profiled. If you wanted, say LoadData to be profiled, too, you would have to CLEAR and RESET the Profiler, as described below. Otherwise, it would not be included in subsequent profiles, even though it is now compiled.
Obviously, operations like a=[a,findgen(1000)] don't get a separate line item in the Profiler report, but it's usually easy to figure out their impact by examining the “Only” column of the report, which tallies the time spent only in the body of a routine, not in other routines that routine calls. Consider, for example, a routine that looks something like this.
PRO My_Routine a = Do_Something(FOO=foo) b = Readu(a,foo) ... FOR i=0,n_elements(foo)-1 DO b=[b, Randomu(sd,1000)] END
It is fairly easy to figure out what that repetitive array concatenation is doing to you by looking at the “Only” column for MY_ROUTINE. If there is too much “body” in a routine to easily tell where the problem spots are using this method, consider breaking the body up into some temporary routines which Profiler can track.
I've just begun using the Profiler regularly, and really appreciate how useful it is. For instance, if you have a complex GUI operation which is slow, running in a widget in the background, just invoke profiler, run the slow operation, generate the report, find and fix the slow parts, recompile, and repeat -- all without ever quitting the running widget program! Try that in other programming languages.
Note that unfortunately the documentation for Profiler is a bit difficult to parse. It took me forever to understand that:
adds all currently compiled user routines to the list of things to profile, while:
adds only system routines, and nothing else. So, if you want both system routines and the currently compiled user routines to be profiled, you need two calls to Profiler. This is also true of clearing the profiling. For example, to totally shut down all profiling, you need:
IDL> Profiler, /CLEAR, /SYSTEM IDL> Profiler, /CLEAR, /RESET
to do the job properly. That is, “Profiler,/SYSTEM” and “Profiler” function like two completely separate tools, except for /RESET, which resets everything. Not exactly intuitive, but I can forgive them that for providing us such a quasi-miraculous profiling tool.
Copyright © 2006 David W. Fanning
Last Updated 25 January 2006