Using the Profiler in IDL

QUESTION: My IDL program seems to run slowly, but I have no idea why it runs slowly. Is there any way I can determine which is the slowest part, so I can work on speeding that up?

ANSWER: There are all kinds of reasons why an IDL program might run slowly. But if you are going to fix the problem, it will help to know exactly what part of the program needs fixing. JD Smith, in an IDL newsgroup article, outlines one approach to discovering what is really slow in your program by introducing the Profiler command in IDL.

When it comes to understanding why a given bit of IDL code is so slow, there really is no need to speculate, given how easy using the IDL profiling tool is. Just make sure your routines of interest are compiled (typically just by running your program once), and then type commands like this.

   IDL> Profiler, /SYSTEM & Profiler
   IDL> run_your_code_here
   IDL> Profiler, /REPORT

You will get instant access to a nice table summarizing the number of times each routine (compiled user routine or system routine) is called, how long each routine took per call and in total, etc. Then work a bit on the slowest function or functions, re-compile, and reset the profiler for the next run by typing this.

   IDL> Profiler, /RESET

Run the slow operation again, get another report, and check to see whether your changes improved things.

Here is an example in which I am interested in learning how the TVImage command works.

   IDL> .compile tvimage
   Compiled module: CMCONGRID.
   Compiled module: TVIMAGE_ERROR.
   Compiled module: TVIMAGE.

   IDL> Profiler, /SYSTEM & Profiler
   IDL> TVImage, Loaddata(7)
   Compiled module: LOADDATA.
   Compiled module: FILEPATH.
   Compiled module: PATH_SEP.

   IDL> Profiler, /REPORT

   Module          Type  Count     Only(s)   Avg.(s)     Time(s)   Avg.(s)
   BYTARR          (S)       1    0.000147  0.000147    0.000147  0.000147
   CATCH           (S)       1    0.000002  0.000002    0.000002  0.000002
   CEIL            (S)       2    0.000011  0.000005    0.000011  0.000005
   CMCONGRID       (U)       1    0.000032  0.000032    0.031311  0.031311
   DEVICE          (S)       4    0.000043  0.000011    0.000043  0.000011
   FINDGEN         (S)       2    0.000015  0.000007    0.000015  0.000007
   FLOAT           (S)       5    0.000020  0.000004    0.000020  0.000004
   FREE_LUN        (S)       1    0.000115  0.000115    0.000115  0.000115
   INTERPOLATE     (S)       1    0.031256  0.031256    0.031256  0.031256
   KEYWORD_SET     (S)      22    0.000019  0.000001    0.000019  0.000001
   N_ELEMENTS      (S)      11    0.000012  0.000001    0.000012  0.000001
   N_PARAMS        (S)       1    0.000002  0.000002    0.000002  0.000002
   ON_ERROR        (S)       3    0.000014  0.000005    0.000014  0.000005
   OPENR           (S)       1    0.041714  0.041714    0.041714  0.041714
   PROFILER        (S)       1    0.000007  0.000007    0.000007  0.000007
   READU           (S)       1    0.016212  0.016212    0.016212  0.016212
   ROUND           (S)       2    0.000008  0.000004    0.000008  0.000004
   SIZE            (S)       4    0.000026  0.000006    0.000026  0.000006
   STRLEN          (S)       1    0.000006  0.000006    0.000006  0.000006
   STRMID          (S)       1    0.000015  0.000015    0.000015  0.000015
   STRUPCASE       (S)       2    0.000009  0.000005    0.000009  0.000005
   TOTAL           (S)       1    0.000019  0.000019    0.000019  0.000019
   TV              (S)       1    0.024151  0.024151    0.024151  0.024151
   TVIMAGE         (U)       1    0.000097  0.000097    0.055688  0.055688
   WHERE           (S)       1    0.000011  0.000011    0.000011  0.000011

Note that only the TVImage and CMCongrid user routines are profiled (type U). (All other routines profiled were system routines, type S.) The TVImage_Error routine was compiled, but wasn't called, and the LoadData, Filepath, and Path_Sep routines were not compiled at the time when Profiler is called, so they are not profiled. If you wanted, say LoadData to be profiled, too, you would have to CLEAR and RESET the Profiler, as described below. Otherwise, it would not be included in subsequent profiles, even though it is now compiled.

Obviously, operations like a=[a,findgen(1000)] don't get a separate line item in the Profiler report, but it's usually easy to figure out their impact by examining the “Only” column of the report, which tallies the time spent only in the body of a routine, not in other routines that routine calls. Consider, for example, a routine that looks something like this.

   PRO My_Routine
     a = Do_Something(FOO=foo)
     b = Readu(a,foo)
     ...
     FOR i=0,n_elements(foo)-1 DO b=[b, Randomu(sd,1000)]
   END

It is fairly easy to figure out what that repetitive array concatenation is doing to you by looking at the “Only” column for MY_ROUTINE. If there is too much “body” in a routine to easily tell where the problem spots are using this method, consider breaking the body up into some temporary routines which Profiler can track.

I've just begun using the Profiler regularly, and really appreciate how useful it is. For instance, if you have a complex GUI operation which is slow, running in a widget in the background, just invoke profiler, run the slow operation, generate the report, find and fix the slow parts, recompile, and repeat -- all without ever quitting the running widget program! Try that in other programming languages.

Note that unfortunately the documentation for Profiler is a bit difficult to parse. It took me forever to understand that:

   IDL> Profiler

adds all currently compiled user routines to the list of things to profile, while:

   IDL> Profiler,/SYSTEM

adds only system routines, and nothing else. So, if you want both system routines and the currently compiled user routines to be profiled, you need two calls to Profiler. This is also true of clearing the profiling. For example, to totally shut down all profiling, you need:

   IDL> Profiler, /CLEAR, /SYSTEM 
   IDL> Profiler, /CLEAR, /RESET

to do the job properly. That is, “Profiler,/SYSTEM” and “Profiler” function like two completely separate tools, except for /RESET, which resets everything. Not exactly intuitive, but I can forgive them that for providing us such a quasi-miraculous profiling tool.