Slow Coyote Graphics

QUESTION: In general, I like Coyote Graphics routines and I find them to be much, much faster than the new function graphics routines in IDL 8.x. But, occasionally, I am doing something and I find them to be extremely slow. I notice it most when I am trying to draw lots of symbols. I used to use the IDL PlotS command, and it was wicked fast. But, the same code with the cgPlotS command is wicked slow. What's going on?

ANSWER: Yes, the bad news is there are occasions when the Coyote Library routines can be much, much slower than the traditional IDL routines they replace. The good news is this kind of speed problem can almost always be overcome with some small changes to the code and a better understanding of the underlying principles of the Coyote Graphics system.

NOTE: Please note that a 14 December 2012 update to the Coyote Library has been able to dramatically speed up the code that originally prompted this article. If you haven't updated your Coyote Library in some time, now would be a great time to do so!

As you know, the Coyote Graphics System is built on top of the traditional IDL graphics system, which is fast because it is built very close to the machine, if you will. Coyote Library routines necessarily take you a step farther away from the machine, and they require some overhead to give the graphics routines modern properties. In essence, a Coyote Graphics routine like cgPlotS is nothing more than a sophisticated wrapper around the IDL PlotS command, but some of that sophistication slows it down.

Let me give you an example. Consider the PointSourceOverlay program, which overlays point source data on a map. The essential command in this code is the one that plots nearly 60 thousand symbols on the map. The variable indices in this command is an array of 59,878 elements. The colors passed to the Color keyword are byte values ranging from 1 to 8. (The program colors were initially loaded into color indices 1 through 8.) This, obviously, is a traditional IDL command, while almost all the other graphics commands in the program are Coyote Graphics routines. Why is this command different?

   PlotS, lon[indices], lat[indices], PSym=symbol, $
      Color=soilc_colors[indices], SymSize=0.5

If we run the program and time it, we see the program takes a little over a second to run. If we simply change "PlotS" to "cgPlotS" and run it again, we see the program takes over a minute to run. In other words, it runs about 100 times slower. What accounts for this slowdown in the Coyote Graphics routine?
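
A timing like this can be made with the system clock. This is only a minimal sketch, assuming the variables from the command above are already defined and a map is already set up in the current graphics window.

   ; Time a single drawing command, in seconds.
   t0 = SysTime(1)
   cgPlotS, lon[indices], lat[indices], PSym=symbol, $
      Color=soilc_colors[indices], SymSize=0.5
   Print, 'Elapsed time (s): ', SysTime(1) - t0

Swapping cgPlotS for PlotS in the sketch gives the comparison timing.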

In a word, cgColor, the heart and soul of the Coyote Graphics library. If we run the slower program with the Profiler turned on, we see that almost all of the speed difference is due to time spent in the cgColor command.
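
For readers who have not used the Profiler, a session might look something like the following. This is a sketch, not the exact session used for this article. It assumes the program can be started by typing its name with no arguments, and that it has been run once beforehand, since the Profiler only tracks user routines that are already compiled when it is turned on.

   Profiler, /Reset       ; discard any earlier profiling results
   Profiler, /System      ; profile IDL system routines, too
   Profiler               ; profile all currently compiled user routines
   PointSourceOverlay     ; assumed call; run the program under test
   Profiler, /Report      ; print the number of calls and time per routine
   Profiler, /Clear       ; turn profiling of user routines off again

In the report, look at the number of calls and the time columns for cgColor.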

This is because Coyote Graphics routines draw everything using color decomposition mode, if at all possible. This prevents the color table from becoming contaminated by drawing colors, since in color decomposition mode, colors do not need to be loaded into the color table at all. But, here is the problem.

The writer of this program wants to perform the drawing in indexed color mode. That's why the colors were loaded into the color table at indices 1 through 8 at the start of the program, and why the user is supplying byte values as the colors. The cgPlotS program, on the other hand, doesn't care what the user wants to do; it wants to draw in decomposed color mode. To manage that, it needs to turn those byte values into strings. In other words, it turns the value 7B into the value "7". (Actually cgPlotS isn't doing this, cgDefaultColor is doing this.) Then, cgColor turns each string color "7" into the 24-bit color that is loaded at color index 7 in the current color table. And, of course, it has to do this 59,878 times! Doing this one color at a time is a lot of overhead, and it results in a slow-running program.
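
To make the idea of a 24-bit color concrete, here is roughly what the conversion of one color index amounts to. This is an illustration of the arithmetic only, not the actual cgColor source code. The variable names are made up for the example.

   ; Read the current color table vectors.
   TVLCT, r, g, b, /Get

   ; Pack the red, green, and blue bytes at index 7 into one long integer.
   ; In decomposed color, red is the low byte and blue is the high byte.
   index = 7
   color24 = Long(r[index]) + 256L * g[index] + 65536L * b[index]

A long integer like color24 can be passed directly to the Color keyword of a graphics command when the device is in decomposed color mode.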

The program could be improved by moving the color conversion out of cgPlotS. For example, we could try something like this, in which we call cgColor ourselves, once, on the whole array of byte values, and pass the result to cgPlotS.

   colors = cgColor(soilc_colors[indices])
   cgPlotS, lon[indices], lat[indices], PSym=symbol, $
      Color=colors, SymSize=0.5

But, this version of the program still takes about 30 seconds to render. We have improved the speed by a factor of two, but it is still too slow.

What we really would like to do is pass long integers to the cgPlotS command so we can avoid the conversion of strings to long integers. This requires two things: that we set the program to draw in decomposed color, and that we have some way of doing the conversion to long integers.

Recall that soilc_colors[indices] are byte values between 1 and 8. Originally in this program, the colors were defined like this:

   soil_colors = ['purple', 'dodger blue', 'dark green', 'lime green', $
                  'green yellow', 'yellow', 'hot pink', 'crimson']

If we converted just these eight strings to longs, we might have a faster way of specifying the 59,878 long integers we need. We proceed like this.

   SetDecomposedState, 1, Current=currentState
   longColors = cgColor(soil_colors)
   colors = longColors[soilc_colors[indices]-1]
   cgPlotS, lon[indices], lat[indices], PSym=symbol, $
      Color=colors, SymSize=0.5
   SetDecomposedState, currentState

This version of the program runs in about 2.5 seconds. This is about twice as slow as the traditional IDL command itself, but a speed-up of a factor of 50 over the original Coyote Graphics implementation and certainly much faster than the same program written in function graphics. I think it is an acceptable trade-off for the improved functionality and device independence of Coyote Graphics commands.

But, the bottom line is this. Sometimes the Coyote Graphics commands can be slower than you want them to be. When that is the case, you can always revert to traditional IDL commands in your programs. The Coyote Library has many routines to help you do just that.
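
As a sketch of what reverting might look like for the point source example, the traditional PlotS command can be combined with Coyote Library helper routines, so the colors are still specified by name. This assumes the same variables used earlier in this article and has not been timed.

   ; Draw in decomposed color, remembering the state we started in.
   SetDecomposedState, 1, CurrentState=currentState

   ; Convert the eight color names to 24-bit values once, then index.
   longColors = cgColor(soil_colors)
   PlotS, lon[indices], lat[indices], PSym=symbol, $
      Color=longColors[soilc_colors[indices]-1], SymSize=0.5

   ; Restore the original color state.
   SetDecomposedState, currentState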

Example with the cgColorFill Command

Here is another example of a slow program, this time one that uses the cgColorFill command. The program draws small polygons on a map. There are 360*120 polygons drawn. The color values (in the color_values variable) are byte values in the range 0 to 255. This is also a 360 by 120 array, and the call to cgColorFill is in a loop, like this:

 FOR i=0,nx-1 DO BEGIN
    FOR j=0,ny-1 DO BEGIN
       cgColorFill, pol_lon, pol_lat, COLOR=color_values[i,j], /DATA
    ENDFOR
 ENDFOR

This program took about 12 seconds to run, primarily because cgColorFill internally has to call cgColor 360*120 times to turn a byte value into a 24-bit color value. Again, this is a lot of overhead!

But, there are at most only 256 possible byte values. What if we did the conversion up front on just those 256 values? We modify the color_values array like this, and use a color value look-up table. This is fast, because we use IDL's fast array handling capability.

   cgcolors = cgColor(Bindgen(256))
   color_values = cgcolors[color_values]

Now, the color_values array contains 24-bit color values, not byte values. These can be passed directly to the Polyfill command, bypassing cgColorFill (and, by extension, cgColor) completely. The loop now looks like this.

 FOR i=0,nx-1 DO BEGIN
    FOR j=0,ny-1 DO BEGIN
       PolyFill, pol_lon, pol_lat, COLOR=color_values[i,j], /DATA
    ENDFOR
 ENDFOR

By simply making this change, the program now runs in about 0.725 seconds, which is over 16 times faster than before.
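
One caution: 24-bit color values are only interpreted as intended when the device is in decomposed color mode, and cgColor only returns 24-bit values when that mode is on. If the program is not already drawing in decomposed color, the pieces above could be wrapped as in the following sketch. The names lookup and fillColors are made up for the example, and pol_lon and pol_lat are assumed to be set for each cell inside the loop, as in the original program.

   ; Switch to decomposed color, remembering the starting state.
   SetDecomposedState, 1, CurrentState=currentState

   ; Build the 256-element look-up table once, then index it in one step.
   lookup = cgColor(Bindgen(256))
   fillColors = lookup[color_values]

   FOR i=0,nx-1 DO BEGIN
      FOR j=0,ny-1 DO BEGIN
         ; pol_lon and pol_lat describe the polygon for cell [i,j].
         PolyFill, pol_lon, pol_lat, COLOR=fillColors[i,j], /DATA
      ENDFOR
   ENDFOR

   ; Restore the original color state.
   SetDecomposedState, currentState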

Version of IDL used to prepare this article: IDL 7.1.2.

Written: 21 July 2012