comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

IDL Bridge Failing When >52 Bridges are Built (IDL6.4) [message #93855] Mon, 07 November 2016 15:41
vanenges
Messages: 2
Registered: November 2016
Junior Member
Hello:

I have image analysis software written in IDL, compiled in IDL6.4, and running under a runtime license. The software runs on an Ubuntu 16.04 machine. I just upgraded my workstation to two 20-core Xeon processors (80 potential threads in total). Previously, I had two 12-core Xeon processors (48 potential threads).
I use IDL_IDLBridge objects to send a fraction of an image stack to each CPU thread. Aside from the hassle of getting IDL and the license manager working in Ubuntu, the software is working nicely (as a side note, we do have problems asynchronously terminating the bridge timers, for which we were unable to find an elegant solution, as mentioned before on this forum).
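
For readers unfamiliar with this pattern, here is a minimal sketch of how a stack might be divided among bridges. It is not the code from our software: `chunk_ranges` and `dispatch_sketch` are hypothetical names, the stack is assumed to be a 3-D array indexed [x, y, frame], and `total(chunk)` is only a placeholder for the real per-chunk processing.

```idl
; Hypothetical helper (not from the original software): split nFrames frames
; into nBridges contiguous, nearly equal ranges.
; Returns a [2, nBridges] LONG array of inclusive [first, last] indices.
; If nBridges exceeds nFrames, the trailing ranges are empty (last < first).
function chunk_ranges, nFrames, nBridges
  compile_opt idl2
  nF = long(nFrames)
  nB = long(nBridges)
  base  = nF / nB            ; integer division: minimum frames per bridge
  extra = nF mod nB          ; the first 'extra' bridges take one more frame
  ranges = lonarr(2, nB)
  first = 0L
  for i = 0L, nB - 1 do begin
    count = base + (i lt extra)
    ranges[0, i] = first
    ranges[1, i] = first + count - 1
    first = first + count
  endfor
  return, ranges
end

; Sketch of the dispatch loop, assuming a 3-D array stack[x, y, frame]
; and at least as many frames as bridges.
pro dispatch_sketch, stack, nBridges
  compile_opt idl2
  dims = size(stack, /dimensions)
  ranges = chunk_ranges(dims[2], nBridges)
  bridges = objarr(nBridges)
  for i = 0L, nBridges - 1 do begin
    bridges[i] = obj_new('IDL_IDLBridge')
    bridges[i]->SetVar, 'chunk', stack[*, *, ranges[0, i]:ranges[1, i]]
    ; Placeholder work; the real per-chunk routine would be called here.
    bridges[i]->Execute, 'result = total(chunk)', /nowait
  endfor
  for i = 0L, nBridges - 1 do begin
    while bridges[i]->Status() eq 1 do wait, 0.1   ; 1 = still executing
    print, i, bridges[i]->GetVar('result')
  endfor
  obj_destroy, bridges
end
```

Each bridge is a separate IDL process, so every `obj_new('IDL_IDLBridge')` call above starts one more child process. That is why the bridge count matters in what follows.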

Here is the interesting part. Upon upgrading to one 20 core Xeon processor (40 threads), our software runs just fine starting 40 bridge processes and crunching the image stack. Now, after adding the second processor (another 20 core Xeon) we are at 80 threads and our software exits when initiating the bridges. The last terminal output is:

% Loaded DLM: IDL_IDLBRIDGE

and it crashes closing everything.

I then tried compiling and running the code from the terminal instead of using the runtime version. I get one extra line of output after % Loaded DLM: IDL_IDLBRIDGE:

Aborted (core dumped)

This didn't seem much more informative (unless it means something to someone else?).

Next, I went back and recompiled the software so that, instead of querying how many CPUs are present and starting that many jobs, nCPUs is hard-coded to 48. Just like on the previous workstation, the software runs fine, starting 48 bridges and processing the image stack.
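
A cap would be less brittle than a hard-coded constant, since the same build would still use every thread on smaller machines. The following is a minimal sketch under stated assumptions: `bridge_count` and its `ncpu` keyword are hypothetical names (the keyword exists only so the logic can be exercised without the hardware), and any cap passed in is an empirical value for this machine, not a documented IDL limit.

```idl
; Hypothetical helper: choose how many bridges to start.
; maxBridges is an empirical cap; ncpu optionally overrides CPU detection.
function bridge_count, maxBridges, ncpu=ncpu
  compile_opt idl2
  ; !CPU.HW_NCPU is the number of CPUs IDL detects on this machine.
  if n_elements(ncpu) gt 0 then nHW = long(ncpu) else nHW = long(!CPU.HW_NCPU)
  ; Use the smaller of the two values, but never fewer than one bridge.
  return, (nHW < long(maxBridges)) > 1L
end
```

With this, `nCPUs = bridge_count(48)` would give 48 on the 80-thread workstation and 40 on a 40-thread one.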

Trying a moderately higher number, I set nCPUs = 64. Nope, it crashed, but for some reason it got farther this time. It started initiating the bridges:

bridge xx started
bridge xx set variables
bridge xx changed directory

etc...

until it hit bridge 30:

%XMANAGER: Caught unexpected error from client application. Message follows...
% IDL_IDLBRIDGE Error: Error executing asynchronous command.
% Execution halted at: READRAWLOOP_BRIDGE_TOP

I am not sure if this clues anyone in. It is weird that this error doesn't appear at bridge 30 when 48 cores are present and 48 loops are started on the workers.

Next, I compiled a series of builds with between 48 and 64 cores (loops). The magic number is 52; any more and the XMANAGER error comes back.

It is unfortunate that the workstation upgrade is netting me only 4 extra threads. I am not having this problem in Matlab (it is using all 40 cores for parallel processing).

I have been searching the net to see if this is a hard-coded limit in the IDL bridge (from IDL v6.4), but we have previously run this software on a cluster, where it scales to >1000 nodes just fine. Is this a specific bug in the multicore IDL_IDLBridge code in IDL6.4? I am worried that the problem is in the core of IDL_IDLBridge and not in our software, but I am open to other interpretations.

I know IDL6.4 is old, but the code is not compiling or working with newer versions of IDL (tried 8.3) and I don't have the time or expertise for a complete overhaul of "working" software.

Any suggestions would be welcome! I am desperate to find the source of the problem.

Best,
Schuyler
Re: IDL Bridge Failing When >52 Bridges are Built (IDL6.4) [message #93856 is a reply to message #93855] Tue, 08 November 2016 02:37
Markus Schmassmann
Messages: 129
Registered: April 2016
Senior Member
On 11/08/2016 12:41 AM, vanenges@colorado.edu wrote:
> I have image analysis software written in IDL, compiled in IDL6.4,
> and running under runtime license. This software is running on an Ubuntu
> (16.04) machine. I just updated my workstation to include two 20 core
> Xeon processors (80 potential threads in total). Previously, I had two
> 12 core Xeon processors (48 potential threads).
> I utilize IDL_IDLBridge commands to send fractions of a image stack
> to each cpu thread. Aside from all the hassle of getting IDL and licensing
> manager working in Ubuntu, the software is working nicely (as a side
> note we do however have problems with asynchronously terminating the
> bridge timers that we were unable to get an elegant solution for; as
> mentioned before on this forum).
>
> [...]
> Any suggestions would be welcomed! I am desperate to find the source of the problem.

first check whether using more threads than cores actually speeds up your
computing.
if not, report the problem to Harris and don't care about it further
if so, i can't help you, sorry :-(
Markus
Re: IDL Bridge Failing When >52 Bridges are Built (IDL6.4) [message #93860 is a reply to message #93856] Wed, 09 November 2016 08:34
vanenges
Messages: 2
Registered: November 2016
Junior Member
> first check whether using more threads than cores actually speed up your
> computing.
> if not, report the problem to Harris and don't care about it further
> if so, i can't help you, sorry :-( Markus

Hello Markus:

We have confirmed that running with more threads decreases processing time (a ~20% improvement). I feel this will be very significant for our overall processing time with 80 threads, as we have to repeat this for many subsequent image stacks just for one dataset.