comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Re: idl6.3 bug on Macintel? [message #50781] Wed, 18 October 2006 08:08
Karl Schultz
On Tue, 17 Oct 2006 11:47:18 -0500, Christopher Thom wrote:

> Quoth Kenneth Bowman:
>
>>> I have no idea even how to begin tracing down the problem, or reporting
>>> the bug to RSI...so any advice most welcome!
>>>
>>> cheers
>>> chris
>>
>> There is a known problem. See this tech note
>>
>> http://www.ittvis.com/services/techtip.asp?ttid=4081
>
> Ahhhhh. Thanks for the pointer. I searched the tech tips for my error
> message, but didn't turn up anything.

There is no error message listed in the tech tip because an OS-induced
memory corruption error can manifest itself in many ways:

1) Application crashes. No error messages, only a crash log.
2) IDL could issue any one of dozens of error messages.
3) No error message at all, only silent data corruption. This is probably
the worst-case situation because you may not know there's a problem and
your app could give you plausible, yet incorrect results. In fact, this
is how we found the problem here at ITTVIS: we noticed that an image had a
few incorrect pixels in the ENVI application.

Faithful followers of this newsgroup may recall a similar issue with the
ppc architecture back in the OS X 10.1 or 10.2 days. The memcpy() function
is implemented with Altivec instructions to speed it up. The OS X signal
handler mechanism didn't save/restore the Altivec registers when handling
the signal. If the signal handler code used memcpy(), it would change the
state of the Altivec registers and when the main thread resumed execution
of the memcpy, the wrong stuff was in the Altivec registers, and plop!

(IDL uses a couple of signal handlers for various things, one of them
being repairing graphics windows periodically)

memcpy() in OS X for the Intel architecture uses the MMX registers for
performance. And the signal handler was failing to save/restore the MMX
registers. The situation is slightly worse here because the MMX registers
and the floating point stack regs share the same register file on these
chips. So, if the signal handler does a memcpy() or executes an FP
operation, the MMX register state changes, and the memcpy operation in the
main thread is compromised.
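
To make the failure mode concrete, here is a toy model of it. This is only a sketch: no real signals or MMX registers are involved, and the names shared_reg, toy_handler and toy_copy are made up for illustration. A single global stands in for a register that both the copy loop and the handler use, and the save_restore flag stands in for the OS preserving that register around the handler.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: one global stands in for a register shared by the copy
   loop and the signal handler. No real signals or MMX state are involved. */
static unsigned char shared_reg;

/* Stand-in for a handler that happens to use the shared register. */
static void toy_handler(void)
{
    shared_reg = 0xFF;
}

/* Copy n bytes through shared_reg. A simulated signal arrives between the
   load and the store of element interrupt_at. If save_restore is nonzero,
   the register is preserved around the handler, as a correct OS would do.
   Returns the number of bytes in dst that differ from src afterwards. */
size_t toy_copy(unsigned char *dst, const unsigned char *src, size_t n,
                size_t interrupt_at, int save_restore)
{
    size_t i, bad = 0;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < n; i++) {
        shared_reg = src[i];              /* load into the "register" */
        if (i == interrupt_at) {
            unsigned char saved = shared_reg;
            toy_handler();                /* handler clobbers the register */
            if (save_restore)
                shared_reg = saved;       /* what the OS should have done */
        }
        dst[i] = shared_reg;              /* store from the "register" */
    }
    for (i = 0; i < n; i++)
        if (dst[i] != src[i])
            bad++;
    return bad;
}
```

With an all-zero source and the simulated signal at index 3, calling toy_copy with save_restore set to 0 leaves exactly one wrong byte in dst, and calling it with save_restore set to 1 leaves none. That single bad byte is the toy equivalent of the few incorrect pixels mentioned above.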

Here's a minimal C program that demonstrates the problem.

/*
Compile with:

cc -o pgm pgm.c

If the return value from memcmp is not zero, then the problem exists.
*/
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>   /* memcmp */
#include <strings.h>  /* bcopy, bzero */
#include <sys/time.h>

/* Both variables are written from the signal handler, hence volatile. */
volatile sig_atomic_t nTimerPops = 0;
volatile float f = 1;

static void timer_sigalrm_handler(int signo)
{
    (void)signo;
    nTimerPops++;
    f = 1.000001 * f; // delete this line and bcopy works ok.
}

int main(int argc, char **argv)
{
    char *src, *dst;
    int n = 10 * 4096 * 4096;
    int i, rc;
    struct itimerval interval;

    /* init data that we will bcopy repeatedly */
    src = (char*)malloc(n);
    dst = (char*)malloc(n);
    if (src == NULL || dst == NULL) {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }
    bzero(src,n);

    /* set up timer and signal handler: SIGALRM every 10 ms */
    interval.it_value.tv_sec = 0;
    interval.it_value.tv_usec = 10000;
    interval.it_interval.tv_sec = 0;
    interval.it_interval.tv_usec = 10000;
    signal(SIGALRM, timer_sigalrm_handler);
    rc = setitimer(ITIMER_REAL, &interval, NULL);
    if (rc != 0) {
        perror("setitimer");
        return 1;
    }

    /* start bcopy operations */
    for(i=0; i<50; i++) {
        bcopy(src,dst,n);
        rc = memcmp(src,dst,n);
        printf("iter=%d memcmp=%d num pops=%d\n", i, rc, (int)nTimerPops);
    }

    free(src);
    free(dst);
    return 0;
}


It's both entertaining and disturbing to run this on 10.4.7 (intel) and
watch the failures. The good news is that it is fixed in 10.4.8.

> Now my dilemma...I just spent 2 days re-installing 10.4.7 after discovering
> all my X apps suddenly didn't work. The proverbial rock and a hard place
> :-(

Understood. The tech tip does a good job of explaining your options. We
reported the X app problem to Apple and they said it was a duplicate, so
others have also reported the same problem. All we can do is hope that
10.4.9, or a patch for this specific problem, comes out soon.

Karl
Re: idl6.3 bug on Macintel? [message #50804 is a reply to message #50781] Tue, 17 October 2006 09:47
Christopher Thom
Quoth Kenneth Bowman:

>> I have no idea even how to begin tracing down the problem, or reporting
>> the bug to RSI...so any advice most welcome!
>>
>> cheers
>> chris
>
> There is a known problem. See this tech note
>
> http://www.ittvis.com/services/techtip.asp?ttid=4081

Ahhhhh. Thanks for the pointer. I searched the tech tips for my error
message, but didn't turn up anything.

Now my dilemma...I just spent 2 days re-installing 10.4.7 after discovering
all my X apps suddenly didn't work. The proverbial rock and a hard place
:-(

cheers
chris
Re: idl6.3 bug on Macintel? [message #50805 is a reply to message #50804] Tue, 17 October 2006 09:14
K. Bowman
In article <Pine.SOC.4.64.0610171044180.29503@oddjob.uchicago.edu>,
Christopher Thom <cthom@oddjob.uchicago.edu> wrote:

> Hi,
>
> I'm getting a semi-recurring bug in idl6.3 on my intel mac (OSX 10.4.7),
> and wondering if anyone else has encountered it? IDL crashes with the
> following error message:
>
> % Array has a corrupted descriptor: <No name>.
>
> Google doesn't have anything recent or seemingly relevant. The crash is
> always on the same line of code:
>
> r=median(x)
>
> where x is a large float array
>
> IDL> help,x
> X FLOAT = Array[2093381]
>
> The problem is intermittent, in that it always crashes on the same line of
> code, but not on any given image, while processing a series of images.
>
> I have no idea even how to begin tracing down the problem, or reporting
> the bug to RSI...so any advice most welcome!
>
> cheers
> chris

There is a known problem. See this tech note

http://www.ittvis.com/services/techtip.asp?ttid=4081

Ken Bowman
Re: idl6.3 bug on Macintel? [message #50881 is a reply to message #50781] Thu, 19 October 2006 09:33
Christopher Thom
Quoth Karl Schultz:

> Faithful followers of this newsgroup may recall a similar issue with the
> ppc architecture back in the OS X 10.1 or 10.2 days. The memcpy()
> function is implemented with Altivec instructions to speed it up. The
> OS X signal handler mechanism didn't save/restore the Altivec registers
> when handling the signal. If the signal handler code used memcpy(), it
> would change the state of the Altivec registers and when the main thread
> resumed execution of the memcpy, the wrong stuff was in the Altivec
> registers, and plop!
>
> (IDL uses a couple of signal handlers for various things, one of them
> being repairing graphics windows periodically)
>
> memcpy() in OS X for the Intel architecture uses the MMX registers for
> performance. And the signal handler was failing to save/restore the MMX
> registers. The situation is slightly worse here because the MMX
> registers and the floating point stack regs share the same register file
> on these chips. So, if the signal handler does a memcpy() or executes an
> FP operation, the MMX register state changes, and the memcpy operation
> in the main thread is compromised.

Thanks, Karl, for some excellent information. It is an incredibly
frustrating exercise in futility attempting to get any kind of
information from Apple (and their "technical" support, who are apparently
terrified of the mention of registers or library linking). The only decent
information I've had so far is from your post and the IDL tech tip. 'Tis
much appreciated!

> Understood. The tech tip does a good job of explaining your options.
> We reported the X app problem to Apple and they said it was a duplicate,
> so others have also reported the same problem. All we can do is hope
> that 10.4.9, or a patch for this specific problem, comes out soon.

Yeah, the options in the tech tip were quite helpful. Since I'm reliant on
my ppc X11 apps, and really need IDL to be working smoothly, I opted for
the "device, /notimer" solution, which seems to be doing the job. I'm
2/3 of the way through my data reduction without a hitch. Kudos!

Fingers crossed for a patch from Apple soon, but I suspect that it's been
dropped into the "wait for 10.5" basket...

cheers
chris