comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

findfile gives 'Array has a corrupted descriptor' error [message #85335] Thu, 25 July 2013 17:44
b_gom
I am running IDL 8.2.3 on Win7 64-bit. I have an older program that uses findfile() to recursively find a set of filenames with a wildcard. I realize findfile is obsolete, but it runs *much* faster than file_search. When findfile returns more than ~5000 files, however, I get the following error:

found=file_search(uval.path+'*',count=count,/mark_dir)
% Array has a corrupted descriptor: FOUND.

Any ideas what is causing the error?
Assuming that this is a bug that will not be fixed, does anyone have a fast alternative to file_search?

Thanks
Re: findfile gives 'Array has a corrupted descriptor' error [message #85336 is a reply to message #85335] Thu, 25 July 2013 18:05
David Fanning
b_gom@hotmail.com writes:

> I am running IDL 8.2.3 on Win7 64-bit. I have an older program that uses findfile() to recursively find a set of filenames with a wildcard. I realize findfile is obsolete, but it runs *much* faster than file_search. When findfile returns more than ~5000 files, however, I get the following error:
>
> found=file_search(uval.path+'*',count=count,/mark_dir)
> % Array has a corrupted descriptor: FOUND.
>
> Any ideas what is causing the error?

A bug in FindFile that occurs at about this number of files.

> Assuming that this is a bug that will not be fixed, does anyone have a fast alternative to file_search?

No, sorry. :-)

Cheers,

David

P.S. Do you know the old joke about computers making very fast, very
accurate mistakes?



--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.idlcoyote.com/
Sepore ma de ni thue. ("Perhaps thou speakest truth.")
Re: findfile gives 'Array has a corrupted descriptor' error [message #85337 is a reply to message #85335] Fri, 26 July 2013 05:54
Phillip Bitzer
On Thursday, July 25, 2013 7:44:55 PM UTC-5, b_...@hotmail.com wrote:
> I am running IDL 8.2.3 on Win7 64-bit. I have an older program that uses findfile() to recursively find a set of filenames with a wildcard. I realize findfile is obsolete, but it runs *much* faster than file_search. When findfile returns more than ~5000 files, however, I get the following error:
>
>
> found=file_search(uval.path+'*',count=count,/mark_dir)
>

Are you looking for all files in recursive directories? If so, try this on for size:

found = file_search(uval.path, '*', count=count,/mark_dir)
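
For context, a minimal sketch of how the two calling forms differ, using the current working directory as a stand-in for uval.path (an assumption made only for illustration). With a single argument, file_search expands the wildcard in that one directory; with two arguments, it searches the directory and everything below it:

```idl
; Use the current working directory as the starting point.
cd, current=root

; One argument: ordinary wildcard match in root only (not recursive).
top_only = file_search(root + path_sep() + '*', count=n_top, /mark_directory)

; Two arguments: recursive search of root and all of its subdirectories.
everything = file_search(root, '*', count=n_all, /mark_directory)

print, n_top, n_all
```

On a deep tree the second call can return far more entries than the first, so the two counts are a quick way to check which behaviour a given call has.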
Re: findfile gives 'Array has a corrupted descriptor' error [message #85338 is a reply to message #85335] Fri, 26 July 2013 08:29
wlandsman
On Thursday, July 25, 2013 8:44:55 PM UTC-4, b_...@hotmail.com wrote:

> Assuming that this is a bug that will not be fixed, does anyone have a fast alternative to file_search?

Not a direct answer but I do notice on the Mac that file_search() is slow only on the first call:

IDL> tic & a = file_search('.','*.pro',/nosort) & toc
% Time elapsed: 41.371398 seconds.

IDL> tic & a = file_search('.','*.pro',/nosort) & toc
% Time elapsed: 0.45945001 seconds.


So file_search() was of order 100 times faster on the second call. This is similar to the Unix find command, which also runs much faster on repeated searches, presumably because the operating system caches the directory information after the first pass.
(I included /nosort because that is supposed to speed things up somewhat, but it seemed to make little difference on the Mac.)

If your recursive search includes a lot of unnecessary directories, then it might be quicker to use a vector of plausible directories in your file_search() call, rather than searching every directory below the specified one. --Wayne
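
A minimal sketch of the vector-of-directories idea, assuming three hypothetical subdirectory names (lib, src, examples) that are known to hold the files of interest. The recursive form of file_search accepts a string array as its first argument, so the search can be limited to those branches:

```idl
; Hypothetical subdirectories known to contain .pro files.
dirs = ['lib', 'src', 'examples']

; Recursive search restricted to those directories instead of '.'.
tic
found = file_search(dirs, '*.pro', count=count, /nosort)
toc

print, count, ' files found'
```

Whether this helps depends on how much of the tree the skipped directories account for; timing both forms with tic and toc on the real tree is the only reliable check.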
Re: findfile gives 'Array has a corrupted descriptor' error [message #85339 is a reply to message #85338] Fri, 26 July 2013 08:34
David Fanning
The speed up doesn't seem to be so pronounced on Windows:

IDL> tic & a = file_search('.','*.pro',/nosort) & toc
Elapsed Time: 9.873000
IDL> tic & a = file_search('.','*.pro',/nosort) & toc
Elapsed Time: 5.622000
IDL> tic & a = file_search('.','*.pro',/nosort) & toc
Elapsed Time: 5.600000

This command found 5128 files.

Cheers,

David

Re: findfile gives 'Array has a corrupted descriptor' error [message #85340 is a reply to message #85339] Fri, 26 July 2013 10:34
b_gom
The program in question is an old compound widget that has been working happily up until the last IDL release. I've noticed more IDL crashes with the last release, but this is the only one that has an obvious cause.

The issue with this widget is that it traverses a directory tree and builds a tree widget with the directories and any files matching a search string. This is being done with a recursive function that builds the tree nodes as it goes, which means *many* calls to findfile(). The only way file_search() would work is if I use it to return the entire directory structure, and parse the result to build the tree, which would mean a major rewrite. Being lazy, I was hoping there was a working equivalent to findfile(). Sigh.
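
A rough sketch of the single-call approach described above, not the widget's actual code: one recursive file_search returns every match, and the results are then grouped by parent directory. The procedure name group_matches_by_dir is hypothetical, and the print statement stands in for whatever would create the tree nodes. Directories that contain no matching files do not appear in the output.

```idl
pro group_matches_by_dir, root, pattern
  compile_opt idl2

  ; One recursive call returns every match below root.
  files = file_search(root, pattern, count=nfiles)
  if nfiles eq 0 then return

  ; Parent directory of each match.
  parents = file_dirname(files)

  ; Unique parent directories; each would become one branch of the tree.
  udirs = parents[uniq(parents, sort(parents))]

  for i = 0, n_elements(udirs)-1 do begin
    ; Indices of the matches that live in this directory.
    in_dir = where(parents eq udirs[i], n_in_dir)
    print, udirs[i], n_in_dir, format='(a, ": ", i0, " match(es)")'
    ; Tree nodes for files[in_dir] would be created here.
  endfor
end
```

It could be called as group_matches_by_dir, '.', '*.pro' once compiled.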



Re: findfile gives 'Array has a corrupted descriptor' error [message #85341 is a reply to message #85340] Fri, 26 July 2013 12:54
b_gom
Well, I've optimized the widget code and managed to reduce the number of calls to file_search to the bare minimum, but the show-stopping issue is that file_search is basically unusable on network shares (CIFS/SMB).

For example, file_search takes 26 seconds (!!) to list a folder with ~7000 files:

IDL> tic & found=file_search('U:\somenetworkshare\*',count=count) & toc
% Time elapsed: 26.115000 seconds.
IDL> tic & found=file_search('U:\somenetworkshare\*',count=count) & toc
% Time elapsed: 26.052000 seconds.
IDL> tic & found=file_search('\\server\pathtoshare\*',count=count) & toc
% Time elapsed: 26.110000 seconds.

Whereas findfile does the same job in no time (except that at random times it crashes with an 'Array has a corrupted descriptor' fault):

IDL> tic & found=findfile('U:\somenetworkshare\*',count=count) & toc
% Time elapsed: 0.63899994 seconds.

What in the world is file_search doing?




Re: findfile gives 'Array has a corrupted descriptor' error [message #85342 is a reply to message #85341] Fri, 26 July 2013 16:51
b_gom
Some further information:

When testing on a Linux system, accessing the same CIFS share, I get the following:

IDL> tic & found=file_search('/somenetworkshare/*',count=count) & toc
% Time elapsed: 0.25748897 seconds.
IDL> tic & found=file_search('/somenetworkshare/*',count=count) & toc
% Time elapsed: 0.26749086 seconds.

For Linux, it seems that findfile is slower than file_search, but still consistent with the Windows results:
IDL> tic & found=findfile('/somenetworkshare/*',count=count) & toc
% Time elapsed: 0.54775500 seconds.


Re: findfile gives 'Array has a corrupted descriptor' error [message #85401 is a reply to message #85335] Tue, 30 July 2013 10:16
b_gom
Forgive the sin of continuously replying to my own post, but here is the workaround I've used. Spawning the 'dir' command takes much less time than file_search on network shares with many files:

IDL> tic & spawn, 'dir U:\somenetworkshare\* /b /aD',found,/hide & toc
% Time elapsed: 0.32800007 seconds.
IDL> tic & found=file_search('U:\somenetworkshare\*',count=count,/nosort) & toc
% Time elapsed: 26.066000 seconds.

So, I wrote a wrapper for the file_search function that determines if the Windows OS is in use, and if the VM mode is not in use, and then does the following:

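function listfiles,path,pattern,count=count,_extra=e
  ; Default to matching everything in the directory.
  if n_elements(pattern) eq 0 then pattern='*'
  ; Forced to use the slow file_search version in VM mode.
  if LMGR(/VM) then begin
    return,file_search(path+pattern,/test_regular,count=count,_extra=e)
  endif
  case strupcase(!version.os_family) of
    'WINDOWS':begin
      ; /b = bare names, /a-D = exclude directories, /ON = sort by name.
      ; The path is quoted so that folder names with spaces still work.
      spawn, 'dir "'+path+pattern+'" /b /a-D /ON',result,/hide
      count = (result[0] eq '') ? 0 : n_elements(result)
      ; Like file_search, return a null string when nothing matches.
      if count eq 0 then return,''
      return,file_dirname(path+pattern,/mark)+result
    end
    else:begin
      ; Unix and anything else: file_search is fast enough here.
      return,file_search(path+pattern,/test_regular,count=count,_extra=e)
    end
  endcase
end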


P.S., I've also found that file_search is slow to return a large list of file matches (>~5000) from a given directory, but not when returning a short list of matches from the same directory. For example, in a directory of ~7000 files, file_search(path+'*') takes around 26 seconds, but file_search(path+'*.txt') returns in 0.3 seconds if there are only a few .txt files.
So, the above workaround actually costs a bit more time for cases where only small file lists are expected.
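
A short usage sketch for the wrapper above, reusing the placeholder share path from the earlier timings. Note the trailing backslash: listfiles concatenates path and pattern directly, so the path has to end with a separator.

```idl
; Placeholder share path from the timings above; keep the trailing separator.
path = 'U:\somenetworkshare\'

; Time a narrow pattern, where plain file_search is already fast.
tic
files = listfiles(path, '*.txt', count=count)
toc

print, count, ' matches'
```

Running the same timing with the pattern '*' would show whether the spawned dir command pays off for the large listings described above.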

