comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

file_search problem [message #92352] Wed, 25 November 2015 01:09
greg.addr
I have a directory (on Windows 7) containing some filenames with Romanian special characters, e.g.

ă.txt
în.txt


The OS 'dir' command shows them...

25/11/2015 09:41 <DIR> .
25/11/2015 09:41 <DIR> ..
25/11/2015 09:31 0 în.txt
25/11/2015 09:31 0 ă.txt
2 File(s) 0 bytes
2 Dir(s) 1,043,905,544,192 bytes free


file_search(path,"*.txt") returns the file with the 'î' but doesn't see the file with 'ă' at all.


cheers,
Greg
Re: file_search problem [message #92353 is a reply to message #92352] Wed, 25 November 2015 07:45
Jim Pendleton
A work-around is to use the old FINDFILE function, but you should report this to support at exelisvis.com.

A substantial amount of work was put into I18N a few releases ago, but it looks like this is a special case.

Interesting...

IDL> print, byte('î')
238
IDL> print, string(238b)
î

...however,

IDL> print, byte('ă')
97
IDL> print, string(97b)
a

Jim P.
Re: file_search problem [message #92354 is a reply to message #92353] Wed, 25 November 2015 07:58
Lajos Foldy
I think the first one is in extended ASCII (0-255) and the second one is a true Unicode character.
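
The distinction can be checked outside IDL. The sketch below uses Python purely as a neutral tool and says nothing about IDL's internals. It assumes the machine's ANSI code page is Windows-1252, which the thread does not state. 'î' is U+00EE and has a single-byte slot in that code page, while 'ă' is U+0103 and has none. The helper name `cp1252_bytes` is made up for illustration.

```python
# Assumption: the ANSI code page on the affected machine is Windows-1252.
def cp1252_bytes(ch):
    """Return the cp1252 byte values for ch, or None if it cannot be encoded."""
    try:
        return list(ch.encode("cp1252"))
    except UnicodeEncodeError:
        return None

for ch in ("î", "ă"):
    # Show the character, its Unicode code point, and its cp1252 bytes (if any).
    print(repr(ch), hex(ord(ch)), cp1252_bytes(ch))
```

A character with no slot in the code page has to be dropped or replaced by something else, which would be consistent with the 97 ('a') seen above.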

Are there any Unicode string support plans for IDL?

regards,
Lajos
Re: file_search problem [message #92355 is a reply to message #92354] Wed, 25 November 2015 09:05
Heinz Stege
Let me add that this conversion seems to take place during string
input. No conversion happens if the string "is really UTF-8":

IDL> a=['C3'xb,'AE'xb]
IDL> print,a
195 174
IDL> print,string(a)
î
IDL> print,byte(string(a))
195 174
IDL> b=['C4'xb,'83'xb]
IDL> print,string(b)
a
IDL> print,byte(string(b))
196 131

I hope my news agent will choose the correct charset (UTF-8)! I'm not
sure.
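
For reference, the two byte pairs in the session above are the UTF-8 encodings of 'î' (U+00EE) and 'ă' (U+0103). A minimal check in Python, used here only as an independent tool and not as a statement about how IDL handles strings:

```python
# Decode the byte pairs from the IDL session as UTF-8.
i_circ = bytes([0xC3, 0xAE]).decode("utf-8")
a_breve = bytes([0xC4, 0x83]).decode("utf-8")

print(i_circ, hex(ord(i_circ)))
print(a_breve, hex(ord(a_breve)))

# Going the other way reproduces the byte values IDL printed.
print(list("î".encode("utf-8")))
print(list("ă".encode("utf-8")))
```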

Cheers, Heinz
Re: file_search problem [message #92356 is a reply to message #92355] Wed, 25 November 2015 12:38
greg.addr
Thanks, everyone, for the comments. I've found that the same does happen for other non-ASCII characters (Polish, this time):


Directory of D:\tmp\test

25/11/2015 21:05 <DIR> .
25/11/2015 21:05 <DIR> ..
25/11/2015 09:31 0 în.txt
25/11/2015 09:31 0 ă.txt
25/11/2015 21:04 0 ą.txt
25/11/2015 21:04 0 ł.txt
4 File(s) 0 bytes


file_search gives:

IDL> file_search("d:\tmp\test\","*.*")
D:\tmp\test\în.txt

and findfile gives:

IDL> findfile("d:\tmp\test\*.txt")
d:\tmp\test\în.txt
d:\tmp\test\a.txt
d:\tmp\test\a.txt
d:\tmp\test\l.txt

So findfile sees the files, although the extended characters are simplified. However, the files are not identifiable through the simplified names:

IDL> a=findfile("d:\tmp\test\*.txt")
IDL> print,(file_info(a[3])).exists
0
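
A rough illustration of why the simplified names cannot be used to reach the files: the simplification is lossy and not reversible. The Python sketch below approximates it by stripping combining marks after Unicode NFKD decomposition. That is only an assumption about the kind of mapping involved, not what Windows or IDL actually does (it leaves 'ł' untouched, for instance), and `strip_marks` is a made-up helper.

```python
import unicodedata

def strip_marks(name):
    """Drop combining marks after NFKD decomposition.

    Only an approximation of the simplification seen in the thread.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

real_names = ["ă.txt", "ą.txt"]
simplified = [strip_marks(n) for n in real_names]

# Both real names collapse to the same simplified name...
print(simplified)
# ...and that simplified name is not one of the names on disk.
print(simplified[0] in real_names)
```

Two different files mapping to the same 'a.txt' matches the duplicate entry in the findfile listing, and neither file is actually called 'a.txt'.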

This doesn't work either...

IDL> spawn,"dir /b /s d:\tmp\test\*.txt",res,err
IDL> res
d:\tmp\test\Œn.txt
d:\tmp\test\a.txt
d:\tmp\test\a.txt
d:\tmp\test\l.txt

And surprisingly (to me!), even this fails:

IDL> spawn,"dir /b /s *.txt >dir.txt",res,err

...with dir.txt containing the same

d:\tmp\test\Œn.txt
d:\tmp\test\a.txt
d:\tmp\test\a.txt
d:\tmp\test\l.txt

which is not the fault of IDL.
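
The 'Œ' in place of 'î' looks like a code page mismatch rather than data loss. Assuming the console writes in OEM code page 850 and the result is read back as Windows-1252 (neither is confirmed in the thread), the byte for 'î' in the first is the byte for 'Œ' in the second. A small Python check of that assumption:

```python
# Assumptions: console OEM code page 850, reader using Windows-1252.
raw = "î".encode("cp850")

# The single byte the console would emit for 'î'.
print(raw.hex())
# The same byte interpreted as Windows-1252.
print(raw.decode("cp1252"))
```

Under the same assumptions, 'ă', 'ą' and 'ł' have no slot in code page 850 at all, so the console has to substitute something for them before the output ever reaches IDL.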

cheers,
Greg