comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Re: Unicode Question [message #46872] Sat, 07 January 2006 21:40
mitch grunes
Oops. Based on

http://www.unicode.org/versions/Unicode4.0.0

There are more characters than 16 bits can account for. Scrap everything I
said. It used to be true.
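
To make the correction above concrete, here is a small sketch in Python (chosen only because Python already appears in this thread). The two sample characters are illustrations, not taken from any poster's file. It shows that a code point above U+FFFF does not fit in one 16-bit unit and is encoded in UTF-16 as a surrogate pair of two units.

```python
# Sample characters picked only for illustration; neither comes from the thread.
bmp_char = "\u00e9"          # U+00E9, fits in 16 bits
astral_char = "\U0001D11E"   # U+1D11E, above U+FFFF


def utf16_units(ch):
    """Return the 16-bit code units used to encode ch as big-endian UTF-16."""
    raw = ch.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


bmp_units = utf16_units(bmp_char)
astral_units = utf16_units(astral_char)

print([hex(u) for u in bmp_units])     # one 16-bit unit
print([hex(u) for u in astral_units])  # two 16-bit units (a surrogate pair)
```

So a reader that assumes "one character is two bytes" will split such characters in half.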
Re: Unicode Question [message #46876 is a reply to message #46872] Sat, 07 January 2006 14:57
R.Bauer
Hi all

I think it's time to file a feature request with RSI.

In Python it's done this way:

import codecs
f = codecs.open("file.txt","rb","utf8").read()


Are there any known plans for when UTF-8 support will be added to IDL?


cheers
Reimar
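
As a rough, self-contained illustration of the Python approach mentioned above, the sketch below writes a few UTF-8 bytes to a temporary file and reads them back through `codecs.open`. The sample text and the temporary file are assumptions made for the example; they stand in for the "file.txt" in the post.

```python
import codecs
import os
import tempfile

sample = "caf\u00e9 \u03a9"  # illustrative text, not from the thread

# Write the sample as raw UTF-8 bytes to a temporary file
# (a hypothetical stand-in for "file.txt").
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(sample.encode("utf-8"))

# Read it back; codecs.open decodes the UTF-8 bytes into text.
with codecs.open(path, "r", "utf8") as f:
    text = f.read()
os.remove(path)

raw_len = len(sample.encode("utf-8"))
print(len(text), raw_len)  # character count vs. byte count
```

The character count and the byte count differ because UTF-8 uses a variable number of bytes per character, which is why a byte-at-a-time reader needs to know the encoding.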

Re: Unicode Question [message #46886 is a reply to message #46876] Fri, 06 January 2006 13:30
David Fanning
grunes@yahoo.com writes:

> First, you might look at
>
> http://www.unicode.org
>
> to see what unicode codes are.
>
> Don't forget that some people write the ASCII subset in 8 bits, others
> include a null byte to make it 16.
>
> Open the file and read one 8-bit code in the usual way:
> a=0b & b=a
> openr,1,'yourfilename'
> readu,1,a
>
> Then if a is 0, drop it. If not, and it is a legit ASCII char, rather
> than one of the Unicode prefixes, as you can determine from its range,
> the character is string(a). Otherwise,
> readu,1,b
> and the character will be in the string
> string([a,b])
> Of course, that string is two bytes long - which is right for Unicode.
>
> I haven't checked this out, as I don't have a licensed IDL where I am
> now, but it should work.

Yeah, that's kinda what I thought, too. But I'm not so sure
it is as simple as this anymore. :-)

But I am handicapped by not having the actual file, too.
I really was just wondering if anyone had any experience
with this. My suggestions are still resulting in a lot of
*&%^$ type of nonsense.

Cheers,

David

--
David Fanning, Ph.D.
Fanning Software Consulting, Inc.
Coyote's Guide to IDL Programming: http://www.dfanning.com/
Re: Unicode Question [message #46887 is a reply to message #46886] Fri, 06 January 2006 13:16
mitch grunes
First, you might look at

http://www.unicode.org

to see what Unicode codes are.

Don't forget that some people write the ASCII subset in 8 bits, others
include a null byte to make it 16.

Open the file and read one 8-bit code in the usual way:
a=0b & b=a
openr,1,'yourfilename'
readu,1,a

Then if a is 0, drop it. If not, and it is a legit ASCII char, rather
than one of the Unicode prefixes, as you can determine from its range,
the character is string(a). Otherwise,
readu,1,b
and the character will be in the string
string([a,b])
Of course, that string is two bytes long - which is right for Unicode.

I haven't checked this out, as I don't have a licensed IDL where I am
now, but it should work. I'll let you figure out the prefix codes.
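
One possible reading of "prefix codes" is the byte-order mark (BOM) that some Unicode files start with; that is an assumption, since the post does not say which prefixes are meant. Under that assumption, a minimal Python sketch of picking the encoding from the first bytes could look like this. The function name `decode_with_bom` and the fallback to UTF-8 are hypothetical choices for the example.

```python
import codecs


def decode_with_bom(data, default="utf-8"):
    """Decode bytes, using a leading byte-order mark to pick the encoding.

    Hypothetical helper: falls back to `default` when no BOM is found.
    It does not distinguish UTF-32, whose little-endian BOM begins with
    the same two bytes as the UTF-16 little-endian BOM.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    return data.decode(default)


# Usage with made-up byte strings:
print(decode_with_bom(codecs.BOM_UTF16_LE + "Hi".encode("utf-16-le")))
print(decode_with_bom(b"plain ASCII"))
```

A file with no BOM still has to be decoded by convention or by guessing, which may be one source of the garbage characters described elsewhere in this thread.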