comp.lang.idl-pvwave archive
Messages from Usenet group comp.lang.idl-pvwave, compiled by Paulo Penteado

Re: byte/unicode mismatch [message #63933 is a reply to message #63772] Mon, 24 November 2008 05:38
Allan Whiteford
Reimar Bauer wrote:
> Allan Whiteford schrieb:
>> Reimar Bauer wrote:
>>> That is all orthogonal.
>>>
>>> How can I decode and how can I encode?
>>>
>>> cheers
>>> Reimar
>>>
>> Reimar,
>>
>> The question (and answer) isn't all that straightforward: byte values
>> over 127 aren't well defined without an encoding system or a codepage.
>>
>> However, the answer you're probably looking for is:
>>
>> b=byte('ü') ; assumption 2
>> print,b[1]+(b[0] eq 195)*64 ; assumption 1
>>
>> which is assuming:
>>
>> 1) you want byte values from (two byte) UTF-8 to ISO-8859-1
>>
>> and
>>
>> 2) that the u-umlaut character has entered the interpreter from a UTF-8
>> environment.
>>
>> Please don't just cut and paste the above assuming all will be well.
>>
>> Thanks,
>>
>> Allan
>>
>
> Hmm, this confuses me more. Let's see if another example helps me.
>
> If I write an output file using the ide e.g.
>
> openw, 10, 'testfile.txt'
> printf, 10, 'Jülich'
> close, 10
>
> If I run this program with iso encoding isn't the result different to utf-8?
>

Yes, copying and pasting that code into an IDL interpreter using a UTF-8
environment/editor will give a different output file to using one
without such awareness.
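
To make the difference concrete, here is a minimal sketch in Python rather than IDL, since Python lets the encoding be named explicitly. It only shows which bytes the same text turns into under each encoding; the variable names are made up for illustration, and it says nothing about how IDL itself handles the string.

```python
# Sketch: the bytes produced for the same text under two encodings.
# The u-umlaut is written as an escape so that the encoding of this
# source file does not matter.
text = "J\u00fclich"  # "Jülich"

utf8_bytes = text.encode("utf-8")         # what a UTF-8 environment would write
latin1_bytes = text.encode("iso-8859-1")  # what an ISO-8859-1 environment would write

print(list(utf8_bytes))    # the u-umlaut is the two bytes 195, 188
print(list(latin1_bytes))  # the u-umlaut is the single byte 252
```

The two files differ by one byte in length, and every other character is identical because both encodings agree on the ASCII range.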

> Or how can I write it iso encoded independent from the user setting?

I would have said check to see if n_elements(byte("Jülich")) was the
same as strlen("Jülich") to see if things were UTF-8 or not but it seems
the IDL strlen function actually just counts bytes (I don't think it
should do this).
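
For comparison, a sketch of the same idea in Python, where the length of a string is counted in characters, so the length check does work there. The second function generalises the b[1]+(b[0] eq 195)*64 arithmetic quoted above to a whole byte string. Both function names are hypothetical, and the converter assumes the input really is UTF-8 containing nothing beyond what ISO-8859-1 can represent (lead bytes 194 and 195 only); anything else raises an error rather than guessing.

```python
def looks_multibyte(text: str, encoding: str) -> bool:
    """True if `text` needs more bytes than characters in `encoding`."""
    return len(text.encode(encoding)) != len(text)


def two_byte_utf8_to_latin1(raw: bytes) -> bytes:
    """Convert UTF-8 bytes to ISO-8859-1 bytes by hand.

    Handles ASCII and two-byte sequences with lead byte 194 or 195 only,
    which covers code points 128..255.
    """
    out = bytearray()
    i = 0
    while i < len(raw):
        lead = raw[i]
        if lead < 128:
            # Plain ASCII is identical in both encodings.
            out.append(lead)
            i += 1
        elif lead in (194, 195) and i + 1 < len(raw) and 128 <= raw[i + 1] <= 191:
            # Same arithmetic as the one-liner: add 64 when the lead byte is 195.
            out.append(raw[i + 1] + (lead == 195) * 64)
            i += 2
        else:
            raise ValueError(f"byte {lead} at offset {i} is outside the assumed range")
    return bytes(out)


name = "J\u00fclich"  # "Jülich"
print(looks_multibyte(name, "utf-8"))       # byte count differs from character count
print(looks_multibyte(name, "iso-8859-1"))  # one byte per character
print(list(two_byte_utf8_to_latin1(name.encode("utf-8"))))
```

Feeding the converter bytes that are already ISO-8859-1 raises ValueError, because a lone byte 252 is not valid in the assumed input, which is one crude way of noticing the wrong assumption.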

I'm not sure there is an elegant solution to this problem. In any case,
I'm about to lose my free wi-fi.

Thanks,

Allan

> In python I have several methods for that.
> http://effbot.org/zone/unicode-objects.htm
>
> cheers
> Reimar
>