Re: byte/unicode mismatch [message #63999 is a reply to message #63933]

Tue, 25 November 2008 05:03

R.Bauer

I have forwarded a feature request to Creaso for an encoding/decoding
parameter for open, and had a phone call about it five minutes ago. Let's see.

Reimar

Allan Whiteford wrote:
> Reimar Bauer wrote:
>> Allan Whiteford wrote:
>>> Reimar Bauer wrote:
>>>> That is all orthogonal.
>>>>
>>>> How can I decode and how can I encode?
>>>>
>>>> cheers
>>>> Reimar
>>>>
>>> Reimar,
>>>
>>> The question (and answer) isn't all that straightforward: byte values
>>> over 127 aren't well defined without an encoding system or a codepage.
>>>
>>> However, the answer you're probably looking for is:
>>>
>>> b=byte('ü') ; assumption 2
>>> print,b[1]+(b[0] eq 195)*64 ; assumption 1
>>>
>>> which is assuming:
>>>
>>> 1) you want byte values from (two byte) UTF-8 to ISO-8859-1
>>>
>>> and
>>>
>>> 2) that the u-umlaut character has entered the interpreter from a UTF-8
>>> environment.
>>>
>>> Please don't just cut and paste the above assuming all will be well.
>>>
>>> Thanks,
>>>
>>> Allan
>>>
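For readers who want the one-liner above spelled out, here is a minimal sketch that applies the same arithmetic to a whole string. The function name utf8_to_latin1 is hypothetical and not part of IDL. It rests on the same two assumptions Allan lists: the input really is UTF-8, and only characters up to U+00FF occur (lead bytes 194 and 195), since nothing else has an ISO-8859-1 equivalent.

```idl
; Hypothetical helper, not an IDL built-in.
; Converts the bytes of a UTF-8 string to ISO-8859-1 bytes.
; Only 1-byte (ASCII) and 2-byte sequences with lead byte 194 or 195
; are handled; every other byte is copied through unchanged.
function utf8_to_latin1, str
  compile_opt idl2
  b = byte(str)
  n = n_elements(b)
  out = bytarr(n)
  i = 0L
  j = 0L
  while i lt n do begin
    if (b[i] eq 194B || b[i] eq 195B) && (i + 1 lt n) then begin
      ; low 2 bits of the lead byte become the top 2 bits of the result,
      ; low 6 bits of the continuation byte become the rest
      out[j] = (b[i + 1] and 63B) or ishft(b[i] and 3B, 6)
      i += 2
    endif else begin
      out[j] = b[i]
      i += 1
    endelse
    j += 1
  endwhile
  return, out[0:j-1]
end
```

For the u-umlaut, whose UTF-8 bytes are 195 and 188, utf8_to_latin1(string([195B, 188B])) gives the single byte 252, which matches b[1]+64 in the one-liner. Allan's warning still applies: if the string was already ISO-8859-1, a 194 or 195 byte is a real character and this conversion would corrupt it.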
>>
>> Hmm, this confuses me even more. Let's see if another example helps me.
>>
>> If I write an output file using the IDE, e.g.
>>
>> openw, 10, 'testfile.txt'
>> printf, 10, 'Jülich'
>> close, 10
>>
>> If I run this program with ISO encoding, isn't the result different from
>> the UTF-8 one?
>>
>
> Yes, copying and pasting that code into an IDL interpreter using a UTF-8
> environment/editor will give a different output file to using one
> without such awareness.
>
>> Or how can I write it ISO-encoded, independent of the user's settings?
>
> I would have said check to see if n_elements(byte("Jülich")) was the
> same as strlen("Jülich") to see if things were UTF-8 or not but it seems
> the IDL strlen function actually just counts bytes (I don't think it
> should do this).
>
> I'm not sure there is an elegant solution to this problem. In any case,
> I'm about to lose my free wi-fi.
>
> Thanks,
>
> Allan
>
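One way to sidestep the environment question, offered only as a sketch and not as the elegant solution Allan was hoping for: keep the source pure ASCII and write the non-ASCII character by its byte value. The procedure name write_latin1_example is hypothetical. It writes raw bytes with WRITEU, so the file holds ISO-8859-1 no matter how the editor or terminal is configured.

```idl
; Hypothetical example procedure, not an IDL built-in.
; Writes "Julich" with u-umlaut as ISO-8859-1, independent of the
; encoding used by the editor or terminal, because the source itself
; contains no non-ASCII characters.
pro write_latin1_example, filename
  compile_opt idl2
  ; J, u-umlaut (252 in ISO-8859-1), l, i, c, h
  text = [74B, 252B, 108B, 105B, 99B, 104B]
  openw, lun, filename, /get_lun
  writeu, lun, text, 10B   ; raw bytes followed by a line feed
  free_lun, lun
end
```

Calling write_latin1_example, 'testfile.txt' should produce a 7-byte file. This only covers text that is known in advance. Strings arriving from user input would still need their encoding determined first, which is the part without a clean answer.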
>> In python I have several methods for that.
>> http://effbot.org/zone/unicode-objects.htm
>>
>> cheers
>> Reimar


