comp.lang.idl-pvwave archive: archive

unicode conversion [message #78365]

Mon, 21 November 2011 07:28

greg.addr

Here's a question. I have some strings with non-ASCII characters, e.g.

IDL> a="Кукушка"

If I convert these to bytes, I see they are multibyte encodings:

IDL> c=byte(a)
IDL> print,(c)
208 154 209 131 208 186 209 131 209 136 208 186 208 176

and I can happily convert those numbers back to the original chars...

IDL> print,string(c)
Кукушка

OK, now I have the same string, encoded (I believe) in UTF-8, from an HTML page:

IDL> b="&#x41A;&#x443;&#x43A;&#x443;&#x448;&#x43A;&#x430;"

With some string splitting, I can convert these to bytes...

IDL> print,ch
4 26 4 67 4 58 4 67 4 72 4 58 4 48

which unfortunately are not the same as those I had before. There's an almost simple relation, but I can't quite figure it out:

IDL> print,c-ch
204 128 205 64 204 128 205 64 205 64 204 128 204 128
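
A note on the relation: each pair in ch looks like the high and low byte of a 16-bit Unicode code point (4 26 is 0x041A, the letter К), whereas c holds UTF-8, which spreads the same 11 bits over two bytes in the pattern 110xxxxx 10xxxxxx. That would explain why the differences come out as 204 128 or 205 64 rather than a single constant. Below is a minimal sketch of that arithmetic, written in Python rather than IDL. It assumes the page source held hexadecimal character references like &#x43A; and that every code point falls in the two-byte UTF-8 range; the function names are hypothetical.

```python
import re

def char_refs_to_code_points(html):
    """Extract the code points from hexadecimal references like &#x41A;."""
    return [int(h, 16) for h in re.findall(r"&#x([0-9A-Fa-f]+);", html)]

def code_point_to_utf8(cp):
    """Encode one code point in the two-byte UTF-8 range (0x80 to 0x7FF)."""
    if not 0x80 <= cp <= 0x7FF:
        raise ValueError("this sketch only handles two-byte UTF-8 sequences")
    lead = 0xC0 | (cp >> 6)      # 110xxxxx: top five bits of the code point
    trail = 0x80 | (cp & 0x3F)   # 10xxxxxx: low six bits of the code point
    return [lead, trail]

# Assumed form of the string as it appeared in the HTML source.
b = "&#x41A;&#x443;&#x43A;&#x443;&#x448;&#x43A;&#x430;"
code_points = char_refs_to_code_points(b)

# High and low byte of each code point: the values printed as ch above.
ch = [byte for cp in code_points for byte in (cp >> 8, cp & 0xFF)]

# UTF-8 bytes: the values printed as c above.
c = [byte for cp in code_points for byte in code_point_to_utf8(cp)]

print(ch)
print(c)
print(bytes(c).decode("utf-8"))
```

The same shifts and masks should carry over to IDL with ISHFT and the bitwise AND and OR operators. Code points above 0x7FF need three- or four-byte sequences, which this sketch does not cover.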

I was hoping one of these transparently named things might do the job, but no luck so far:

I18N_MULTIBYTETOUTF8

I18N_MULTIBYTETOWIDECHAR

I18N_UTF8TOMULTIBYTE

I18N_WIDECHARTOMULTIBYTE

Anyone care to enlighten me?

Greg
