[SDL] The keyboard in SDL

Kuon - Nicolas Goy - 時期精霊 (Goyman.com SA) kuon at goyman.com
Fri Feb 2 01:10:57 PST 2007


On Feb 1, 2007, at 10:05 PM, Christer Sandberg wrote:

>

You should not confuse Unicode and UTF-8.

Unicode is NOT an encoding, it's a standard CHARACTER SET.

The encodings are UTF-8, UTF-16 (little and big endian) and UTF-32
(little and big endian as well).

In SDL, the unicode value is UCS4, I think (not 100% sure about SDL).
It's quite strange, because UCS4 is 32 bit and SDL returns a 16 bit
value; my guess is that it is UCS4 and the value is simply dropped if
it's bigger than 65535. Sam, Ryan?

Let me explain a bit:

UTF-32 is the UCS4 value encoded on 32 bit: simply a number, stored
in little or big endian byte order.

For example (I took a rare kanji because it has a huge value:) )

0x2F9F4 is the UCS4 value, and can be encoded as is on 32 bit.
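To make the "simply a number" point concrete, here is a small sketch that writes a UCS4 value out as four bytes in both byte orders. The function names utf32be_bytes and utf32le_bytes are made up for this example; they are not part of SDL or iconv.

```c
#include <assert.h>
#include <stdint.h>

/* Write a code point as 4 bytes, most significant byte first (UTF-32BE). */
void utf32be_bytes(uint32_t cp, uint8_t out[4])
{
    assert(cp <= 0x10FFFF);      /* highest valid Unicode code point */
    out[0] = (uint8_t)(cp >> 24);
    out[1] = (uint8_t)(cp >> 16);
    out[2] = (uint8_t)(cp >> 8);
    out[3] = (uint8_t)cp;
}

/* Same value, least significant byte first (UTF-32LE). */
void utf32le_bytes(uint32_t cp, uint8_t out[4])
{
    assert(cp <= 0x10FFFF);
    out[0] = (uint8_t)cp;
    out[1] = (uint8_t)(cp >> 8);
    out[2] = (uint8_t)(cp >> 16);
    out[3] = (uint8_t)(cp >> 24);
}
```

For 0x2F9F4 the big endian form is 00 02 F9 F4 and the little endian form is F4 F9 02 00.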

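Now, how can I encode this in a single 16 or 8 bit unit? It's impossible!
The answer is to use several units. In UTF-16 these are called
surrogates: this barbaric word means a "prefix" unit that tells the
parser that our char is encoded on two units. UTF-8 does the same kind
of thing with a lead byte followed by continuation bytes. (A unit is 16
or 8 bit depending on the encoding; a char can take up to 4 units in
UTF-8.)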

I will not enter into the details, but for example, the above char is:

0xD87E 0xDDF4 in UTF-16 and
0xF0 0xAF 0xA7 0xB4 in UTF-8.

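In short (if you want to understand fully, read the doc), in the UTF-16
case, 0xD87E means "this char is on two units" and also carries the
upper bits of the value; the second unit, 0xDDF4, carries the lower
bits. Same logic for UTF-8, where the first byte says how many bytes
follow.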
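For the curious, the UTF-16 arithmetic can be sketched as below. utf16_encode is a hypothetical helper written for this message, not an SDL or iconv function: subtract 0x10000, then split the remaining 20 bits into two halves of 10 bits each.

```c
#include <assert.h>
#include <stdint.h>

/* Encode one code point as UTF-16.
   Returns the number of 16 bit units written (1 or 2). */
int utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(cp <= 0x10FFFF);
    assert(cp < 0xD800 || cp > 0xDFFF);   /* surrogate range is reserved */

    if (cp < 0x10000) {                   /* fits in a single unit */
        out[0] = (uint16_t)cp;
        return 1;
    }

    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate: bottom 10 bits */
    return 2;
}
```

Calling it with 0x2F9F4 gives the two units 0xD87E 0xDDF4 quoted above.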

So, you can store any UCS4 data in any UTF encoding.
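The UTF-8 side works the same way, only with 8 bit units. A minimal sketch, assuming the input is a valid code point; utf8_encode is a name invented for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Encode one code point as UTF-8.
   Returns the number of bytes written (1 to 4). */
int utf8_encode(uint32_t cp, uint8_t out[4])
{
    assert(cp <= 0x10FFFF);
    assert(cp < 0xD800 || cp > 0xDFFF);   /* surrogates are not valid here */

    if (cp < 0x80) {                      /* 7 bits: plain ASCII */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 11 bits: 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 16 bits: 1110xxxx + 2 continuation bytes */
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    /* 21 bits: 11110xxx + 3 continuation bytes */
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}
```

The lead byte tells the parser how many bytes the char takes, and every following byte starts with the bits 10. For 0x2F9F4 this produces 0xF0 0xAF 0xA7 0xB4.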



>
> Is there a function in SDL that performs a translation needed for this, or
> does anyone know about some lib providing it, or some location where I can
> find a translation table.



Now about this:

your answer is "man 3 iconv".

or:

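#include <iconv.h>

iconv_t converter;
/* Target encoding first, then source. Use "UTF-16" as the source for SDL
   (again, not sure about SDL behaviour); you can also append BE or LE to
   the encoding name for endianness. */
converter = iconv_open("UTF-8", "UTF-32");

char * myInputBuff = ...;
char * myUtf8Buff = ...;
size_t inputBytesLeft = lengthInByteOfMyInputBuff;
size_t utf8BytesLeft = lengthInByteOfMyUtf8Buff;

/* iconv takes pointers to the buffers and to the remaining sizes,
   and advances them as it converts. */
iconv(converter, &myInputBuff, &inputBytesLeft, &myUtf8Buff, &utf8BytesLeft);

iconv_close(converter);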

Should do the job.
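If you want something closer to compilable, here is a sketch wrapped in a function with basic error checks. It assumes the input is UTF-16 in little endian byte order (whether that matches what SDL gives you is an assumption on my side), and utf16le_to_utf8 is just a name picked for this example.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Convert a UTF-16LE byte buffer to UTF-8.
   Returns the number of bytes written to dst, or (size_t)-1 on failure. */
size_t utf16le_to_utf8(const char *src, size_t srcLen, char *dst, size_t dstLen)
{
    assert(src != NULL && dst != NULL);

    /* iconv_open takes the target encoding first, then the source. */
    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");
    if (cd == (iconv_t)-1)
        return (size_t)-1;          /* conversion not supported */

    char *in = (char *)src;         /* iconv's prototype wants char **, it does not modify the data */
    char *out = dst;
    size_t inLeft = srcLen;
    size_t outLeft = dstLen;

    /* iconv advances the pointers and decrements the byte counts. */
    size_t rc = iconv(cd, &in, &inLeft, &out, &outLeft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;          /* invalid input or dst too small, see errno */
    return dstLen - outLeft;
}
```

On systems where iconv is not part of the C library you may need to link with -liconv. Note that the output is not NUL terminated; the caller has to add the terminator if the result is to be used as a C string.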


Recommended reading:
- http://unicode.org/unicode/faq/
- http://unicode.org/Public/BETA/CVTUTF-1-4/readme.txt with code  
here: http://unicode.org/Public/BETA/CVTUTF-1-4/ConvertUTF.c (and  
the .h, just list the directory)
- http://www.joconner.com/javai18n/articles/UTF8.html
- http://en.wikipedia.org/wiki/UTF-8

Best of luck

-- 
Kuon
Programmer and sysadmin.

"Computers should not stop working when the users' brain does."






