[SDL] SDL_iconv UCS-4 endianness

list at akfoerster.de
Tue Oct 9 08:14:54 PDT 2007


On Monday, 08 Oct 2007, Christian Walther wrote:

> > If you want the "native" encoding for GNU libc, 
> > use the encoding name "WCHAR_T".
> > (I still think it would be a good idea to define "WCHAR_T" also in SDL)
> 
> Is that guaranteed to be unicode? I seem to remember something about 
> "locale-dependent encoding" together with wchar_t, but I may be wrong.

Okay, I had to look it up again.
In general it is not guaranteed that wchar_t holds Unicode. But for 
recent glibc versions it is.
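
To make the "WCHAR_T" suggestion concrete, here is a minimal sketch that 
converts UTF-8 to wchar_t with the system iconv() (not SDL_iconv). It 
assumes an iconv implementation that knows the encoding name "WCHAR_T", 
as GNU libc does. The helper name utf8_to_wchar is made up for this 
example.

```c
#include <iconv.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of SDL or glibc.
 * Converts in_len bytes of UTF-8 into at most out_cap wide characters.
 * Returns the number of wide characters written, or (size_t)-1 on error.
 * Assumes the system iconv accepts the encoding name "WCHAR_T".
 */
static size_t utf8_to_wchar(const char *in, size_t in_len,
                            wchar_t *out, size_t out_cap)
{
    iconv_t cd = iconv_open("WCHAR_T", "UTF-8");
    char *src = (char *)in;   /* iconv() takes char **, input is not modified */
    char *dst = (char *)out;
    size_t dst_left = out_cap * sizeof(wchar_t);
    size_t rc;

    if (cd == (iconv_t)-1)
        return (size_t)-1;    /* this conversion pair is not supported here */

    rc = iconv(cd, &src, &in_len, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;    /* invalid input or output buffer too small */

    /* Bytes consumed in the output buffer, expressed in wide characters. */
    return out_cap - dst_left / sizeof(wchar_t);
}
```

With this, the caller never has to name a byte order: whatever wchar_t 
looks like on the host is what comes out. Whether the values are Unicode 
code points is a separate question.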

From the glibc documentation:
| But for GNU systems wchar_t is always 32 bits wide and, therefore, 
| capable of representing all UCS-4 values and, therefore, covering all 
| of ISO 10646. Some Unix systems define wchar_t as a 16-bit type and 
| thereby follow Unicode very strictly. This definition is perfectly 
| fine with the standard, but it also means that to represent all 
| characters from Unicode and ISO 10646 one has to use UTF-16 surrogate 
| characters, which is in fact a multi-wide-character encoding. But 
| resorting to multi-wide-character encoding contradicts the purpose of 
| the wchar_t type.
[...]
| We have said above that the natural choice is using Unicode or ISO 
| 10646. This is not required, but at least encouraged, by the ISO C 
| standard. The standard defines at least a macro __STDC_ISO_10646__ 
| that is only defined on systems where the wchar_t type encodes ISO 10646 
| characters. If this symbol is not defined one should avoid making 
| assumptions about the wide character representation. If the programmer 
| uses only the functions provided by the C library to handle wide 
| character strings there should be no compatibility problems with other 
| systems. 
http://www.gnu.org/software/libc/manual/html_node/Extended-Char-Intro.html#Extended-Char-Intro
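
The check described in the manual can be sketched in C roughly as 
follows. The function names are made up for this example, and the 
returned encoding names ("UTF-32LE", "UTF-32BE") are an assumption that 
should be checked against whatever converter is used. A 16-bit wchar_t 
is deliberately left unhandled here.

```c
#include <stddef.h>
#include <wchar.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const unsigned int probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/*
 * Hypothetical helper, not part of SDL or glibc.
 * Returns an explicit-endian encoding name matching the in-memory layout
 * of wchar_t, or NULL if no assumption about wchar_t is safe.
 */
static const char *wchar_encoding_name(void)
{
#ifdef __STDC_ISO_10646__
    /* The C library promises that wchar_t values are ISO 10646 code points. */
    if (sizeof(wchar_t) == 4)
        return host_is_little_endian() ? "UTF-32LE" : "UTF-32BE";
#endif
    /* Macro not defined, or wchar_t is not 32 bits wide: make no guess. */
    return NULL;
}
```

A caller would pass the returned name to its converter and fall back to 
the C library's own wide-character functions when it gets NULL.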

> Also, isn't wchar_t 16 bits on Windows (UTF-16 recently, UCS-2 earlier)? 

As far as I know, yes.

-- 
AKFoerster

