Re: portable Unicode programming.
Eugene Gershnik wrote:
> On Jan 26, 3:07 pm, Pete Becker <p...@versatilecoding.com> wrote:
>> Eugene Gershnik wrote:
>>> First of all, on all C and C++ compilers on Windows, wchar_t is used to
>>> store UTF-16. Technically it doesn't have to be so, but any compiler
>>> vendor who did otherwise would be insane.
>> On the contrary: the compiler has no control over what I store in
>> objects of type wchar_t. wchar_t stores whatever I put into it, using
>> the mbtowc family of functions and a locale that defines the character
>> encodings to be used.
> First, 'compiler' above stands for 'compiler + library', which I assumed
> to be obvious from the part you snipped.
Yes, it was obvious, and that's how I used it, too. If you prefer, I'll
rephrase what I said: The compiler AND LIBRARY have no control over what
I store in objects of type wchar_t. wchar_t stores whatever I put into
it, using the mbtowc family of functions and a locale that defines the
character encodings to be used.
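A minimal sketch of that point, assuming a hosted C++ implementation: the helper name widen_with_current_locale is hypothetical (it appears nowhere in this thread), and it simply stores in wchar_t whatever values std::mbtowc produces under the locale the caller has selected.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper, for illustration only: decode a multibyte string into
// wchar_t values using whatever encoding the current C locale defines.
// Nothing here fixes the meaning of those values to UTF-16.
std::wstring widen_with_current_locale(const char* s) {
    assert(s != nullptr);
    std::wstring out;
    std::mbtowc(nullptr, nullptr, 0);  // reset any shift state
    std::size_t left = std::strlen(s);
    while (left > 0) {
        wchar_t wc;
        int n = std::mbtowc(&wc, s, left);
        if (n <= 0) break;  // invalid sequence for this locale: stop converting
        out.push_back(wc);
        s += n;
        left -= static_cast<std::size_t>(n);
    }
    return out;
}
```

The caller picks the encoding by picking the locale, for example with std::setlocale(LC_ALL, "") before the call. What ends up in each wchar_t then depends on that locale and on the library's implementation of it.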
> Second, whatever mbtowc and
> locales do, the results had better be compatible with the OS
> interfaces. I have yet to see any non-trivial application for which it
> would make sense to convert from custom wchar_t to WCHAR on every OS
> call.
You certainly need to be compatible with the OS interfaces when you make
OS calls. Most programs also deal with their own data, and when they do
that they use an encoding that's appropriate for the source and
destination of that data. UTF-16 may or may not be appropriate.
Regardless, neither the compiler nor the library forces any programmer
to store UTF-16 data in wchar_t.
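As one illustration of keeping data in its own encoding and converting only where an interface demands UTF-16, here is a sketch. It assumes the program holds its data as UTF-8 in std::string, which is just one possible choice. The function name utf8_to_utf16 is hypothetical, and the code assumes well-formed input and does no validation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical boundary helper, for illustration only: convert UTF-8 held in
// a std::string to UTF-16 code units. Assumes well-formed UTF-8.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte determines the sequence length and the first payload bits.
        if (b < 0x80)      { cp = b;        len = 1; }
        else if (b < 0xE0) { cp = b & 0x1F; len = 2; }
        else if (b < 0xF0) { cp = b & 0x0F; len = 3; }
        else               { cp = b & 0x07; len = 4; }
        assert(i + len <= in.size());  // sketch only: truncated input is a caller bug
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        i += len;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Whether a conversion like this belongs at every OS call or only at a few entry points depends on the program; the sketch only shows that the internal representation and the interface representation can differ.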
--
-- Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com)
Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." (www.petebecker.com/tr1book)