Re: Why no tstring, tcerr, tostringstream, etc
On Nov 28, 12:05 am, Paavo Helde <pa...@nospam.please.ee> wrote:
"Thomas J. Gritzan" <phygon_antis...@gmx.de> wrote:
The easiest solution is to use Unicode (wchar_t) or
non-Unicode (char) exclusively. If a program should support
Unicode, use wchar_t, otherwise use char.
Or use char and UTF-8 exclusively, to support Unicode. UTF-8
seems to be the preferred encoding in network and XML world,
as well as on Linux desktops. The drawback is that on Windows,
you might have to translate to UTF-16 (that's what Windows is
using) and back often.
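
To give an idea of what that translation step involves, here is a minimal sketch of a UTF-8 to UTF-16 conversion. The function name utf8_to_utf16 is made up for illustration, and the code assumes C++11 (for std::u16string). On Windows one would normally call MultiByteToWideChar with CP_UTF8 instead; this portable version only shows the mechanics. It rejects truncated sequences and bad continuation bytes, but it does not reject overlong forms or encoded surrogates, so it is not a full validator.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 and re-encode as UTF-16.
// Validation is deliberately minimal (see the note above).
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;        // code point being assembled
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i <= extra)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");

        if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary planes: a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Calling utf8_to_utf16("\xC3\xA9") yields a single code unit, U+00E9, while a four-byte sequence such as "\xF0\x9F\x98\x80" yields two code units (a surrogate pair).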
Unicode is the universally preferred external encoding: files,
network, etc. (although XML requires support for UTF-16 as
well). Within the program, there are legitimate arguments for
all three: UTF-8, UTF-16 and UTF-32.
Using wchar_t does not guarantee Unicode automatically, as it
might be mapped to UTF-16 on Windows and to UTF-32 on other
platforms, which are totally different things and need
different approaches.
UTF-16 and UTF-32 are both Unicode, and for some (many?, most?)
applications, can be handled exactly the same. UTF-8 is also
Unicode, but depending on the application, may introduce
additional complexities. If you're only doing very simple
things, just copying the strings and comparing for equality, for
example, UTF-8 is no more work than the others. Interestingly
enough, if you're doing very complex things, like typography,
UTF-8 is also not significantly more difficult. Between the two
extremes, however, there are cases where UTF-16 or UTF-32 can
make life easier.
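
As a rough illustration of the "in between" case: comparing two UTF-8 strings for equality is just a byte comparison, but something as modest as counting code points already means looking at the encoding. A minimal sketch, assuming C++11 and valid UTF-8 input (the function name utf8_code_points is made up for the example):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by
// counting every byte that is not a continuation byte (10xxxxxx).
// Assumes the input is valid UTF-8; no validation is done.
std::size_t utf8_code_points(const std::string& s)
{
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For "caf\xC3\xA9" (the word "café"), size() reports 5 bytes while utf8_code_points reports 4. With UTF-32 the two numbers coincide, and with UTF-16 they coincide as long as no surrogate pairs are involved.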
Of course, on a lot of systems, wchar_t isn't Unicode. If you
want to be sure of any one particular encoding, you'll have to
use a typedef, and write a lot of code yourself. I've not
looked at it lately, but at least in the past, the best library
around for Unicode was ICU, see http://icu-project.org/; this
uses UTF-16.
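
For what the typedef part might look like: the sketch below assumes a C++11 compiler, where char32_t and std::u32string exist, and picks UTF-32 purely as an example. The names code_unit, unicode_string and wchar_bits are hypothetical. The typedef only fixes the size of the code unit; the conversion, I/O and classification code still has to be written (or taken from a library like ICU).

```cpp
#include <climits>
#include <string>

// Hypothetical project-wide names: a code unit wide enough for any
// Unicode code point, independent of what wchar_t happens to be.
typedef char32_t code_unit;
typedef std::u32string unicode_string;

static_assert(sizeof(code_unit) * CHAR_BIT >= 32,
              "code_unit must hold a full UTF-32 code unit");

// Reports the width of wchar_t in bits on the current platform.
// The value is implementation-defined, which is the point.
constexpr unsigned wchar_bits()
{
    return static_cast<unsigned>(sizeof(wchar_t) * CHAR_BIT);
}
```

A unicode_string built from U"\u00E9" has size() 1 on every platform, whereas what a std::wstring holds depends on the width and encoding the implementation chose for wchar_t.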
If you are coding for Windows only, then the wise thing (which
does not mean I'm actually recommending it!) would be to use
their TCHAR and friends; they will most probably remain
backward-compatible to some extent.
The problem is that they promote a lie; they give the impression
that you can easily switch to and from Unicode, just by changing
a typedef.
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34