Re: Character set
On Jun 22, 10:13 pm, Ron <ron.nata...@gmail.com> wrote:
On Jun 22, 3:36 pm, Amit Kumar <amitkumar.i...@gmail.com> wrote:
Stroustrup says: "A variable of type 'char' can hold a
character of the implementation's character set."
I have numerous doubts related to character sets.
Your doubts are well founded. The standard refers to the
implementation's basic character set, which will depend on the
compiler and the operating system.
Yes and no. It says that the basic execution character set will
consist of exactly 100 characters, and it lists them. It
furthermore guarantees that they all have one byte
representations, and that they will be encoded with a
non-negative number when stored in a char (which means that if
char is an 8 bit signed type, they will be in the range
0...127). It doesn't
say anything about the actual encoding, however (except that all
of the 100 characters must be distinct, and that '\0' must be
encoded 0); implementations have used EBCDIC, for example.
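Something like the following could be used to check those guarantees on a given implementation. It is only a sketch, assuming C++11 or later; the list of characters is the one given in C++98 and C++03 (later standards may differ in detail), and the names kBasic, all_non_negative and all_distinct are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// The basic execution character set as listed in C++98/C++03:
// the 96 basic source characters, plus alert, backspace and
// carriage return, plus the null character (supplied here by
// the terminator of the string literal).
const char kBasic[] =
    " \t\v\f\n"
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'"
    "\a\b\r";

// 99 explicit characters plus the terminating '\0'.
static_assert(sizeof(kBasic) == 100, "expected exactly 100 characters");

// True if every member is non-negative when stored in a plain char.
bool all_non_negative() {
    for (std::size_t i = 0; i < sizeof(kBasic); ++i) {
        if (kBasic[i] < 0) return false;
    }
    return true;
}

// True if all 100 members have distinct encodings.
bool all_distinct() {
    std::set<char> seen(kBasic, kBasic + sizeof(kBasic));
    return seen.size() == sizeof(kBasic);
}
```

Both functions should return true whether the implementation uses ASCII or EBCDIC; the actual values differ, but the guarantees are the same.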
C and C++ are schizoid in their idea of what the meaning of
"char" is, unfortunately. It is both the smallest
addressable unit of storage as well as the container for a
character from the basic set. This means practically (and
this applies to *ALL* versions of any Windows compiler I've
seen) the basic char set is still 8 bits no matter how much
you'd like it to be larger.
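The "smallest addressable unit" side of char can be seen in a few lines. This is a sketch assuming C++11 or later; bits_in is a hypothetical helper, not anything from the standard library.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// sizeof counts in units of char, so sizeof(char) is 1 by definition,
// however many bits a char actually has.
static_assert(sizeof(char) == 1, "char is the unit of sizeof");

// The standard requires at least 8 bits per char; 8 is merely the
// usual value, not a guarantee.
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");

// Width in bits of the storage occupied by an object of type T.
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}
```

On the usual Windows or Unix compilers, bits_in<char>() gives 8, which is the limit being discussed here.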
But why would you want it to be larger? All encodings I know
have an 8 bit (or less) encoding form, which can be used. And
when you need fixed length representations, that's what wchar_t
was conceived for (although for historical reasons, it also uses
multi-element encodings under Windows and AIX).
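A small illustration of the difference, as a sketch only. It assumes UTF-8 as the 8 bit encoding form (written out byte by byte so that the source encoding doesn't matter) and an implementation that encodes wide literals as UTF-16 or UTF-32; the names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// U+00E9 (e with acute accent) in UTF-8: one character, two char
// elements, each of which fits in 8 bits.
const char kEacuteUtf8[] = "\xC3\xA9";

// U+1F600 lies outside the Basic Multilingual Plane. With a 32 bit
// wchar_t it is a single element; with a 16 bit wchar_t (Windows,
// for example) it typically becomes a surrogate pair.
const wchar_t kOutsideBmp[] = L"\U0001F600";

// Number of char elements used by the UTF-8 string.
std::size_t utf8_units() { return std::strlen(kEacuteUtf8); }

// Number of wchar_t elements used by the wide string.
std::size_t wide_units() { return std::wcslen(kOutsideBmp); }
```

utf8_units() is 2 everywhere, while wide_units() depends on sizeof(wchar_t), which is the historical wart mentioned above.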
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34