Re: Is Chinese C++ SourceCode restricted to ASCII?
On Sep 5, 6:35 pm, Sam <s...@email-scan.com> wrote:
Peter Olcott writes:
If not can you provide a link to Chinese C++ SourceCode?
"Chinese C++ SourceCode" is a meaningless statement. I'm sure
there are many C++ applications that were written for use on
systems running in one of the Chinese locales. However there
will be nothing special about these applications' source code.
Maybe. Input encoding is implementation defined. It's quite
possible that different implementations support different
encodings, and even that the supported encodings depend on the
locale. Logically, one would like for UTF-8 to be the
"standard" encoding, but for the moment, that's wishful
thinking.
There's only one C++ language.
Which is only implemented by one compiler. Most of us have to
deal with compilers which are missing one or more features.
All keywords, classes, and variables, in the C++ language use
the Ascii character set.
Not at all. All of the keywords consist of lower case letters
from the basic character set, or underscore. For user defined
symbols, any character classified as alphanumeric in Unicode is
permissible. With two big hitches, however: first, an
implementation is not required to support characters outside the
basic character set (which can be defined in any encoding, e.g.
EBCDIC) in the source file, so the only officially portable way
to use anything outside the basic character set anywhere
(including in a string constant, or even in a comment) is by
means of a universal character name, which is very painful. And
secondly, this is one of those features which has been pretty
much ignored by most compilers---VC++ does support it, at least
partially (i.e. I've only tested accented characters from
French), and I suspect that Comeau supports it more or less
completely, but g++ is seriously broken in this respect.
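To make the universal character name point concrete, here is a minimal sketch. It assumes a C++11 or later compiler (for the U"" literal and std::u32string), and cafe_utf32 is just a name made up for the illustration. The source text stays entirely within the basic character set, yet the string holds U+00E9. The same \uXXXX spelling is what the standard allows in identifiers, but since compiler support for that varies, the sketch sticks to a string literal.

```cpp
#include <cassert>
#include <string>

// "cafe" with an acute accent on the final letter, spelled with a universal
// character name so that the source file needs nothing outside the basic
// character set. (cafe_utf32 is a hypothetical helper name.)
std::u32string cafe_utf32() {
    return U"caf\u00e9";
}
```

Calling cafe_utf32() should give a four-element string whose last element is the code point 0xE9, whatever encoding the compiler assumes for the source file itself.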
In C++, wide character strings may contain characters outside
the Ascii character set.
C++ doesn't know anything about ASCII (or any other
encoding)---that's an implementation issue.
The encoding of wide character strings is implementation
defined, but is usually UTF-16.
The "usually UTF-16" is also misinformation. It's true for
Windows (and maybe AIX), but not for any other environment I'm
aware of. (Solaris uses a much older, Unix specific encoding,
and Linux uses UTF-32.)
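One way to see the difference on a given platform is to count how many code units a character outside the Basic Multilingual Plane takes. This is only a sketch: it assumes a C++11 or later compiler, the function names are invented for the example, and it assumes the implementation encodes wide literals as UTF-16 or UTF-32, which need not hold everywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// U+1D11E (musical symbol G clef) lies outside the Basic Multilingual Plane.
// Number of wchar_t units the implementation uses for it: typically 2 where
// wchar_t is 16 bits (UTF-16 surrogate pair), 1 where it is 32 bits.
std::size_t wide_units_for_g_clef() {
    return std::wstring(L"\U0001D11E").size();
}

// The same character in the C++11 literal types whose encoding is fixed.
std::size_t utf16_units_for_g_clef() {
    return std::u16string(u"\U0001D11E").size();  // surrogate pair
}

std::size_t utf32_units_for_g_clef() {
    return std::u32string(U"\U0001D11E").size();  // single code unit
}
```

With a 16-bit wchar_t (Windows) wide_units_for_g_clef() would be expected to return 2, and with a 32-bit wchar_t (Linux) 1, while the u"" and U"" results do not depend on the platform.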
--
James Kanze