Re: Need help with printing Unicode! (C++ on CentOS)
On Aug 28, 7:58 pm, Zerex71 <mfeher1...@gmail.com> wrote:
On Aug 28, 1:08 pm, Paavo Helde <pa...@nospam.please.ee> wrote:
Zerex71 <mfeher1...@gmail.com> wrote:
I'm sure this has been addressed before but I've hunted
all over the web and no one seems to provide a
comprehensive answer. I just want to do one thing: Under
CentOS, in a simple C++ program, I'd like to be able to
print Unicode characters to a console output. For
example, I'd like to print the musical flat, natural, and
sharp signs.
Here's what I've done so far:
1. Using Eclipse, created a small C++ console project.
2. Declared three variables of type wchar_t and assigned them their
Unicode values (0x266d, 0x266e, 0x266f).
3. Attempted to print them out using wprintf().
4. Set my output console to a font which can represent the characters
(glyphs?) - Lucida Console
I am not sure about CentOS, but in Linux generally UTF-8
is used. One should have a UTF-8 locale (e.g.
LANG=en_US.utf8). If your code internally uses wchar_t, then
it should be converted to UTF-8 before output. I am not sure
if wprintf() or wcout can do that automatically. In our
software we use UTF-8 and std::string internally, and it is
working perfectly in Linux.
Here's my locale setting:
(mfeher) mfeher-l4 [~] > locale
LANG=en_US.UTF-8
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C
I was under the impression that I had more of an "environment
setup" issue than a coding issue, i.e. I was unaware that I
had to do anything more to the code than change from
cout/printf to wprintf. Also, from a brief, brief reading of
all this material on the Internet, I don't want UTF-8 because
that's too small to hold the character codes I wish to print.
UTF-8, UTF-16 and UTF-32 are "transformation formats",
specifying how to "present" any Unicode (UCS-4) character as a
sequence of 8 bit bytes, 16 bit words, or 32 bit words. Since
all of the data interfaces under Unix are 8 bits, UTF-8 is the
transformation format you need.
Here's the code I am trying:
#include <clocale>   // setlocale
#include <cstdio>    // stdout
#include <cwchar>    // fwide, wprintf
#include <iostream>
using namespace std;
int main() {
    // cout << "Testing Unicode" << endl; // prints Testing Unicode
    // If you try to mix Unicode printing with non-Unicode printing, the
    // switch causes you to lose output!
    setlocale(LC_ALL, ""); // Does nothing
    // Let's check our orientation...it never fails
    if (fwide(stdout, 1) < 0)
    {
        cerr << "ERROR: Output not set to wide. Exiting..." << endl;
        return -1;
    }
    // Declare a Unicode character and try to print it out
    wchar_t mychar = 0x266d; // The music flat sign
    wprintf(L"Here's mychar: %lc\n", mychar);
    return 0;
}
That should work, unless the font doesn't have a rendering for
0x266D (the ones I have installed under Linux don't). This is
easily checked: try some more "usual" Unicode character, e.g.
0x00E9 (an é). If that displays, then the problem is almost
certainly that the font doesn't contain a rendering for the
character you want. In which case, there's no way you'll be
able to display it (other than by finding some font which does
support it, installing it and using it).
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34