Re: Unicode to characters
KK <pedagani@gmail.com> wrote:
There could be flavors of this question discussed in the past, but I
could not really make a head/tail out of it.
I have a bunch of Unicode values stored in a string array and I want to
see the corresponding characters displayed in an Excel file. How could
I go about doing that?
vector<string> unicodevalues; // has values 0041, 0042, ... 0410 etc.
(hexadecimal values)
For 0041 (assuming hex) I should see the letter 'A', a 'B' for 0042, ... and the
special character corresponding to 0x410.
I could live with a comma-separated .csv file instead of a .xls to
view it in Excel.
Please advise.
It's a pretty complex topic. There are about a half-dozen different ways
to represent Unicode characters in a file: UTF-8, UTF-16 (in both LE and
BE versions), and others. See the Wikipedia article on Unicode for the
full list.
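To make the difference concrete, here is a rough sketch of how a single
code point comes out in three of those encodings. The function names are
my own invention for illustration, and the sketch only covers code points
up to U+FFFF, so treat it as a demonstration rather than a complete encoder.

```cpp
#include <string>

// Hypothetical helpers for illustration only.
// All three assume cp <= 0xFFFF and that cp is not a surrogate value.

// UTF-16 little endian: low byte first, then high byte.
std::string toUtf16LE(unsigned int cp) {
    std::string out;
    out += static_cast<char>(cp & 0xFF);
    out += static_cast<char>((cp >> 8) & 0xFF);
    return out;
}

// UTF-16 big endian: high byte first, then low byte.
std::string toUtf16BE(unsigned int cp) {
    std::string out;
    out += static_cast<char>((cp >> 8) & 0xFF);
    out += static_cast<char>(cp & 0xFF);
    return out;
}

// UTF-8: one, two or three bytes depending on the size of the value.
std::string toUtf8(unsigned int cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For U+0410 this gives the bytes D0 90 in UTF-8, 10 04 in UTF-16LE and
04 10 in UTF-16BE. Same character, three different files.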
For what you want though, I think the best bet would be UTF-16LE with a
Byte Order Mark at the beginning.
Based on the description above, I'm assuming you have a vector of
strings where each string is the U+ value of a particular character. If
so, then you simply have to make a function that converts a string into
its UTF-16 byte equivalent. How you do that depends very much on what
kind of environment you are working in. Is it natively big endian or
little endian, for example?
Fundamentally, you have to convert your strings into an array of chars
that you can then send to a file stream.
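// Parses a hex string such as "0410" and stores it as two UTF-16LE bytes.
// Needs <cstdlib> for strtoul. Only handles values up to FFFF.
void convert( const string& s, char* c )
{
    unsigned long cp = strtoul( s.c_str(), 0, 16 );
    c[0] = static_cast<char>( cp & 0xFF );          // the low byte goes first
    c[1] = static_cast<char>( ( cp >> 8 ) & 0xFF ); // the high byte goes second
}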
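// myFile should be an ofstream opened with ios::binary
myFile << '\xFF' << '\xFE'; // Byte Order Mark for UTF-16LE
char c[2];
for ( vector<string>::iterator it = myVec.begin();
      it != myVec.end();
      ++it )
{
    convert( *it, c );
    myFile << c[0];
    myFile << c[1];
}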
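If it helps, here is a slightly fuller sketch of the same idea that builds
the whole file content in memory, Byte Order Mark included. The names
buildUtf16LE and appendUnitLE are made up for this example. It assumes
every string is a valid hex number, and it also covers values above FFFF,
which UTF-16 stores as a pair of 16-bit units (a surrogate pair).

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Appends one 16-bit code unit in little-endian order (low byte first).
static void appendUnitLE(std::string& out, unsigned long unit) {
    out += static_cast<char>(unit & 0xFF);
    out += static_cast<char>((unit >> 8) & 0xFF);
}

// Hypothetical helper: turns hex strings like "0041" into UTF-16LE bytes,
// starting with a Byte Order Mark. Assumes each string is valid hex.
std::string buildUtf16LE(const std::vector<std::string>& hexValues) {
    std::string out;
    appendUnitLE(out, 0xFEFF); // BOM, written as the bytes FF FE
    for (std::size_t i = 0; i < hexValues.size(); ++i) {
        unsigned long cp = std::strtoul(hexValues[i].c_str(), 0, 16);
        if (cp <= 0xFFFF) {
            appendUnitLE(out, cp);
        } else if (cp <= 0x10FFFF) {
            // Split into a high and a low surrogate.
            cp -= 0x10000;
            appendUnitLE(out, 0xD800 + (cp >> 10));
            appendUnitLE(out, 0xDC00 + (cp & 0x3FF));
        }
        // Anything above 10FFFF is not a Unicode code point and is skipped.
    }
    return out;
}
```

To get a file out of it, open a std::ofstream with std::ios::binary and
call write() with the returned string's data() and size(). Separators are
just more code points, so a comma is "002C" and a line feed is "000A".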
I don't have time right now to go more into it, but if you respond, I
will add to the above.