Re: Unicode text file read
markww wrote:
I'm trying to just read a Unicode text file on a Windows XP system and
pop up a message box with its contents:
    std::wifstream wifile("C:\\unicode.txt");
    if (wifile.is_open()) {
        wchar_t wszBuffer[MAX_PATH];
        while (wifile.getline(wszBuffer, MAX_PATH)) {
            AfxMessageBox(wszBuffer);
        }
    }
But my message box just has strange characters in it (not the ones in
the Unicode text file). How can we read these Unicode files in?
Firstly, you need to realize that 'Unicode' is _not_ a file format. It is a
character set; what ends up in a file is one of its encodings, such as UCS-2
or UTF-8. In case you have questions, please visit the Unicode homepage,
which explains those encodings.
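
To find out which encoding a given file uses, one common trick is to look
for a byte order mark (BOM) at the start. The following is only a rough
sketch: guess_encoding() is a made-up helper name, not part of any library,
and a file without a BOM simply comes back as "unknown".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: guesses the encoding of a file from its first bytes.
// Only the BOM is inspected. Files without a BOM yield "unknown", and a
// UTF-32LE file (BOM FF FE 00 00) would be misreported as UTF-16LE.
std::string guess_encoding(const unsigned char* data, std::size_t size)
{
    if (size >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (size >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16LE";
    if (size >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16BE";
    return "unknown";
}
```

Read the first few bytes of the file in binary mode and pass them in. A
file saved as "Unicode" by Windows Notepad will typically report UTF-16LE.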
Secondly, the behaviour of wchar_t streams is implementation-defined by
default, meaning that there is no conversion between the external
representation (the bytes in the file) and the internal one (wchar_t) that
you can rely on being performed. Most likely your stream just widens each
byte to a wchar_t, which would explain the strange characters. This can
easily be fixed by imbue()ing the stream with a locale that has the
appropriate codecvt<> facet. Such facets, however, are not included in the
C++ standard (C++98/03), and I don't think any standard library even ships
them, but using Google you will be able to find sources. Otherwise,
Dinkumware, which supplies the standard library for Visual C++, has an
add-on library that includes commonly used codecvt<> facets.
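
As an illustration of the imbue() approach, here is a minimal sketch. It
rests on two assumptions that are not in the original question: the file is
UTF-8, and the compiler provides the <codecvt> header, which later standards
added (C++11) and then deprecated (C++17). The function name
read_utf8_lines() is made up for this example. A UTF-16 file would need a
different facet and should be opened in binary mode.

```cpp
#include <codecvt>   // std::codecvt_utf8 (C++11, deprecated since C++17)
#include <fstream>
#include <locale>
#include <string>
#include <vector>

// Reads all lines of a UTF-8 encoded file into wide strings.
// Assumption: the file really is UTF-8. If it cannot be opened, the
// result is simply empty.
std::vector<std::wstring> read_utf8_lines(const char* path)
{
    std::vector<std::wstring> lines;
    std::wifstream in;

    // Imbue before opening so the conversion applies from the first byte.
    // The locale takes ownership of the facet and deletes it later.
    // std::consume_header skips a leading BOM if one is present.
    in.imbue(std::locale(in.getloc(),
        new std::codecvt_utf8<wchar_t, 0x10ffff, std::consume_header>));
    in.open(path);

    std::wstring line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}
```

Each returned std::wstring can then be handed to something like
AfxMessageBox(line.c_str()). Note that where wchar_t is 16 bits wide, as on
Windows, this facet produces UCS-2, so characters outside the Basic
Multilingual Plane will not convert.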
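Lastly, don't use the member getline(): it needs a fixed-size buffer, which
invites buffer overflows if the size argument is wrong and makes the read
fail on lines longer than the buffer. There is a free function std::getline()
that takes a stream and a string (here a std::wstring), which does the same
but safely, growing the string as needed.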
Uli