UTF16-to-UTF8 conversion
Hi,
maybe some of you are using the following UTF16-to-UTF8 conversion code:
http://www.codeproject.com/KB/string/utfConvert.aspx
It's not working properly for me:
CStringA test ("html");
userHTML = UTF8toUTF16((CStringA)test.GetString());
After that I get userHTML = "html" followed by the garbage characters
U+FDFD U+FDFD U+ABAB U+ABAB U+ABAB U+ABAB U+FEEE U+FEEE, i.e. a string
with a length of 12 and not 4.
-> here is the function:
static CStringW UTF8toUTF16(const CStringA& utf8)
{
    LPWSTR pszUtf16 = NULL;
    CStringW utf16("");
    if (utf8.IsEmpty())
        return utf16; // empty input string
    size_t nLen8 = utf8.GetLength();
    size_t nLen16 = 0;
    if ((nLen16 = MultiByteToWideChar(CP_UTF8, 0, utf8, nLen8, NULL, 0)) == 0)
        return utf16; // conversion error!
    pszUtf16 = new wchar_t[nLen16];
    if (pszUtf16)
    {
        wmemset(pszUtf16, 0x00, nLen16);
        // the error is located here:
        MultiByteToWideChar(CP_UTF8, 0, utf8, nLen8, pszUtf16, nLen16);
        utf16 = CStringW(pszUtf16);
    }
    // the length will be 12 instead of 4! (for the CStringA "html")
    UINT length = utf16.GetLength();
    delete [] pszUtf16;
    return utf16; // UTF-16 encoded string
}
If I use
MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, pszUtf16, (nLen16 -1));
instead of
MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, pszUtf16, nLen16);
and the CStringA test contains a trailing space ("html "), the code
works as expected.
Could someone give me a short explanation of why the code does not
work with "html"?
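My own suspicion so far (unconfirmed): because I pass the explicit length
nLen8, MultiByteToWideChar converts exactly those 4 characters and does not
write a terminating null, so CStringW(pszUtf16) keeps reading past the end
of the buffer into whatever follows on the heap. Below is a small portable
sketch of that idea. It is not the article's code: widenCounted and toWide
are names I made up, and widenCounted is only an ASCII stand-in for the real
API call. The point is to build the string with an explicit length instead
of relying on a terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a counted conversion call: it writes exactly
// srcLen wide characters and never appends a terminating L'\0'.
// Assumption: input is 7-bit ASCII only, which is enough for "html".
static std::size_t widenCounted(const char* src, std::size_t srcLen,
                                wchar_t* dst, std::size_t dstLen)
{
    if (dst == nullptr)
        return srcLen;                      // size query, like passing NULL/0
    if (dstLen < srcLen)
        return 0;                           // destination too small
    for (std::size_t i = 0; i < srcLen; ++i) {
        assert(static_cast<unsigned char>(src[i]) < 0x80);
        dst[i] = static_cast<wchar_t>(src[i]);
    }
    return srcLen;                          // number of characters written
}

static std::wstring toWide(const std::string& in)
{
    if (in.empty())
        return std::wstring();

    const std::size_t n = widenCounted(in.data(), in.size(), nullptr, 0);

    // One extra slot filled with a non-zero pattern, to mimic the
    // uninitialised heap memory that follows the converted text.
    std::vector<wchar_t> buf(n + 1, L'#');
    widenCounted(in.data(), in.size(), buf.data(), n);

    // buf[n] is still L'#': nothing terminated the text, so the length
    // is passed explicitly instead of scanning for L'\0'.
    return std::wstring(buf.data(), n);
}
```

With this, toWide("html") gives a string of length 4. In my MFC function I
would expect the analogous change to be CStringW(pszUtf16, (int)nLen16)
instead of CStringW(pszUtf16), but I have not verified that yet.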
best regards
Hans