UTF16-to-UTF8 conversion

mfc <mfcprog@googlemail.com>
Thu, 16 Sep 2010 13:30:21 -0700 (PDT)

Maybe some of you are using the following UTF8-to-UTF16 conversion.

It's not working properly:

CStringA test ("html");
userHTML = UTF8toUTF16((CStringA)test.GetString());

After that I get something like userHTML = "html" followed by garbage
characters (U+FDFD, U+FDFD, ...), with a length of 12 and not 4.

-> here is the function:

static CStringW UTF8toUTF16(const CStringA& utf8)
{
  LPWSTR pszUtf16 = NULL;
  CStringW utf16("");

  if (utf8.IsEmpty())
    return utf16; //empty input string

  size_t nLen8 = utf8.GetLength();
  size_t nLen16 = 0;

  if ((nLen16 = MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, NULL, 0)) == 0)
    return utf16; //conversion error!

  pszUtf16 = new wchar_t[nLen16];
  if (pszUtf16)
  {
    wmemset (pszUtf16, 0x00, nLen16);

//here is the error located:
    MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, pszUtf16, nLen16);
    utf16 = CStringW(pszUtf16);
  }

//the length will be 12 instead of 4!!!! (for the CStringA "html")
  UINT length = utf16.GetLength();
  delete [] pszUtf16;
  return utf16; //utf16 encoded string
}

If I use
MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, pszUtf16, (nLen16 - 1));
instead of
MultiByteToWideChar (CP_UTF8, 0, utf8, nLen8, pszUtf16, nLen16);
and the CStringA test contains a trailing space ("html "), the code
works as expected.

Maybe someone could give me a short explanation of why the code is not
working with ("html").
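
One possible explanation, offered as a guess and not a confirmed diagnosis:
when MultiByteToWideChar is given an explicit input length that excludes the
terminating null, it does not null-terminate its output. The buffer then holds
exactly nLen16 characters with no terminator, and CStringW(pszUtf16) keeps
scanning past the end until it happens to hit a zero. The portable sketch below
only simulates that situation and does not call the Windows API. The helper
names (copyWithoutTerminator, makeBuffer, lengthByScanning, lengthByCount) are
made up for illustration, and the 0xFDFD guard value is an assumption that
mirrors the garbage shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical stand-in for the conversion step: copies `len` characters into
// `dst` WITHOUT writing a terminating L'\0' (ASCII input only).
static void copyWithoutTerminator(wchar_t* dst, const char* src, std::size_t len)
{
    assert(dst != nullptr && src != nullptr);
    for (std::size_t i = 0; i < len; ++i)
        dst[i] = static_cast<wchar_t>(static_cast<unsigned char>(src[i]));
}

// Simulates the heap block: `len` usable characters followed by `guard`
// filler values (0xFDFD, an assumption) and a final 0, so that the
// demonstration never reads memory it does not own.
static std::vector<wchar_t> makeBuffer(const char* src, std::size_t len, std::size_t guard)
{
    std::vector<wchar_t> buf(len + guard + 1, static_cast<wchar_t>(0xFDFD));
    buf.back() = L'\0';
    copyWithoutTerminator(buf.data(), src, len);
    return buf;
}

// Length as seen by code that scans for L'\0', like CStringW(pszUtf16).
static std::size_t lengthByScanning(const std::vector<wchar_t>& buf)
{
    return std::wcslen(buf.data());
}

// Length when the character count is passed explicitly,
// like CStringW(pszUtf16, nLen16).
static std::size_t lengthByCount(const std::vector<wchar_t>& buf, std::size_t len)
{
    return std::wstring(buf.data(), len).size();
}
```

With makeBuffer("html", 4, 8), lengthByScanning reports 12 and lengthByCount
reports 4. If this guess is right, then in the real function either passing
the length explicitly (utf16 = CStringW(pszUtf16, nLen16)) or allocating
nLen16 + 1 characters and keeping the last one zero should avoid the problem.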

best regards

