Re: Want Input boxes to accept unicode strings on Standard Window
"Tom Serface" <tom.nospam@camaswood.com> wrote in message
news:291E3653-F927-48C8-AA3A-3A42E0BAED0F@microsoft.com...
> Hi David,
> This works most of the time, but I've found with Asian languages there are
> always some problems and the MFC libraries will still display in English.
But wouldn't the MFC libraries still display in English even in the UNICODE
build? Building in UNICODE doesn't fix that.... Actually, we've statically
linked to the MFC English version for years, and have never had an issue (at
least none have been reported), probably because no MFC UI is normally
displayed.
> There is always the problem of ensuring that the code page on the user's
> computer is correct as well (not just the developer's).
By this do you mean setting the Regional Control Panel or calling
SetThreadLocale() appropriately? I did some tests the other day and saw
that MultiByteToWideChar(CP_ACP, ...) converted an MBCS string to Unicode
differently depending on the Regional Control Panel setting. My take was that
setting the Regional Control Panel altered the code page behind CP_ACP. I
presume SetThreadLocale() does the same thing, albeit only for the calling
thread and not on a system-global basis.
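
To make the effect described above concrete, here is a minimal sketch in Python rather than Win32, so it runs anywhere. It is only an illustration of the idea, not the MultiByteToWideChar() call itself: the helper name decode_with_code_page is made up for this example, and the two code pages (cp932 for Japanese, cp1252 for Western European) are assumptions standing in for two different Regional Control Panel settings.

```python
def decode_with_code_page(raw: bytes, code_page: str) -> str:
    """Interpret raw multi-byte data under one specific code page.

    Hypothetical helper for illustration; it plays the role that the
    code page argument plays in an MBCS-to-Unicode conversion.
    """
    return raw.decode(code_page)


# The same four bytes, as they might sit in an MBCS string or file.
original = "日本"
raw = original.encode("cp932")

# Decoded with the code page the text was written in: round-trips.
as_japanese = decode_with_code_page(raw, "cp932")

# Decoded as if the machine were set to a Western European code page:
# every byte becomes its own (wrong) character.
as_western = decode_with_code_page(raw, "cp1252")

print(repr(as_japanese))
print(repr(as_western))
```

The bytes never change; only the code page assumed at conversion time does, which is why the result depends on the machine doing the conversion.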
> If the user saves a file in Chinese (even if the code page is correct)
> then accesses it in English the file will get "corrupted".
Yes, these were all very well known (and grudgingly accepted) problems in
the Win9x world where Unicode was not very well supported.
> There will also be problems with translating strings like XML and other
> things as well.
For XML, even if you have an Ansi (non-Unicode) XML file, if the first line
has at least
<?xml version="1.0" encoding="<insert encoding>"?>
then IE displays the XML file correctly. (IE has become our default XML
viewer.) So the "encoding" attribute means a lot here. I still don't know
if saving the XML file in Unicode (with the U+FEFF BOM, which is the bytes
FF FE in little-endian UTF-16) causes the text to be displayed correctly
regardless of the "encoding" attribute. Our little XML parser does not read
Unicode XML files, nor does it honor the "encoding" attribute. Therefore,
even though it returns LPWSTR strings, they have been converted to Unicode
based on the CP_ACP code page, and that (seems to) require the Regional
Control Panel to be set to the language that was used to create the XML
file. Do you know whether MSXML, the other "big boys", or the FirstObject
parser read Unicode files?
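
For what it's worth, the decision a parser has to make here can be sketched in a few lines. The following is a simplified Python illustration, not how MSXML or any other named parser actually works: the function names sniff_xml_encoding and decode_xml are invented for this example, and it only covers a UTF-8 or UTF-16 BOM, an ASCII-compatible "encoding" attribute, and the UTF-8 default. It does not attempt BOM-less UTF-16, UTF-32, or EBCDIC.

```python
import codecs
import re

# Matches the encoding name inside a leading XML declaration.
_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(raw: bytes) -> str:
    """Guess the codec for an XML document held as raw bytes.

    Hypothetical helper: a BOM wins, then the declared encoding,
    then UTF-8 as the fallback.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # this codec reads the BOM to pick byte order
    match = _DECL_ENCODING.match(raw[:200])
    if match:
        return match.group(1).decode("ascii")
    return "utf-8"


def decode_xml(raw: bytes) -> str:
    """Turn the raw bytes into a Unicode string without consulting
    the machine's regional settings."""
    return raw.decode(sniff_xml_encoding(raw))
```

The point of the sketch is that the code page comes from the file itself (BOM or declaration), so the resulting string does not depend on how the Regional Control Panel is set on the machine reading it.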
> I worked through a lot of these issues then decided it was easier to just
> go to Unicode for any application where I actually need multiple byte
> characters (like Asian languages).
UNICODE builds make it easier to display Asian text, but our problem is how
to construct reliable LPWSTR from things like XML files.
In some cases it was not straightforward to port from Ansi to Unicode
because the code relies on single-byte character strings to do its job:
things from driver-land, which wouldn't know what to do with a UNICODE
string even if we could train device driver writers about UNICODE! ;)
> I guess you could make it work so long as you always know the exact code
> page for the strings, but this is always making an assumption.
Yes, and I'm not happy with that, but our scheme seems to have been
acceptable so far. Perhaps the results aren't so great and it's just that
the poor people affected by this are so used to it that they don't complain.
-- David