Re: JNI Unicode String puzzle
On Tue, 18 Dec 2007 00:38:46 -0500, Roedy Green <see_website@mindprod.com.invalid> wrote:

On Tue, 18 Dec 2007 01:37:24 GMT, Roedy Green <see_website@mindprod.com.invalid> wrote, quoted or indirectly quoted someone who said:
If you do JNI GetStringChars in C++, just what do you get? An array of TCHARs? A null-terminated TCHAR string?
I'll assume you're writing for Windows.
Mystery solved.
GetStringChars (16-bit) does not terminate with null. You must copy the characters and supply the terminator yourself, e.g. with wcsncpy_s.
GetStringUTFChars (8-bit) does terminate with null.
In one of my standard includes for my Windows JNI projects, I have a prototype for the function:
LPWSTR GetSzwStringCharsFromHeap(JNIEnv *env, HANDLE hHeap, jstring jstr)
{
    LPWSTR lpwResult = NULL;
    jsize jStrLen;

    if (jstr == NULL)
        goto finished;
    jStrLen = (*env)->GetStringLength(env, jstr);
    lpwResult = HeapAlloc(hHeap, HEAP_ZERO_MEMORY, (jStrLen + 1) * sizeof(WCHAR));
    if (lpwResult == NULL)
    {
        fireJavaExceptionForSystemErrorCode(env, GetLastError());
        goto finished;
    }
    /* HEAP_ZERO_MEMORY leaves the extra WCHAR as the null terminator. */
    (*env)->GetStringRegion(env, jstr, 0L, jStrLen, lpwResult);

finished:
    return lpwResult;
}
(Callers should use (*env)->ExceptionCheck(env) to see whether this function actually succeeded.)
If there is a more conventional approach, I'd love to hear it. Using GetStringRegion to copy the data into the native buffer once seems like it should be more efficient than allocating both a non-terminated buffer and a terminated one.
C++ Unicode 16-bit functions do not work (quietly degrade to 8-bit
mode) unless you define BOTH:
#define UNICODE
#define _UNICODE
I try to avoid using LPTSTR and TCHAR wherever possible, and instead favor LPWSTR and WCHAR. Most Windows functions are declared as
#ifdef UNICODE
#define SomeFunction SomeFunctionW
#else
#define SomeFunction SomeFunctionA
#endif
(With the exception that functions new for Vista / Windows Server 2008 are generally Unicode-only.)

Thus, I explicitly call SomeFunctionW, avoiding the compiler's global UNICODE definitions.
Isn't the UNICODE definition supposed to be set by the C compiler's environment when it's in Unicode mode (which to me would suggest the compiler will compile "xyz" the same as L"xyz")? Since <jni.h> expects method and type signatures to be supplied as char *, it seems like switching the compiler to full-blown Unicode mode would then break when you attempt to make JNI calls of the form:

(*env)->FindClass(env, "java/lang/Object");
Anyway, as some of this is speculation and my experimentation with such settings is minimal, I'd be curious how your mileage goes.
For what it's worth, though, if you just use the "W" functions and avoid the TCHAR abstraction, the rest seems to fall into place.
I had forgotten what a nightmare C++'s deeply nested typedefs are, with a dozen aliases for every actual type. YUCCH! It became clear with sizeof dumps.
Hope that was interesting or useful,
-Zig