Re: how do I expand a unicode string to its visual UTF8 representation?
On Fri, 07 Aug 2009 00:44:50 -0700, Andrew wrote:
> Indeed, this is what I suspected and this is part of my point. Whatever
> solution I wind up with it needs to be platform-independent.
Do you need to convert it to 7-bit ASCII? If your database can store 8-bit
data, converting to UTF-8 is probably easiest. You can use

    byte[] bytes = str.getBytes("UTF-8");

to get the UTF-8 data to put in your database, and then use

    String str = new String(bytes, "UTF-8");

to convert the data you get from the database back into a Unicode string.
(Both calls declare the checked UnsupportedEncodingException, so they need
a try/catch or a throws clause.)
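To make the round trip concrete, here is a minimal sketch. It assumes Java 7 or later, where StandardCharsets.UTF_8 is available and avoids the checked exception; the class and method names (Utf8RoundTrip, encode, decode) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTrip {
    // Encode a string as UTF-8 bytes, e.g. before writing it to the database.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes read back from the database into a Java string.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Copyright \u00A9 2009";
        byte[] stored = encode(original);
        String restored = decode(stored);
        // The copyright sign takes two bytes in UTF-8, so 16 chars become 17 bytes.
        System.out.println(stored.length);             // prints 17
        System.out.println(original.equals(restored)); // prints true
    }
}
```

Because the charset is named explicitly on both sides, the result does not depend on the platform default encoding.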
If you do decide to use the \u escape codes, you could do:
public class UnicodeTest {
    public void doit() {
        StringBuilder builder = new StringBuilder();
        builder.append("Copyright \u00A9 2009\n");
        builder.append("Here is the phrase (in Icelandic): "
                + "I can eat glass and it doesn't hurt me\n");
        builder.append("\u00C9g get eti\u00F0 gler \u00E1n \u00FEess "
                + "a\u00F0 mei\u00F0a mig");
        String str = builder.toString();
        System.out.println(str);
        // Print 7-bit ASCII chars as-is; escape every other UTF-16 code
        // unit as a backslash, 'u' and four hex digits.
        for (char c : str.toCharArray()) {
            if ((c & 0xFF80) == 0)
                System.out.print(c);
            else
                System.out.printf("\\u%04X", (int) c);
        }
    }

    public static void main(String[] args) {
        UnicodeTest test = new UnicodeTest();
        test.doit();
    }
}
Writing a function to convert from the \u escape codes back to a Unicode
string is left as an exercise for the reader (this would be an advantage
of using UTF-8: you don't have to write your own decoder).
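For anyone who would rather skip the exercise, here is one rough sketch of such a decoder. The class and method names are hypothetical, and it assumes the input was produced by a loop like the one above: every escape is a backslash, a 'u', and exactly four hex digits, one per UTF-16 code unit. Anything that doesn't match that shape is copied through unchanged.

```java
class UnicodeEscapeDecoder {
    // True if ch is one of 0-9, a-f, A-F.
    private static boolean isHex(char ch) {
        return (ch >= '0' && ch <= '9')
                || (ch >= 'a' && ch <= 'f')
                || (ch >= 'A' && ch <= 'F');
    }

    // True if a complete six-character escape starts at index i.
    private static boolean startsEscape(String s, int i) {
        if (i + 6 > s.length() || s.charAt(i) != '\\' || s.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (!isHex(s.charAt(k))) {
                return false;
            }
        }
        return true;
    }

    // Replace each well-formed escape with the code unit it names.
    static String decode(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        int i = 0;
        while (i < escaped.length()) {
            if (startsEscape(escaped, i)) {
                int codeUnit = Integer.parseInt(escaped.substring(i + 2, i + 6), 16);
                out.append((char) codeUnit);
                i += 6;
            } else {
                out.append(escaped.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslashes keep the compiler from expanding the escapes.
        System.out.println(decode("\\u00C9g get eti\\u00F0 gler"));
    }
}
```

One caveat: the encoder above does not escape literal backslashes, so original text that already contains something looking like an escape sequence cannot be told apart from a real escape when decoding. If that matters for your data, the encoder would need to escape the backslash as well.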