Re: character output
Anceschi Mauro wrote:
> Anceschi Mauro wrote:
>> Hi!
>> I have a text file encoded in CP1252 that contains characters like ? ? ? ?
>> I read the file in Java, but when I output them the characters seem corrupted.
>> It seems like I can't match the exact Unicode characters...
>> Any suggestions?
>
> Solved!
> The problem is that our customer gives me the files in a Unicode standard,
There are several "Unicode" standards for encoding characters from the
Unicode character set into "text" files, for example UTF-8, UTF-16 and
UTF-32. The latter two can have an optional Byte Order Mark (BOM).
UTF-8 is in widespread use and, unless you have a reason not to,
you should probably use UTF-8 everywhere that you can.
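
If it helps to see the difference concretely, here is a minimal sketch (the
sample string is just an illustration) that encodes the same Unicode string
with two of these schemes and prints the resulting bytes:

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {
    public static void main(String[] args) {
        String s = "caff\u00E8"; // "caffè": four ASCII letters plus one accented letter
        // Same Unicode string, different byte sequences on disk:
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_8)));
        // Java's "UTF-16" charset writes a Byte Order Mark (FE FF) before the data:
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_16)));
    }
}

The accented letter becomes two bytes in UTF-8, while in UTF-16 every
character here takes two bytes after the BOM.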
> once I converted them to ASCII everything started to work well!!!
ASCII does not contain the accented Latin characters you mentioned (? ?
? ?). It's hard to imagine this "working" well.
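
If you are curious what a conversion to ASCII actually does with such a
character, a small sketch (the sample letter is only an illustration):
String.getBytes() silently replaces anything ASCII cannot represent with '?'
bytes, which would give exactly the kind of question marks you described.

import java.nio.charset.StandardCharsets;

public class AsciiLoss {
    public static void main(String[] args) {
        String s = "\u00E8"; // the accented letter è
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        // Unmappable characters become the charset's replacement byte, '?' (0x3F)
        System.out.println(new String(ascii, StandardCharsets.US_ASCII)); // prints "?"
    }
}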
> My question now becomes: how do I convert a file from Unicode to ASCII in Java?
In general, don't. Java handles UTF-8 properly when reading and writing
strings through InputStreamReader and OutputStreamWriter.
import java.io.*;

String str = null;
try {
    BufferedReader in = new BufferedReader(
        new InputStreamReader(
            new FileInputStream("infilename"), "UTF-8"));
    // Strings read through the UTF-8 reader contain proper Unicode characters
    str = in.readLine();
    in.close();
} catch (IOException e) {
    // UnsupportedEncodingException is a subclass of IOException
    e.printStackTrace();
}

....

try {
    Writer out = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream("outfilename"), "UTF-8"));
    // The characters are encoded back to UTF-8 bytes on the way out
    out.write(str);
    out.close();
} catch (IOException e) {
    e.printStackTrace();
}
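
And since the file your customer gives you is apparently CP1252, the cleaner
fix is usually to read it with the matching charset and let Java's Strings
carry the Unicode characters from there; a minimal sketch along those lines
(the file name is just a placeholder):

import java.io.*;

public class Cp1252Dump {
    public static void main(String[] args) throws IOException {
        // "windows-1252" is Java's canonical name for the CP1252 charset
        BufferedReader in = new BufferedReader(
            new InputStreamReader(
                new FileInputStream("infilename"), "windows-1252"));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line); // the String itself is Unicode internally
        }
        in.close();
    }
}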