I tend to save text out of application boundaries using Unicode UTF-8
(char's), ...

Why? [I am not criticising - just being curious!]

Reasons for me:
 - Editable with any editor as long as the text contains only ASCII; still
mostly editable if non-ASCII bytes occur and the editor assumes an 8-bit
codepage.
 - Default encoding for XML, thus a de-facto standard for information
exchange.
 - Using UTF-16 (MS Windows' internal representation) is not much easier,
because even there you occasionally need surrogate pairs consisting of two
16-bit code units. Furthermore, you need to convert anyway on other platforms.
 - Endianness detection is a non-issue, because UTF-8 is a plain byte
sequence with no byte order to get wrong.
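
To make the points above concrete, here is a small Python sketch (Python is
my choice for illustration only, and the sample string is made up). It shows
that ASCII stays one byte in UTF-8, that a character outside the Basic
Multilingual Plane needs a surrogate pair in UTF-16, and that only UTF-16
has two possible byte orders.

```python
# Illustrative sample: 'A' (ASCII), U+00E9 (e with acute accent),
# and U+1D11E (musical G clef, outside the Basic Multilingual Plane).
text = "A\u00e9\U0001D11E"

utf8 = text.encode("utf-8")
utf16_le = text.encode("utf-16-le")  # little-endian, no BOM
utf16_be = text.encode("utf-16-be")  # big-endian, no BOM

# UTF-8 uses 1 + 2 + 4 bytes for the three characters.
utf8_bytes = len(utf8)

# UTF-16 uses 1 + 1 + 2 code units: the clef needs a surrogate pair.
utf16_units = len(utf16_le) // 2

# The ASCII character is stored as its plain ASCII byte in UTF-8.
first_byte_is_ascii = utf8[:1] == b"A"

# The same text gives different bytes depending on UTF-16 byte order,
# whereas there is only one UTF-8 byte sequence.
byte_order_matters = utf16_le != utf16_be

print(utf8_bytes, utf16_units, first_byte_is_ascii, byte_order_matters)
```

So even with UTF-16 you cannot assume "one 16-bit unit per character", which
is why I don't see it as much easier to handle.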

Someone asked how UTF-8 is detected. In general, as with UTF-16, there isn't
any way to do it reliably. However, writing a BOM (byte order mark) at the
start of the file typically works (explained on the Wikipedia page, btw), just
like with UTF-16, where it additionally signals the endianness.
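
A minimal sketch of BOM-based detection in Python follows. The function names
`sniff_bom` and `decode_with_bom` are hypothetical helpers of mine, and the
UTF-8 fallback for BOM-less input is an assumption, not a rule. The sketch
deliberately ignores UTF-32, whose little-endian BOM begins with the same two
bytes as the UTF-16 one.

```python
import codecs

def sniff_bom(data: bytes):
    """Guess the encoding from a leading BOM.

    Returns (encoding, bom_length), or (None, 0) if no BOM is present.
    Hypothetical helper; only UTF-8 and UTF-16 are considered.
    """
    if data.startswith(codecs.BOM_UTF8):        # EF BB BF
        return "utf-8", 3
    if data.startswith(codecs.BOM_UTF16_LE):    # FF FE
        return "utf-16-le", 2
    if data.startswith(codecs.BOM_UTF16_BE):    # FE FF
        return "utf-16-be", 2
    return None, 0

def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Strip a BOM if present and decode; otherwise use the assumed fallback."""
    encoding, skip = sniff_bom(data)
    return data[skip:].decode(encoding or fallback)

# Python's "utf-8-sig" codec writes the UTF-8 BOM when encoding.
with_bom = "hi".encode("utf-8-sig")
print(sniff_bom(with_bom))
print(decode_with_bom(codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")))
```

Without a BOM the function can only report that it doesn't know, which is
exactly the "no reliable way" case.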


