Re: codecvt problem
Vaclav Haisman wrote:
Seungbeom Kim wrote, On 9.6.2011 15:02:
On 2011-06-08 16:24, Vaclav Haisman wrote:
Martin Bonner wrote, On 8.6.2011 10:06:
But memset produces undefined behaviour if mbstate_t is (for example)
void *.
I hope this is not the point of confusion here, but this statement is
actually wrong. You can memset() the pointer, but you can't use it
afterwards, see below...
I /think/ the only way to fix this is something like:
Why does it produce undefined behaviour? You can always take the address
of a void pointer and zero the pointer.
That has never been guaranteed by either the C or the C++ standard:
the object representation that has all zero bits does not necessarily
yield the value representation of a null pointer.
First, I cannot imagine any usable implementation, with current interfaces
and without extensions etc., that would define mbstate_t as a pointer
type.
I could imagine it pointing to a translation table that defines the current
mapping between the internal and external encodings. I could also imagine it
containing such a pointer plus further state info, e.g. to implement a
table-driven UTF-8 encoder/decoder.
Second, there is no UB in the sense the standard uses the term. Such a
memset'd value of mbstate_t would be initialized, so no UB there, and
no sane implementation would dereference it (if it were at all possible
to use such a kind of mbstate_t), so no UB there either. Is there any other
situation that could potentially be UB? I do not think so.
If you declare a pointer without initialising it, or destroy the object it
points to, merely reading the pointer afterwards yields undefined behaviour.
The thing is called "singular value", IIRC. Now, consider this case:
    #include <cstring>

    void* p;
    std::memset(&p, 0, sizeof p);  // zeroes p's bytes; p need not be null
If you do this, you are not guaranteed that the value of p is actually a
null pointer value. Someone mentioned a platform that uses 0xffffffff as the
signal value for invalid pointers, because there is actually RAM at address
zero that can validly be addressed. Further, the standard doesn't require
a null pointer to be represented as "all bits zero". So, you could actually
end up with an illegal, singular value that can't even be read without
causing UB.
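To make the distinction concrete, here is a minimal sketch of how one could
look at the bytes of a pointer object without ever reading it as a pointer.
The helper names (is_all_bits_zero, null_pointer_is_all_bits_zero,
zero_pointer_bytes) are made up for illustration. On most current platforms
the second function will probably return true, but that is a property of the
platform, not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: inspects an object's bytes through unsigned char,
// which is allowed for any object, without reading it as its own type.
bool is_all_bits_zero(const void* obj, std::size_t size) {
    const unsigned char* bytes = static_cast<const unsigned char*>(obj);
    for (std::size_t i = 0; i < size; ++i) {
        if (bytes[i] != 0) return false;
    }
    return true;
}

// True only if this implementation happens to store a null void* as
// all-bits-zero. The result is platform-specific.
bool null_pointer_is_all_bits_zero() {
    void* null_ptr = 0;  // a real null pointer value, whatever its bit pattern
    return is_all_bits_zero(&null_ptr, sizeof null_ptr);
}

// Zeroes the bytes of a pointer object and checks them, taking only the
// address of p and never reading p's value as a pointer.
void zero_pointer_bytes() {
    void* p;
    std::memset(&p, 0, sizeof p);
    assert(is_all_bits_zero(&p, sizeof p));  // the bytes are zero...
    // ...but whether p now holds a null pointer value is a separate question.
}
```

If null_pointer_is_all_bits_zero() returns false somewhere, that is exactly
the kind of platform where the memset'd p above is not a null pointer.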
And since std::mbstate_t needs to be compatible with C (and probably
POSIX too), memset() is about the only usable way of giving it some
sane and deterministic value.
According to my manpage for mbstate_t, you _must_ initialize it using
memset(). That means that any implementation defining it as a pointer must
never actually read it as a pointer, but only access its bytes through
functions like memcpy(), memcmp() etc. It also means that none of the
"clean" initialisations mentioned in this thread (e.g. "mbstate_t s = {};"
or "mbstate_t s = mbstate_t();") is actually guaranteed to work.
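For reference, a minimal sketch of the memset() idiom, assuming a C library
where a zeroed mbstate_t describes the initial conversion state (which is
what my manpage says, and which std::mbsinit() lets you check). The helper
names reset_state and decode_first are made up for illustration, and the
state is passed by reference so that it is never copied as a value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical helper: puts the state into its initial conversion state by
// zeroing its bytes, as the man page prescribes.
void reset_state(std::mbstate_t& state) {
    std::memset(&state, 0, sizeof state);
}

// Hypothetical helper: decodes the first multibyte character of s (at most
// n bytes) into out, starting from a freshly zeroed state.
// Returns whatever std::mbrtowc() returns.
std::size_t decode_first(const char* s, std::size_t n, wchar_t& out) {
    std::mbstate_t state;
    reset_state(state);
    // Assumption: the C library treats the zeroed object as "initial".
    assert(std::mbsinit(&state) != 0);
    return std::mbrtowc(&out, s, n, &state);
}
```

In the default "C" locale, decode_first("A", 1, wc) should consume one byte
and store L'A' in wc.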
Uli
--
Domino Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]