Re: understanding strict aliasing

From: Joshua Maurice <joshuamaurice@gmail.com>
Newsgroups: comp.lang.c++.moderated
Date: Mon, 22 Nov 2010 15:57:43 CST
Message-ID: <52edd1fb-c0f3-416e-86f8-15c78d46466a@j1g2000vbl.googlegroups.com>

Interesting. I looked for active issues on this topic, and I found:
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html
1116. Aliasing of union members

which allows for even less than what I suggested. It seems like a
reasonable fix though, breaking only really hacky pre-existing
programs. In short, once you allocate storage and put an object in
that storage, you may only reuse that storage to put a new object into
it iff it's legal to access the new object through an lvalue of the
original type.

At least, I think that's the intent. The wording is less than clear.
However, I don't think that's good enough. I think they're trying to
resolve the problem related to the union DR (mentioned in my first
post) by requiring that all aliases to a storage location remain
"valid" for the duration of the storage, aka until the storage is
released.

First, if that is the intent, they should clear up the wording so it
reads "You may reuse storage to put a new object into it iff the new
object can be accessed through all of the lvalues of all of the
previous objects existing at that storage".

However, it's still not good enough. The following program
demonstrates the problem:

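#include <new>
struct A { int x; };
struct B { double y; };
struct C : A, B { void* z; };   // B is the second base of C
int main()
{
   char * p = new char[sizeof(A) + sizeof(B) + sizeof(C)];
   B* b = new(p) B;   // a B object now lives at address p
   C* c = new(p) C;   // reuses the same storage; b still holds address p
}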

Presumably we're fine up to and including "B* b = new(p) B". We can
access a "B" object through a char lvalue via the exception in 3.10 /
15, so it's fine under the proposed wording in "issue 1116. Aliasing
of union members".
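
To make the char exception concrete, here is a minimal sketch, not
anything taken from the issue text. It builds a "B" in char storage as
in the program above and then reads the object's bytes through unsigned
char lvalues, which the aliasing rule permits. The helper names
nonzero_bytes and demo_nonzero_bytes are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct B { double y; };

// Hypothetical helper: counts the non-zero bytes of a B by reading its
// object representation through unsigned char lvalues.
std::size_t nonzero_bytes(const B& b)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&b);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof(B); ++i)
        if (bytes[i] != 0)
            ++count;
    return count;
}

// Hypothetical helper: places a B into storage obtained from new char[],
// stores a value in it, and inspects it byte by byte.
std::size_t demo_nonzero_bytes(double value)
{
    char* p = new char[sizeof(B)];   // new char[] storage is suitably aligned for B
    B* b = new (p) B;                // the B object starts at address p
    assert(static_cast<void*>(b) == static_cast<void*>(p));
    b->y = value;
    std::size_t n = nonzero_bytes(*b);
    delete[] p;                      // B is trivially destructible, so no destructor call is needed
    return n;
}
```

Calling demo_nonzero_bytes(1.0) should report at least one non-zero
byte. Nothing here goes the other way: reading the char array through
a "B" lvalue is only fine because a "B" object was actually created
there first.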

Next we have "C* c = new(p) C;". Here is where things start getting
fun.

First off, is the requirement that "the 'C' object is accessible
through a char lvalue", or "the 'C' object is accessible through a 'B'
lvalue", or both? In this case, it is accessible through both (quote
unquote), so let's proceed.

Then we have a deeper problem. With multiple inheritance and virtual
inheritance, pointer casts are no longer no-ops. That is, a
reinterpret_cast and an implicit conversion can and will produce
different results. I would expect that code like the above would crash
horribly on real implementations because the "B" sub-object in the "C"
object is not at "offset" 0. By the rules as written, it's fine because
you can access a "C" object through a "B" lvalue, but in practice (and
perhaps by standard?) that only works when the "B" lvalue is ...
properly "aligned(?)". This aspect is not captured at all in the
proposed wording as far as I can tell.
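
A small sketch of the pointer adjustment, using the same three structs.
The helpers base_offset and casts_agree_for_B are made-up names, and
the exact layout is implementation-specific. On the common ABIs I know
of, "A" sits at offset 0 and "B" at some non-zero offset, but the code
does not rely on that.

```cpp
#include <cassert>
#include <cstddef>

struct A { int x; };
struct B { double y; };
struct C : A, B { void* z; };

// Hypothetical helper: byte offset of the Base sub-object inside a
// Derived object, measured via the implicit derived-to-base conversion.
template <class Base, class Derived>
std::ptrdiff_t base_offset(const Derived& d)
{
    const Base* base = &d;   // implicit conversion; may adjust the address
    std::ptrdiff_t off = reinterpret_cast<const char*>(base)
                       - reinterpret_cast<const char*>(&d);
    // A non-virtual base sub-object lies inside the derived object.
    assert(off >= 0 && static_cast<std::size_t>(off) < sizeof(Derived));
    return off;
}

// Hypothetical helper: true when reinterpret_cast and the implicit
// conversion yield the same B* for this C object.
bool casts_agree_for_B(C& c)
{
    B* converted = &c;                             // points at the B sub-object
    B* reinterpreted = reinterpret_cast<B*>(&c);   // in practice keeps the address of c
    return converted == reinterpreted;
}
```

Since "A" and "B" are both non-empty, they cannot share an address, so
at most one of them can be at offset 0. Whenever base_offset<B>(c) is
non-zero, casts_agree_for_B(c) comes back false, and a "B*" that holds
the address of the whole "C" object (like "b" in the program above)
does not point at the "B" sub-object.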

I'll possibly make a DR post to comp.std.c++ in a couple of days, by
which time I hope to get some feedback here.
