Re: Order of Variable

From:

Joshua Maurice <joshuamaurice@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Mon, 13 Sep 2010 11:56:48 -0700 (PDT)

Message-ID:

<89d0b532-deb4-41da-bae5-91ddb6361a34@q16g2000prf.googlegroups.com>

On Sep 11, 2:10 pm, "Alf P. Steinbach /Usenet" <alf.p.steinbach+use...@gmail.com> wrote:

* Johannes Schaub (litb), on 11.09.2010 22:54:

The way I understand it, it does not matter whether the reference is by
union or not. Aliasing an int as a float is UB no matter by a union or
not.

I thought that was the whole point of Gabriel's issue report? I.e. if the
statements in foo are reordered, then "t.d" accesses an int object by a
double lvalue, violating aliasing.

Example to show what I mean:

union {
float f;
int i;
} u;

// now the object is a float, by 3.8/1.
u.f = 1.f;

// now it is an int, also by 3.8/1
u.i = 10;

// aliasing violation by 3.10/15 - UB, trying
// to access value of int object by float lvalue.
float f = u.f;

Yes. But if you consider

   u.i = 42;
   u.f = 2.71828;

   float f = u.f;

Then you have valid code.

And then the compiler can't reorder the two assignment statements.

It can't reorder the assignment statements even if they're placed in a
function where it's not locally known that for a particular call the
float and int are in a union (Gabriel's point).

Indeed. This is a well known bug in the C and C++ specs. James talks
about it else-thread, quoted here:

On Sep 11, 11:56 am, James Kanze <james.ka...@gmail.com> wrote:

The issue is not simple, and formally g++ isn't conformant in
this respect (but it has nothing to do with §9.2/17---I've not
verified, but I suspect that g++ does handle that correctly).
But if I understand correctly, the C committee thinks that it is
the standard which is broken, not g++---the standard guarantees
too much.

The exact issue was something like:

    union U { int a; double b; };

    int f(int* a, double *b)
    {
        // g++ reorders the following two statements...
        int result = *a;
        *b = 3.14159;
        return result;
    }

    // ...
    U u;
    u.a = 42;
    int i = f(&u.a, &u.b);

Technically, the above code fulfills the requirements of the
standard; you're reading the last element written in the union,
then writing a different element. IIRC, the opinion of the C
committee was that this *should* only be guaranteed to work when
the actual access is through the union type.

Basically, the problem is that with unions, the strict aliasing rules
no longer "work". The solution that, according to James, the C
committee favors is not to abandon the strict aliasing rules, but
instead to greatly restrict the use of unions. Basically, in current
practice, if a union has 2 different members, and if you let pointers
to 2 or more members escape the current scope, then you're boned. I'm
not exactly sure what formal rule the C committee is pondering, but at
the very least it would have to effectively include that restriction.

On Sep 11, 2:10 pm, "Alf P. Steinbach /Usenet" <alf.p.steinbach+use...@gmail.com> wrote:

Now consider

    #include<stdio.h>

    struct A { double d; int x; };

    void foo( int* a, A* b )
    {
        b->x = 1;
        *a = 2;
    }

    int main()
    {
        A aha;
        foo(&aha.x,&aha );
        printf( "%d\n", aha.x );
    }

Since int* and A* are clearly pointers with different referent types, at
least the simplistic purported aliasing-based reordering rule mentioned
so far should allow reordering of the statements in foo, causing output 1
instead of 2.

I don't believe it -- although I may be forced to if James can hark up
some reference (in which case the standard, IMHO, needs to be fixed).

I don't believe that the Standard allows reordering this either. b->x is
int lvalue and *a is int lvalue too. Both are allowed to alias.

Yes.

And as I see it §3.10/15 means that that's still the case when accessing
A::x as B::x where B is layout-compatible with A (assuming same member
names for this example).

Then you would be alone on your interpretation. I don't have a C spec
handy, but for C++03, that is not the case.

This is what I can find on the subject in the standard itself.

C++03 standard, 9.2 Class members / 16

If a POD-union contains two or more POD-structs that share a common
initial sequence, and if the POD-union object currently contains one of
these POD-structs, it is permitted to inspect the common initial part
of any of them. Two POD-structs share a common initial sequence if
corresponding members have layout-compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.
<<<<

The rule above only applies to PODs in a union.

C++03 standard, 9.2 Class members / 17

A pointer to a POD-struct object, suitably converted using a
reinterpret_cast, points to its initial
member (or if that member is a bit-field, then to the unit in which it
resides) and vice versa. [Note: There
might therefore be unnamed padding within a POD-struct object, but not
at its beginning, as necessary to
achieve appropriate alignment. ]
<<<<

The section above, especially with the (non-binding) note, pretty
clearly states that the C-style hack of inheritance may not work in
C++. There might be unnamed padding which differs between different POD
structs.

Frankly though, this entire thing is a mess. When you compare the
guarantees of the two quotes, /which appear right next to each other
in the standard/, I don't understand how you can reconcile them in a
sane implementation. So, when the POD types are members of a union,
there's no difference in padding bits, but when the same POD types are
not members of a union, there might be extra magical padding bits.
What?

We expect that a compiler has a single rule for handling member
offsets so that accessing a member of an object is efficient. So it
doesn't matter if the POD object is a complete object or a member
sub-object of a union - the expected result is that the compiler will
generate the same assembly to access a member sub-object of the POD
object from a pointer to the POD object in all cases. With this in
mind, I have no clue how you're supposed to reconcile those two
sections above, one of which says there is no difference in padding
between layout-compatible POD-struct types, and the next section
which says there might be a difference.

However, at face value, most / all of the gcc examples in this thread
have been conforming.

Otherwise there would have to be some rule about exactly which pointer
referent types are considered to be sufficiently different that the
pointers can be assumed to be unaliased -- int != A?, A != B?, what?

Yes. That is exactly what "3.10 Lvalues and rvalues / 15" does. Hell,
there's a note on it which reads: "The intent of this list is to
specify those circumstances in which an object may or may not be
aliased."