Re: Is the aliasing rule symmetric?

From: Joshua Maurice <joshuamaurice@gmail.com>
Newsgroups: comp.lang.c++,comp.lang.c,comp.std.c
Date: Tue, 25 Jan 2011 03:17:56 -0800 (PST)
Message-ID: <70256729-802e-4ec6-bc6b-a0ab512d1367@v31g2000pri.googlegroups.com>

On Jan 25, 2:22 am, James Kanze <james.ka...@gmail.com> wrote:

If you're actually trying to write an allocator, you also have
to take into account what actual compilers do, and not just the
standard. I seem to recall something along the following lines:

    float f(float const* in, bool* out)
    {
        float result = *in;
        *out = true;
        return result;
    }

failing with g++ when called with:

    union U { float f; bool b; };
    U u;
    u.f = 3.14159;
    float g = f(&u.f, &u.b);

According to both C and C++, the union guaranteed that this
should work, but g++ rearranged the read and the write in f.

(I also seem to recall---albeit vaguely---the C committee saying
that it wasn't the intent to make this work; that they only
meant for it to be guaranteed if e.g. f was passed a pointer to
the union. But it's all very vague---I didn't have time to
follow up at the time.)

At any rate, most of this discussion seems to turn around the
same issues, without the union.

Indeed. It's all very related to the union DR. So, the C standard
committee never intended for the following program to have defined
behavior? Interesting.

  void foo(int* x, float* y)
  { *x = 1;
    *y = 1;
  }
  int main()
  { union { int x; float y; } u;
    foo(&u.x, &u.y);
    return u.y;
  }

AFAIK, the only way that this could make sense is if you introduce
some formalisms with data dependency analysis.
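
For contrast, here is a minimal sketch of the variant James alludes to,
where the function is handed a pointer to the union itself rather than
pointers to two of its members. The names int_or_float, foo_via_union
and demo_union_pointer are made up for illustration and are not from
the original example. Both stores are member accesses on the union, so
the compiler can see that they may touch the same storage.

```c
#include <assert.h>

/* Hypothetical named union type; the example above uses an anonymous one. */
union int_or_float { int x; float y; };

/* Both stores go through the union type, so the possible overlap
   between x and y is visible inside the function body. */
void foo_via_union(union int_or_float *u)
{
    u->x = 1;
    u->y = 1.0f;   /* last store: y is the member holding a value now */
}

/* Self-checking usage: read back the member that was stored last. */
void demo_union_pointer(void)
{
    union int_or_float u;
    foo_via_union(&u);
    assert(u.y == 1.0f);
}
```

Calling demo_union_pointer() from main should pass its assertion. This
says nothing about the two-pointer version of foo, which is the case
under dispute.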

Let me take another whack at trying to formalize it.

[quote]
For a single function, the compiler may assume that at any particular
point of execution, any accessible pointer value or named variable
does not alias another accessible pointer value or named variable of a
sufficiently different type (see existing strict aliasing rules),
unless the two pointers or named variables have a data dependency
between them (a la the rules for restrict, or maybe the C++0x rules for
std::memory_order_consume). Programs which violate this assumption
have undefined behavior.

Ex:
  void foo(int* x, float* y)
  { *x = 1;
    *y = 1;
  }
  int main()
  { union { int x; float y; } u;
    foo(&u.x, &u.y);
    return u.y;
  }
The function foo has a point during its execution where there are two
accessible pointer values (its parameters x and y) of sufficiently
different types which alias, and there is no data dependency between
them in the scope of the body of foo. Thus the assumption is violated,
and the program has undefined behavior.

Ex:
  #include <stdlib.h>
  int main()
  { int* x = (int*) malloc(sizeof(int));
    *x = 1;
    free(x);
    float* y = (float*) malloc(sizeof(float));
    *y = 1;
    free(y);
  }
In the above program, malloc may return the same piece of memory
twice, once for x, and once for y. However, at no point of execution
are both pointers "live" and pointing to the same piece of memory.
Thus the assumption is not violated, and this program has no undefined
behavior.

Ex:
  int main()
  { int x;
    float* y;

    y = (float*) &x;
    x = 1;
    *y = 2;
    x = 3;
  }
The above program has a named variable which aliases a pointer value.
However, there exists a data dependency between them (y is computed
from the address of x), so the program has no undefined behavior.
[/quote]
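
To make "sufficiently different type" a bit more concrete, here is a
small sketch of a case the existing aliasing rules already allow: an
lvalue of character type may access any object, so an unsigned char
pointer and an int are not "sufficiently different" in the sense used
above. The function name zero_through_bytes is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrites every byte of an int through an unsigned char pointer.
   Character types may alias any object, so the compiler has to assume
   these stores can change x. */
int zero_through_bytes(void)
{
    int x = 12345;
    unsigned char *bytes = (unsigned char *)&x;

    for (size_t i = 0; i < sizeof x; ++i)
        bytes[i] = 0;

    return x;   /* all value bits are zero, so this reads 0 */
}

/* Self-checking usage. */
void demo_char_alias(void)
{
    assert(zero_through_bytes() == 0);
}
```

Replacing unsigned char with float here would give the kind of pair
the proposed wording is trying to constrain.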

I think this is the closest I've gotten to formalizing the intent. I'm
deferring to the 'restrict' rules, mostly because I think they would
probably best capture all of the nuances which I need. Perhaps I could
instead use the C++0x data dependency rules a la
std::memory_order_consume. I'm not intimately familiar with those
either.
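
For readers who have not used it, here is a minimal C99 sketch of the
kind of "based on" reasoning that restrict already has. The function
names scale_into and demo_restrict are made up for illustration. The
local pointer p is derived from dst, so accesses through p count as
accesses through dst; that derivation is the sort of data dependency I
would like the aliasing wording to lean on.

```c
#include <assert.h>
#include <stddef.h>

/* restrict is a promise by the caller: during this call, the objects
   written through dst are not also accessed through src. */
void scale_into(float *restrict dst, const float *restrict src,
                size_t n, float k)
{
    for (size_t i = 0; i < n; ++i) {
        float *p = dst + i;   /* p is "based on" dst, so this is fine */
        *p = k * src[i];
    }
}

/* Self-checking usage with two distinct arrays, so the promise holds. */
void demo_restrict(void)
{
    float in[3] = { 1.0f, 2.0f, 3.0f };
    float out[3];

    scale_into(out, in, 3, 2.0f);
    assert(out[0] == 2.0f && out[1] == 4.0f && out[2] == 6.0f);

    /* scale_into(in, in, 3, 2.0f) would break the promise and have
       undefined behavior. */
}
```

This is only an analogy: restrict is opt-in per pointer, whereas the
wording above would apply the assumption to every pair of sufficiently
different types.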

I'm not sure what I have written thus far is anywhere near sufficient
or correct, but hopefully it captures what I'm aiming for.

The important thing is that, AFAIK, nothing like this is in any of the
C standards or any of the C++ standards.