Re: Is the aliasing rule symmetric?
Joshua Maurice <joshuamaurice@gmail.com> writes:
On Jan 24, 8:10 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
<snip>
Have you got some reason to suspect that there is a problem with any of
these programs in C?  The C standard seems quite clear on these specific
questions.
Yes. I've been getting various replies when I tweak the above program
just slightly.
#include <stdlib.h>
void foo(int* a, float* b)
{
    *a = 1;
    *b = 1;
}
int main()
{
    void* p = malloc(sizeof(int) + sizeof(float));
    foo((int*)p, (float*)p);
}
I asked in comp.lang.c a while ago whether this has UB. I received
various replies, with little follow-up discussion.
One reply was that a piece of memory may have at most one effective
type between calls to malloc and free.
That seems to me to be clearly false. Here is the wording:
"The effective type of an object for an access to its stored value is
the declared type of the object, if any[75]. If a value is stored into
an object having no declared type through an lvalue having a type that
is not a character type, then the type of the lvalue becomes the
effective type of the object for that access and for subsequent
accesses that do not modify the stored value. If a value is copied
into an object having no declared type using memcpy or memmove, or is
copied as an array of character type, then the effective type of the
modified object for that access and for subsequent accesses that do
not modify the value is the effective type of the object from which
the value is copied, if it has one. For all other accesses to an
object having no declared type, the effective type of the object is
simply the type of the lvalue used for the access."
Footnote 75 says: "Allocated objects have no declared type."
The object being stored into has no declared type. The *a = 1; makes
the effective type of the allocated space 'int' for that access and for
subsequent accesses that do not modify the object, but *b = 1; does
modify the object so, again, the effective type becomes that of the
lvalue expression used in the store: 'float'.
Another reply was that this is a DR in the C and C++ language specs,
known colloquially as the union DR.
There is no union so unless the DR covers more than just unions it won't
apply.
Another reply was that the above program has perfectly well defined
behavior, but the following has undefined behavior:
#include <stdlib.h>
int foo(int* a, float* b)
{
    *a = 1;
    *b = 1;
    return *a;
}
int main()
{
    void* p = malloc(sizeof(int) + sizeof(float));
    foo((int*)p, (float*)p);
}
Yes, that's undefined. Sadly, I misread a similar example elsewhere
in this thread. After *b = 1; the effective type is float and the
access through an lvalue expression of type int is undefined.
Specifically, this example explains how the compiler might use
aliasing analysis for optimization purposes. A conforming compiler may
not simply assume that an int* and a float* do not alias. However, if
analysis shows that aliasing would result in UB (as it would in the
above program when "return *a;" reads a float object through an int
lvalue) then the compiler is free to do whatever it wants in the face
of the UB, including assume that they don't alias.
I think I like the third option best, but my personal preferences
don't dictate what compilers actually do.
No, nor mine, but that last explanation seems to me to be the correct
one.
An example that might distinguish between a compiler that assumes no
aliasing and one that knows the effective type rules would be this:
#include <stdlib.h>
#include <stdio.h>
int foo(int *a, float *b)
{
    int x = *a;
    *b = 1;
    return x + *b;
}
int main(void)
{
    void *p = malloc(sizeof(int) + sizeof(float));
    *(int *)p = 1;
    printf("%d\n", foo((int *)p, (float *)p));
    printf("%f\n", *(float *)p);
    return 0;
}
I think James posted a similar example elsewhere. In C this is
well-defined and must print 2 and 1.000000 (or thereabouts). If a
compiler just assumes that 'a' and 'b' in foo can never point to the
same object, it might produce the wrong result (by, for example,
optimising 'x' away and using *a in the return).
It is possible that the C committee intended that the rules would allow
a compiler to assume that 'a' and 'b' don't alias, but that is certainly
not how I read the rules as they stand.
--
Ben.