Re: Guarantee of side-effect free assignment

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.std.c++

Date:

Fri, 12 Oct 2007 10:03:35 CST

Message-ID:

<1192182387.340238.112080@q5g2000prf.googlegroups.com>

On Oct 11, 4:49 pm, jdenn...@acm.org (James Dennett) wrote:

James Kanze wrote:

On Oct 9, 4:55 pm, jdenn...@acm.org (James Dennett) wrote:

James Kanze wrote:

[...]

The construction of the object is a side effect.

Can you justify that claim?

What else can it be?

Part of the evaluation of the expression, used to determine
the result of the expression (as indeed it does).

That's not the usual definition. An expression has a value and
side effects. The value is what it returns (a pointer, in the
case of a new expression); the side effects are any changes in
the global program state (writes to memory, etc.). Thus, in an
expression like "i = 42", the value is 42 (converted to the type
of i, if necessary); the write to i itself is a side effect.
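
To make the "converted to the type of i" part concrete, here is a minimal sketch. The function name value_of_assignment is purely illustrative, and the expected result assumes an 8-bit unsigned char, as on mainstream platforms.

```cpp
#include <cassert>

// Hypothetical helper: returns the value of the assignment expression
// "i = big", where i is an unsigned char (assumed to be 8 bits wide).
int value_of_assignment() {
    int big = 300;
    unsigned char i = 0;
    // Value of the expression: 300 converted to unsigned char.
    // Side effect of the expression: the write to i.
    int v = (i = big);
    assert(v == i);   // the value is what ends up stored in i, not the original 300
    return v;
}
```

With an 8-bit unsigned char, the conversion reduces 300 modulo 256, so the value of the expression is 44 rather than 300.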

Not everything fits into this classification scheme. Calls
to functions are neither the value of the expression nor are
they side-effects (changes to global state, etc.), and yet
they are required to happen.

If the function has a return value, then that is the value of
the expression, and you have to call the function to get it. If
the function is void, then it has a side effect, and must be
called before the next sequence point.

They may, in turn, contain code which has side-effects, and
those side-effects must obey the language's sequencing rules
(in particular, they must appear complete before the function
returns).

Consider

int f() { throw 7; }

int a;
a = f();

In this case, the assignment doesn't happen; the expression
f() has no value.

The expression f() has a value of type int. The compiler is
required to call f() in order to obtain this value, since it is
used by the assignment. In this case, the assignment cannot
take place until f() returns, since the value is not available
until then.
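
A minimal sketch of that case, wrapped in a hypothetical function (a_after_throwing_call is not a name from this thread) so the outcome can be observed:

```cpp
#include <cassert>

int f() { throw 7; }

// Hypothetical helper: returns what a holds after "a = f();" has been
// abandoned because f() exited by throwing.
int a_after_throwing_call() {
    int a = 1;
    try {
        a = f();          // f() never returns, so there is no value to store
    } catch (int thrown) {
        assert(thrown == 7);
    }
    return a;             // still the initial value
}
```

Here nobody disputes the result: the store to a needs the value of f(), and that value never becomes available.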

What actually happens in f() is irrelevant to the analysis of
"a = f();".

And yet executing the code in f() is not
a side-effect. This is much like

p = new Type_With_Throwing_Constructor;

except that the function call is implicit in the latter case.

No, it's not, since in the above case, the compiler knows the
"value" of the new expression before calling the constructor.
It's more like the i = ++j example, where the compiler knows
the value to assign to i before writing it to j.
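
For anyone who wants to probe this on a particular compiler, here is a minimal sketch. The helper p_untouched_after_throw and the use of std::runtime_error are my own assumptions for illustration. Note that an observation on one implementation cannot settle what the C++03 wording permits; it only shows what that implementation does. Under the later C++17 rules (outside the scope of this discussion), the right operand of an assignment is sequenced before the left, so there p should stay null.

```cpp
#include <cassert>
#include <stdexcept>

struct Type_With_Throwing_Constructor {
    Type_With_Throwing_Constructor() { throw std::runtime_error("ctor failed"); }
};

Type_With_Throwing_Constructor* p = nullptr;

// Hypothetical helper: reports whether p is still null after the
// constructor in "p = new ..." has thrown.
bool p_untouched_after_throw() {
    p = nullptr;
    bool caught = false;
    try {
        p = new Type_With_Throwing_Constructor;
    } catch (const std::runtime_error&) {
        caught = true;    // storage was already released by the new expression
    }
    assert(caught);
    return p == nullptr;
}
```

If an implementation wrote the allocator's result to p before running the constructor, this function would return false.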

Can you cite anything from the standard to suggest that function
calls are defined to be "side effects"?

They aren't. Their results are values, which are used in the
expression.

Unless it's trivial (which doesn't really
concern us here), it writes to memory, etc. Those are side
effects; the "value" of an expression has no side effects.

Alf's example illustrates that calling the constructor is
needed in order to know whether the expression has a value.
The value can't be assigned from if it does not exist.

The "value" of a new expression is the pointer returned from the
allocator function.

No; it is the address of a newly created object, according
to the standard. If a constructor throws, there is no such
object, and the new expression does not have a value.

That's an argument I can follow. It would be nice if the
standard actually said so, of course. But I can't find it.

5.3.4/1 says "If the entity is a non-array object, the
new-expression returns [sic] a pointer to the object
created." That seems fairly explicit to me. If no object
is created, there is not a result for the new-expression.

Or do you claim that an object is created even though
the constructor throws?

(The Standard is rather inconsistent in its use of the
term object; in some places it uses a C-like definition
as a region of storage, and in other places it reflects
the C++ object lifecycle's inclusion of constructors and
destructors.)

That's what's bothering me a bit. I rather think you're reading
more into that sentence than was intended. But as written, it
does go a long way in supporting your view. (Now if only it had
said "returns a pointer to the fully constructed object[...]",
there'd be no ambiguity possible.)

The compiler needs to know this in order to
call the constructor.

It is in every way like the expression ++i.

I've talked about the differences. Claiming that they don't
exist doesn't make it so.

There are certainly differences, but not from the standard's
point of view.

Here we disagree (and I've pointed out the differences
which do exist from the standard's point of view, and
tried to explain why they are indeed differences). I
think we may well be talking past each other, which is
a shame as I have great respect for your arguments.

I'll admit that the only difference I can see is with regards to
observable behavior. In the end, I'm very convinced that the
standards (both C and C++) allow side effects to be reordered in
the abstract machine, and not just as a result of the "as if"
rule. And that calling the constructor, per se, is a side
effect.

[...]

It doesn't need to return a value to affect the result of
evaluating an expression. There are plenty of ways in which
even a void-returning function can affect the value of an
expression.

Of the expression in which they are called? Not without
additional sequence points.

Maybe not, but the point to which I was responding was the
apparent claim that the reason why a constructor cannot affect
the value of an expression is that it does not return a value.
That was false. (And, indeed, constructors do introduce
sequence points. We agree on that, I believe; we just don't
agree yet on whether C++2003 orders those sequence points
before the assignment.)

The call to the constructor is definitely a sequence point. As
is the return from the constructor.

Also, the effect of exceptions is perhaps a red herring. You
can get the same effect without exceptions. Just make the
pointer global, initialized with a null pointer, and have the
constructor look at it. Is the constructor guaranteed to see a
null pointer?
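
A minimal sketch of that exception-free variant follows. The names Probe, g and constructor_saw_null are hypothetical. As with any such test, the result shows what one implementation does, not what the C++03 text guarantees.

```cpp
#include <cassert>

struct Probe;
Probe* g = nullptr;       // the global pointer, initialized with a null pointer

struct Probe {
    bool saw_null;
    // The constructor looks at the very pointer being assigned to.
    Probe() : saw_null(g == nullptr) {}
};

// Hypothetical helper: reports what the constructor observed during
// "g = new Probe;".
bool constructor_saw_null() {
    g = nullptr;
    g = new Probe;
    assert(g != nullptr);
    bool result = g->saw_null;
    delete g;
    g = nullptr;
    return result;
}
```

A result of true means the write to g happened only after the constructor ran; false would mean the implementation stored the pointer first.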

What I will say is that I think that the standard should
guarantee this. And that it is "expected" enough that any
implementation would be foolish to violate it, regardless of how
we interpret the standard. And that the final two sentences in
§5.3.4/1, which you cite above, are very close to convincing
me: the value of the new expression is a pointer to the object,
and what is probably meant is a pointer to the fully constructed
object.

[...]

You don't need the "as if" rule. The standard explicitly states
that "side effects" can take place in any order, not necessarily
the order in which the sub-expressions which cause them are
evaluated. And that applies to the abstract machine; the "as
if" rule is not necessary.

Such an extended interpretation of the freedom to rearrange
code would be most problematic; I can see why people are concerned,
if they think implementors would really do such things.

I actually think that it is problematic. In more cases than
just this. But the committee doesn't seem to share my concerns.

(I'd like to see all freedom to reorder removed from the
abstract machine.)

I'm ambivalent on that subject. On the one hand, it reduces
unpredictability and simplifies reasoning about programs.

Not just reasoning. It means that tests actually mean
something, at least with regard to the values with which you
tested. Undefined behavior, for any reason, is anathema to
testing.

On the other hand, the clearest/simplest code usually
sidesteps the issue anyway (excepting some exception-related
issues), and changing this might break backwards compatibility
on some systems where non-portable code (or binary interfaces)
depend on existing order of evaluation.

The question is: are there currently implementations
guaranteeing any specific order today? If so, then we have to
take them into account. (Of course, the "guarantee" might be
implicit. And also of course, even if we do specify something
else, there's nothing to prevent such compilers from offering a
flag which generates the old order.)

The real question, of course, is whether calling the constructor
is a side effect. To be frank, I don't really see how it can be
considered anything else, given the usual meaning of side
effect. Could you elaborate on why it isn't a "side effect"?

I believe I've tried to do so: it is *impossible* to determine
the result of a new expression while ignorant of the body of
a constructor which is used by that new expression.

The result of a new expression is a pointer. You don't need the
constructor to get a valid pointer. (You can't do much with the
pointer until the constructor has run, but it is a valid
pointer.)

But it's not a pointer to 5.3.4/1's "object created" if
the constructor throws, unless you take the C-like view
of what object creation is. (I wouldn't be opposed to a
DR regarding the standard's split personality on this
subject.)

I think that it may be worth it. Although perhaps adding the
words "fully constructed" before "object" would be a small
enough change, and considered consistent with the original
intent, to be taken as an editorial change. In which case, it's
up to Pete.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
