Re: Several questions(tricky maybe) about sequence point&memory model and N2052.

From:

"kanze" <kanze@gabi-soft.fr>

Newsgroups:

comp.lang.c++.moderated

Date:

26 Sep 2006 06:46:47 -0400

Message-ID:

<1159260354.808851.40070@m7g2000cwm.googlegroups.com>

weapon.liu@gmail.com wrote:

1. When the standard says the order of evaluation of arguments
to a function is unspecified, does that mean they could be
overlapped? or just indeterminately ordered?

Can you write a legal program which can tell? The intent is to
allow total reordering of code between sequence points. This is
the motivation for not allowing more than one modification, and
not allowing the use of the modified object other than for
determining its new value. (The intent does recognize that
causation may impose some restrictions with regards to ordering.
Thus, in an expression like "x = x + 5", an implementation
cannot write x before having done the addition.)

There was some discussion at the time of the C standard as well
concerning parallelization, and I believe that the intent was
that the evaluation could be parallelized. It's less certain,
however, since any such discussion at the time would have been
purely theoretical. On the other hand, interleaving the
instructions for two different expressions, e.g. two arguments
to a function, was pretty much standard practice for most
compilers at the time.

This problem occurs to me when I was reading Boehm's memory
model proposal[N2052], in which he suggest using "sequenced
before" instead of "sequence point" and then stipulate the
evaluation order of function arguments as "unsequenced", which
explicitly allow overlapping according to his proposal.

So, my question is, are those two constraints(i.e.
"unsequenced evaluation order" and "unspecified evaluation
order") essentially equal? 'cause I was thinking that
"unsequenced" is looser than "unspecified evaluation order".

The current wording addresses the problem differently. The
abstract machine has a specific order (or several possible
specific orders). However, anything you can do to detect this
order either introduces a sequence point, or is illegal.

I couldn't find any statement about whether or not an
unspecified order allows overlapping in C++98. So, Could any
one give an explaination please?

The trick is to consider what is and is not undefined behavior.
(That's actually the trick for much of the C++ standard.)

P.S. According to another statement of the evaluation order
which says "there's an sequence point after the evaluation of
all the arguments and before the entry of the function", which
obviously says that the value computation and side effect of
the evaluations of the arguments could take place in any
order, even overlapped(?), because there's no sequence point
in-between them, right? so the implementation is free to
re-arrange the order of instructions.

Is free to and probably does.

Plus, according to my understanding of the sequence point, it
doesn't really matter whether or not the evaluations of
several subexpressions between two sequence points (may we now
say "not sequencely related"?) could be overlapped, because
it's just a specification of the semantecs of the abstract
machine. So even though the standard says that they should not
be overlapped, a particular implementation can make them
overlap anyway because "If a side effect on a scalar object is
not sequenced relative to either a different side effect on
the same scalar object, or a value computation using the value
of the same scalar object, the behavior is undefined.[quoted
from N2052]" which guarantees that overlapped evaluation of
the arguments doesn't really chang anything.

Yes. The point of N2052 isn't to change anything with regards
to current practice. The point is to extend current practice to
multithreaded processors, and to make the specification more
precise.

The sentence quoted above, according to my understanding, is
very important, it essentially says that we can view all the
value computations between two sequence points as operating on
immutable variables because otherwise the program will have
undefined behavior( a store operation on some variable used by
a value computation would break the rule). And now that they
operate on immutable variables, we can execute them
disorderly, unsequencedly, overlappingly, or whatever, right?

Exactly.

2. Another question of which I want ensure my judgement is right, the
following call has undefined behavior, right?
f(i,i++); // undefined behavior.

Certainly. The implementation could, for example, parallelize
the read of the first argument, and the write of the second.
Even in cases where simultaneously trying to write and read the
same memory cell would cause the processor to hang. (I don't
know of any processor where this would be the case, but I've
been told that they do exist.)

3. There's a confusing example in N2052, see below:
x++ + increment_x(); // Evaluation order unspecified; x may be
incremented only once

How could x be incremented only only while there's obviously
two "++".

Because of problems in the specification of the current
standard. This is nothing new. Sequence points only introduce
a partial ordering, and even today, that ordering doesn't
guarantee anything between what happens in increment_x and what
happens in the expression on the left side of the addition. A
compiler could, for example, generate code which reads x, then
calls increment_x, then increments the previously read value of
x, and stores the results.

This is currently the case. (And I'll bet you're not the only
person who didn't know it:-).) It's one of the problems with
the current specification that Hans and others are trying to
address. There's no undefined behavior, per se, and you can't
get a reformatted hard disk, or even a core dump. But you can
get some unexpected results.

The only way that can happen is one of the two "++"s is thrown
away.

Yep. That's effectively what happens. The increment in the
function is thrown away.

I guess what the example really wanna say is that "the value
of x may be x+1 instead of x+2", since the compiler would
rearrange the execution order like this:

#assuming x originally is 1
load x into some register say r1 ( here we have r1==1) // the first
1/3 of evaluating x++, the left operand of operator +
increment_x() // this makes the value of x 2, here we have completed
the evaluation of the right operand of operator +
store r1+1 to the memory location of x // this makes the value of x
r1+1, which is 2! not 3.
return r1 as the value computation of the left operand of "+" // we got
1
as a last step, the evaluation of "+" is done here, which gives a value
of 2, which perhaps lots of people assume not.

Given an initial value of x == 1, yes. Some people would be
surprised that the expression could evaluate to 2. Even more
would be surprised that x could have the value of 2 after
executing it.

While we are done, we can see that the value of x is 2, not 3!
While the exact words should be "x is incremented twice but the final
value of x is the value of original x plus 1 so that it's SEEMINGLY
incremented once", of course we could say x is incremented just once,
because the result equals x+1, but that would be confusing and
misleading, at least at first to me( coz I was all "how in the name of
god is that going to happen?" :-) ). So I suggest clarifying this
example by giving more detailed explanation, after all it's a pretty
important example, right?

It seems clear enough to me that he is talking about the
observable results. Still, the paper is still undergoing work;
it probably wouldn't hurt to change the wording so this is
clearer.

And since we are on that, there's a tiny typo in the definition of
increment_x():
int increment_x() { x++; } // here we dropped a "return".
Plus, should "x++ + increment_x(); " has undefined behavior according
to the rule: "If a side effect on a scalar object is not sequenced
relative to either a different side effect on the same scalar object,
or a value computation using the value of the same scalar object, the
behavior is undefined"?

That's really the status quo.

4. If C++98 standard means overlappable when it says unspecified
evaluation order, then why Boehm, in N1944, says below:

[quoted from N1944]
Not every side effect or evaluation can be authoritatively determined
to be either previous to or subsequent to a given sequence point. For
example, given

a = 0;
b = (a = 1, 2*a) + 3*a;

The evaluation of 3*a is not ordered with respect to the sequence point
introduced by the comma operator. It may be previous or subsequent
(NOTE HERE!!); the standard simply doesn't say. Therefore, that
evaluation may or may not be separated from the assignment by a
sequence point.
[quote end]

Why is it "It may be previous or subsequent"?

Because the specification in the current standard is based on a
sequential machine. And he is considering memory loads and
stores here as "single" operations, a single instruction.

The point being, of course, that one of the permitted orderings
in the abstract (sequential) machine introduces undefined
behavior, which means that we can no longer count on anything.
Including ordering.

Again, this is intentional in the current standard. The rules
were carefully designed so that 1) the language could be
specified in terms of an abstract sequential machine, and 2)
anything you could do to detect whether the machine was
sequential or not either requires sequence points or introduced
undefined behavior, so that in practice, implementations are not
required to be sequential except in so far as necessary to
respect the ordering imposed by sequence points.

If unspecified evaluation means "could overlap", then this
should really be "It may be previous or subsequent, or overlap
with the sequence point introduced by the comma op."

In a first approximization, he's talking about the abstract
machine, which is sequential. In a second, he's talking about
basic operations, like load and store, which resume in a single
machine instruction.

When I say "overlap with", I mean like:

load a into some register say r1
(a=1, 2*a)
plus r1 by 3
...

That means the evaluation of "3*a" is _splitted_ by the sequence point
introduced by the comma op.

The ordering I think he's talking about here is at a lower
level; do we read the a in 3*a before or after writing it in the
expression a=1?

Is this ordering allowable by the current standard(98)? If it is, then
would it conflict with 1.9p7:
....called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall
have taken place?

His point here is that the expression has undefined behavior.

5. from clause 5 paragraph 4:
"Between the previous and next sequence point a scalar object shall ...
... .The requirements of this paragraph shall be met for each allowable
ordering of the subexpressions of a full expression; otherwise the
behavior is undefined."

Here, what are these "allowable ordering of the subexpressions"?

Anything that doesn't violate causation, I would suppose. There
are a few exceptions, but the standard clearly says that "Except
where noted, the order of evaluation of operands of individual
operators and subexpressions of individual expressions, and the
order in which side effects take place, is unspecified." So
"except where noted", any ordering is allowable.

According to my recollection, in no place has the current
standard clearly stated which ordering is allowable and which
is not. "Unspecified" is not clear enough to me, especially
when it comes to whether or not it allows overlapping, which
kind of is the core of all the questions I brought up here.

At some point, the standard considers individual operations as
atomic, at least in the abstract machine. And the sentence I
quoted quite clearly says that the order is not specified,
unless otherwise stated. Just what don't you understand about
"unspecified".

6. If unsequenced evaluation(overlapped execution, in
particular) is allowed as said in N2052, would the evaluation
of some simple expression gives unpredicable value, which in
turn makes the behavior undefined? Here's an example:

double d = ... ;
double g(double& d){ return d*=3.12; }

f( ++d, g(d) ); // should this be undefined behavior?

It is, or at least, the results are unspecified. It suffers
from exactly the same problem as the example with x++ +
increment_x().

Now let's just say that we're on such an architecture that
storing a double consists of storing its lower and higher
DWORDs( if sizeof(double) equals that of a qword )
__separately__ . ('coz it seems I have qword-level store
instruction on my x86 pc) So, on such an architecture, the
instructions generated for the function call of 'f' above
would possibly be like this:

load d and compute d+1
store the lower dword of the result into the lower dword memory
location of d, which now makes d half-baked.
evaluate g(d), which will use the half-baked d to compute d*3.12, and
store that into d, making d into a unreasonable/ridiculous state
store the higher dword of the result computed beforehand by "++d" into
the high dword memory location of d, which makes the value of d even
more ridiculous.
call f

So, it seems this clearly should be categorized as undefined
behavior, isn't it?

IIRC, there was a DR on C90 for just this problem. The answer
was that functions are "uninterruptable"; that once the sequence
point on entering a function has occurred, the implementation
was not allowed to execute any other code outside of the
function before the sequence point on leaving the function
occurred.

Of course, that still doesn't help us much here, since exactly
the scenario you describe can occur on storing the results of
the d++: the compiler stores the first half, calls g (which
modifies both halves), and then stores the second half. Which
could easily result in a denormalized value on the processors
I'm familiar with, and potentially in a trapping representation
on some more exotic processor.

But the real question is, which rule in the standard state
clearly that this is undefined behavior, or instead, which
rule in N2052 did? Plus, if the behavior of this example is
undefined, then the "x++ + increment_x(); " example in N2052
would have undefined behavior, too, right?

In the standard, the undefined behavior results from the fact
that sequence points only introduce a partial ordering; the
sequence point on entering a function, for example, defines an
order with respect to the arguments of the function, but not
with respect to other parts of the expression.

7. quoted from N2052:
"Every evaluation in the calling function (including other function
calls) that is not otherwise specifically sequenced before or after the
execution of the body of the called function is indeterminately
sequenced with respect to the execution of the called function."

I think this is over-constrainted, because it will make "++d"
indeterminately sequenced with respect to "g(d)" in "f( ++d, g(d)
);"(see the example above), while actually they're supposed to be
unsequenced. ( if "f( ++d, g(d) );" is not clear enough, how about "++d
+ g(d);" ?)
But I'm not sure about this, correct me if I made any stupid mistake.

I think that that's the intent. Don't give any additional
guarantees that aren't already present. (One could easily argue
about the wisdom of this, but there doesn't seem to be any
consensus for increasing the guarantees.) If it's currently
undefined behavior, it will probably still be undefined in the
next version of the standard.

--
James Kanze GABI Software
Conseils en informatique orient?e objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place S?mard, 78210 St.-Cyr-l'?cole, France, +33 (0)1 30 23 00 34

      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated. First time posters: Do this! ]