Re: Variables in for loop (style issue)

From: Walter Bright <walter@digitalmars-nospamm.com>
Newsgroups: comp.lang.c++.moderated
Date: 16 May 2006 06:02:56 -0400
Message-ID: <yLWdnWHLS-uJSfXZnZ2dnUVZ_sSdnZ2d@comcast.com>

Andrei Alexandrescu (See Website For Email) wrote:

Walter Bright wrote:

The point I was trying to make is that const is of no help in doing loop
invariant optimizations. It is an important point, because many people
(even experts) have mistaken ideas about what const guarantees.

Const can be helpful _today_ wrt loop invariant optimization. If one writes:

const int limit = wuddever();
for (int i = 0; i != limit; ++i) {
...
}

the compiler is entirely and without exception entitled to treat "limit"
as a loop invariant. Any change of "limit" (via a const_cast) has
undefined behavior and as such the compiler doesn't need to cater for
(or worry about) it at all.
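A minimal sketch of the storage-class case described above. The body of wuddever() and the name count_iterations are assumptions made up for illustration; only the loop shape comes from the post.

```cpp
#include <cassert>

// Hypothetical stand-in for the "wuddever()" call in the post.
int wuddever() { return 5; }

// Runs the loop from the post and counts its iterations.
int count_iterations() {
    // `limit` is a const object: after initialization no conforming
    // code can change it, so the compiler may treat it as loop invariant.
    const int limit = wuddever();
    int iterations = 0;
    for (int i = 0; i != limit; ++i) {
        ++iterations;
        // Writing `*const_cast<int*>(&limit) = 0;` here would be
        // undefined behavior, so the optimizer need not allow for it.
    }
    assert(iterations == limit);
    return iterations;
}
```

Calling count_iterations() returns 5 with this stand-in.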

Not only is that true, but D does support const in that manner (i.e.
storage class const). But it's also true that data flow analysis is
pretty good at figuring out if limit can change within the loop even if
it isn't declared const. What is useless for determining loop
invariants is const used as a type modifier, as in pointer to const or
reference to const.

So far I've failed to do so, but I've been working to come up with a
better way to do something like const, one that an advanced optimizer
can take good advantage of. It's a difficult problem,
as there are so many competing interests. If you have some ideas along
those lines, I'd welcome your help.

We've talked about that in private a few times.

Yes, and these conversations have been most interesting and valuable to me.

Constness has properties
along different dimensions. Given an identifier, call it 'id', the
following axes could be const or changeable:

a) The binding between 'id' and what 'id' refers to. If you can change
that, 'id' has sort of Java reference semantics. If not, 'id' has C++
reference, lvalue, or const pointer (not pointer to const) semantics.
The latter is because once defined, 'id' can only refer to a specific
memory location.

So this is "constness of the binding".

b) Given 'id', can I change something reachable _directly_ by
dereferencing id (say id.x or id->x, depending on the language)? We'll
see in a minute why "directly" is important. If I can change id.x, then
I have C++ reference/lvalue/const pointer/nonconst pointer semantics. If
I cannot change id.x, then I have C++ const object/pointer to
const/reference to const semantics. So the property is whether or not I
can change the direct members of whatever 'id' refers to.

Ok, but I wish to point out that:
     int x;
     const int& end = x;
     int* p = &const_cast<int&>(end);
     *p = 3;
is legal, conforming C++ code. This makes a hash out of any useful
information that can be gleaned from current C++ reference to const
semantics.
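The same snippet, wrapped in a function so it can be compiled and checked. The function name is an assumption. The write is well-defined only because x itself was never declared const.

```cpp
#include <cassert>

// Returns the value of x after writing through a pointer obtained
// by casting away the const of a reference to const.
int write_through_const_ref() {
    int x = 0;                        // the object itself is NOT const
    const int& end = x;               // read-only view of x
    int* p = &const_cast<int&>(end);  // legal: x is not a const object
    *p = 3;                           // well-defined, modifies x
    assert(end == 3);                 // the "const" view sees the change
    return x;
}
```

write_through_const_ref() returns 3, so seeing `const int&` in a signature tells an optimizer nothing about whether the referent stays put.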

This is "constness propagation along the direct dereference operation".

c) Given 'id', can I change something reachable _transitively_ by
dereferencing id (say id->x->y->z)? If I can, then I have regular C++
semantics. If I cannot, then I have "deep const" semantics that are
achievable in C++ by writing const accessors.

d) Given 'id', what guarantees are out there about other aliases of
what's transitively reachable from id? Aliasing is not semantically
covered in C++ and can't be simulated without appealing to programmer
discipline.
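A small sketch of the aliasing problem in (d), with no casts involved. The function names (run, aliased_call, distinct_call) are made up for illustration. The loop bound is a reference to const, yet another parameter may refer to the same int, so the bound can change under the loop.

```cpp
#include <cassert>

// `limit` is a reference to const, but `counter` may alias the same
// object, so the compiler has to reload `limit` on every iteration.
int run(const int& limit, int& counter) {
    int iterations = 0;
    for (int i = 0; i < limit; ++i) {
        --counter;       // may change what `limit` refers to
        ++iterations;
    }
    return iterations;
}

// Both parameters refer to the same int: the bound shrinks as we go.
int aliased_call() {
    int n = 10;
    return run(n, n);
}

// Distinct objects: the bound really is invariant.
int distinct_call() {
    int n = 10, other = 0;
    return run(n, other);
}
```

With these inputs aliased_call() returns 5 and distinct_call() returns 10, although run() is identical in both cases.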

These four degrees of freedom are independent. So their interplay
generates quite a few semantics, some of which are more interesting than
the others.

I agree, and I also want to add the property of whether a reference can
escape the current scope or not.

The most sophisticated combination that C++ can create is
with pointers:

const|nonconst T *const|nonconst id = initializer;

The second const (or its absence) controls rebinding (discussed in (a)
above), and the first controls direct field access (discussed in (b)
above).
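The combinations can be spelled out as follows. This is only a sketch; the struct T with a single member x and the function name are assumptions for illustration. The lines that would not compile are left as comments.

```cpp
#include <cassert>

struct T { int x; };

int pointer_const_combinations() {
    T a{1}, b{2};

    const T* p1 = &a;        // first const only: p1->x is read-only
    p1 = &b;                 // OK: rebinding is allowed
    // p1->x = 5;            // error: write through pointer to const

    T* const p2 = &a;        // second const only: binding is fixed
    p2->x = 5;               // OK: direct field access is allowed
    // p2 = &b;              // error: cannot rebind a const pointer

    const T* const p3 = &a;  // both: no rebinding, no writing
    assert(p3->x == 5);      // p3 still observes the write made via p2

    // A plain `T* p4` has neither const and allows both operations.
    return p1->x + p2->x + p3->x;  // 2 + 5 + 5
}
```

The assert on p3 also shows the limit of (b): p3 promises not to write through itself, not that a.x stays unchanged.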

I agree, except that (as I noted above) C++ semantics have botched the
utility of (b), so all we've got in C++ is (a).

C++ does not offer a "transitive const" operator to make const propagate
to cascading dereferences, and as such the qualifier lacks the so-called
closure property. If there were such an extra qualifier, we could write:

transitive const T *const|nonconst id = foo();

Then whatever universe I can reach by transitively dereferencing id (and
whatever that yields etc.), it's a read-only universe: "transitive
const" specifies an access gate that forbids any change.
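For comparison, here is a rough sketch of what C++ offers today: the const-accessor idiom mentioned in (c). The class and member names (Node, Leaf, leaf(), raw()) are hypothetical. raw() is included only to show the default shallow behavior.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Leaf { int z = 0; };

class Node {
public:
    explicit Node(Leaf* leaf) : leaf_(leaf) {}
    // Const accessor: propagates constness to the pointee by hand.
    const Leaf* leaf() const { return leaf_; }
    Leaf* leaf() { return leaf_; }
    // Shallow accessor: const on Node does not reach the pointee.
    Leaf* raw() const { return leaf_; }
private:
    Leaf* leaf_;
};

// Through a const Node, leaf() yields a pointer to const ...
static_assert(std::is_same<decltype(std::declval<const Node&>().leaf()),
                           const Leaf*>::value, "deep by convention");
// ... while raw() still hands out a writable pointer.
static_assert(std::is_same<decltype(std::declval<const Node&>().raw()),
                           Leaf*>::value, "shallow by default");

int shallow_write() {
    Leaf leaf;
    const Node node(&leaf);
    node.raw()->z = 7;       // compiles: const stops at the pointer
    // node.leaf()->z = 7;   // error: pointee is read-only here
    return node.leaf()->z;
}
```

shallow_write() returns 7. The deep behavior holds only as long as every accessor is written with care, which is the programmer discipline a transitive qualifier would replace.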

To make sure I understand you, is this the "deep const" you referred to?

Could as well
describe the only way we could travel into the past without causing
the grandfather paradox :o). Transitive const is a very powerful
guarantee, but not enough for certain uses (such as semantic guarantees
for an optimizer). This is because any parts of the object graph
accessible via id could be accessible via other aliases, as discussed in
(d). Still, transitive const is very powerful because it gives strong
guarantees (e.g., functions taking transitive const arguments can only
change the global state of a system, which can be confined arbitrarily).

Yes.

Now, as far as aliasing goes, things become considerably more
complicated, but a useful property is an all-or-nothing genuine
read-only property. Let's call it readonly. Then we have:

readonly T * id = initializer;

This means that practically whatever id refers to could be etched on a
steel plate because it's never gonna change. It's guaranteed statically
that no write access exists to whatever id points to. Finally, the ironclad:

transitive readonly T * id = initializer;

describes an entire object graph burned in ROM for eternity, having id
and possibly many other access points, all readonly as well.

So we wind up with:

const
readonly
transitive
noalias
unique
auto
volatile
mutable
final

and various combinations of them. The end result is, I fear, mass
confusion on the part of typical programmers instead of the clarity we
were aiming for.

Homework for anyone interested: figure out which automatic conversions
make sense among these qualifiers. Hint: it's not trivial.

Right, and that adds to the problem of understandability of the
language. That's where I get stuck. I'd like to find a way to make
these properties implicit and natural, rather than explicit. For
example, in Fortran, noalias for arrays is implicit, optimizers take
full advantage of it, and programmers have no problems with
understanding it. But when noalias was added to C, it was a disaster,
because essentially nobody outside the compiler implementors could use
it properly.

Even today, ask any C programmer what "restrict" means. You'll get a
glazed stare back. I've never seen anyone use it outside of a test suite
and the C99 standard.

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]