Re: std::copy implementation standard conforming?

From: "Andrei Alexandrescu (See Website For Email)" <SeeWebsiteForEmail@erdani.org>
Newsgroups: comp.lang.c++.moderated
Date: Mon, 25 Jun 2007 02:55:41 CST
Message-ID: <JK5pC9.1upK@beaver.cs.washington.edu>

Greg Herlihy wrote:

On 6/23/07 3:06 AM, in article JK2sBF.AEt@beaver.cs.washington.edu, "Andrei
Alexandrescu (See Website For Email)" <SeeWebsiteForEmail@erdani.org> wrote:

Greg Herlihy wrote:

The entire reason for enabling "checked" routines in the first place is to
find bugs (especially those that cause undefined behavior) in a C++ program.
Therefore, once a checked routine does expose a bug in the user program, it
makes little sense to write code that bypasses the checked routine, as if
concealing the bug somehow fixed it. (Moreover, checked routines are usually
enabled only in debugging builds anyway.)

So the only sensible thing to do in this situation (or any other situation
in which a program's undefined behavior has been made evident) is not to
conceal or dispute the existence of the error, but to fix it.

But the entire point is that the code is not in error. The
implementation is in error (better said, overly conservative) by
assuming that the static type of the array faithfully represents its
dynamic length. This may not be the case in a variety of legal cases.
There is one that doesn't even have a cast in sight:

float a[5][2];
float (&b)[2] = a[0];
...

Copying into b would be legal for up to 10 floats, yet said
implementation would claim only 2 floats could be copied.

No. It is the user program - not the STL - that claims that "b" is a
reference to a two-float array.

This might be a misunderstanding of the subject of the debate. I'm not
discussing claims or intent, or whether code is of good or bad quality,
but instead what's legal and what's not by the letter of the standard.
So we can't just disagree and call it a day.

So let's consider the code:

float a[5][2];
float (&b)[2] = a[0];
float c[4] = { 0 };
std::copy(c, c + 4, b);

Let's figure out whether std::copy can reject this code or not. To do so, we
go to std::copy and read there (section 25.2.1) that std::copy requires
its third argument, here b, to be an OutputIterator.

Then we go to the requirements for OutputIterator (section 24.1.2, table
73) and notice that the type float[2] does not satisfy them. (For
example, it doesn't have a dereference operator.) So already the
implementation is in error. Looking for an appropriate OutputIterator,
float* comes as the only possible candidate, via the float[2] -> float*
implicit conversion.

So by the letter of the standard, by the time the arguments have reached
std::copy, the size information has been lost by necessity.
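A minimal sketch of that decay, assuming C++11 for <type_traits>. The
probes by_value and by_reference are hypothetical declarations, not part
of any library: by_value mimics how std::copy takes its iterators, while
by_reference mimics an overload that binds the array itself (whether any
particular checked implementation is written that way is an assumption
here).

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical probe: takes its argument by value, as std::copy takes its
// iterators. Declaration only; it is used solely inside decltype.
template <class OutputIt>
OutputIt by_value(OutputIt);

// Hypothetical probe: binds the array by reference, so the extent N stays
// visible to the callee. Declaration only.
template <class T, std::size_t N>
std::integral_constant<std::size_t, N> by_reference(T (&)[N]);

// The static type of b in the example above.
using B = float (&)[2];

// By value, the parameter type is deduced as float*: the extent is gone.
constexpr bool extent_lost_by_value =
    std::is_same<decltype(by_value(std::declval<B>())), float*>::value;

// By reference, the callee still sees the static extent, 2.
constexpr std::size_t extent_seen_by_reference =
    decltype(by_reference(std::declval<B>()))::value;
```

Only a signature that names the array type can still see the 2; the
two-iterator-plus-output signature from 25.2.1 cannot.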

Now it would be a tad more tedious to prove that writing b[3] is legal
and the same as writing a[1][1], but I'm sure it can be done. (I tried by
quickly skimming sections 3.7 to 3.9, without success.)

So, if the same program then proceeds to
copy ten floats into this supposedly two-float "b" array, then the
program's action is at odds with its own declaration. So there is a mistake,
either with the number of items being copied or with the declared size
of b's array. And the purpose of a checked routine is simply to expose
contradictions like this one. In this example, the problem is fixed by
declaring "b" accurately:

float a[5][2];
float (&b)[10] = a;

Well, that would need a cast to compile.
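To make the point concrete, here is a small compile-time sketch, assuming
C++11. The names binds_without_cast and same_footprint are mine, chosen
for illustration. It only checks that float[5][2] and float[10] are
unrelated types of equal size; it says nothing about whether accesses
through a reinterpret_cast'ed reference would be defined.

```cpp
#include <type_traits>

// Can a float (&)[10] bind directly to an lvalue of type float[5][2]?
// No: the two array types are unrelated, so the initialization
//     float (&b)[10] = a;
// is rejected unless a cast such as reinterpret_cast<float (&)[10]>(a)
// is written out.
constexpr bool binds_without_cast =
    std::is_convertible<float (&)[5][2], float (&)[10]>::value;

// The two types do occupy the same number of bytes.
constexpr bool same_footprint = sizeof(float[5][2]) == sizeof(float[10]);
```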

Now, the merit of the implementation is that often code does keep the
static type of an object in sync with its dynamic extent. It's also true
that most often copying beyond the static length is an error. However,
staying with the letter of the law (and not its spirit), it can be said
that the STL implementation is in error and disallows code that is correct.

No, the "checked" STL routines seek to break incorrect code that happens to
work - while correct code is unaffected by their presence.

I understand what the checked STL routines are trying to do. All I'm
saying is that, in addition to incorrect code that they disallow, they
also disallow correct code. I agree that the correct code they disallow
is often of questionable quality, but it is 100% correct code by the
letter of the standard.

After all,
type-unsafe operations, such as copying ten items into an array declared to
hold two, might work, but are clearly not correct in terms of C++'s type
system.

Probably we use different meanings for the term "correct". Within this
discussion, my test for whether a piece of code is _correct_ is: go to
the standard and figure out whether it defines the effects of said code.
It's as cut and dried as it gets.

If we define "correct" as "dynamic behavior agrees with the static
typing", then I agree that the invocation above is not correct. It just
turns out that static types don't tell everything about the definedness
of a C++ program.
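An uncontroversial illustration of that last sentence, as a sketch: the
helper fill_through_pointer below is hypothetical. The static type of p
is float*, which carries no length at all, yet writing ten floats through
it is defined because of the object p actually points into. (This
deliberately sidesteps the two-dimensional case debated above.)

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>

// Hypothetical helper: writes ten floats through a pointer whose static
// type says nothing about extent, then returns their sum.
float fill_through_pointer() {
    float storage[10] = {};       // dynamic extent: ten floats
    float* p = storage;           // static type: float*, no extent
    const float src[10] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};

    // Defined because p points at the start of a ten-float array; the
    // type float* alone could not tell us that.
    std::copy(src, src + 10, p);

    const float sum = std::accumulate(storage, storage + 10, 0.0f);
    assert(sum == 10.0f);         // all ten writes landed in storage
    return sum;
}
```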

Bottom line: the STL implementation that disallows the code above
disallows correct code and is not conforming. However, I think it's good
to lose some degree of conformance for the sake of weeding out a
comparably large body of erroneous code.

Now, there might seem to be little point in fixing errors in code that
works. But even though such errors might not affect the behavior of the
current program, they do pose a longer-term risk to both the
maintainability and the modularity of the program and its code.

I understand maintainability as the difficulty of understanding and
properly modifying tricky code. But what does modularity have to do with
all this?

Andrei
