Re: Read a file line by line and write each line to a file based on the 5th byte

From: James Kanze <james.kanze@gmail.com>
Newsgroups: comp.lang.c++
Date: Sat, 16 May 2009 01:46:44 -0700 (PDT)
Message-ID: <f32cfd8f-3e4d-40ca-aeac-e409e34b9482@o30g2000vbc.googlegroups.com>

On May 15, 11:44 pm, "Alf P. Steinbach" <al...@start.no> wrote:

* James Kanze:

On May 15, 3:21 pm, "Alf P. Steinbach" <al...@start.no> wrote:

However, I think that when the intention is to guarantee
crash behavior then it shouldn't be guaranteed via some
implied usage of proper compiler option and restriction to
a compiler that supports it.

And further, the possibility of having [] yield crash
behavior through some compiler specific means is really not
an argument in favor of [], since at least in principle the
same can be done for 'at'. It's not like compilers are
somehow prevented from offering non-conforming features
such as crashing 'at', and it's not like one is relying on
the standard when advocating []: one is then relying on
very much in-practice tool usage, not guaranteed anywhere.
So just saying that [] is preferable to 'at' is misleading
because it compares a /customized/ version of [] to the
default 'at', while the proper comparison is IMHO between
either /default/ raw indexing [] versus 'at', where 'at'
wins handily wrt. bug detection, or between /customized/
[] versus customized 'at', where there's no difference --
so, wrt. this, 'at' is either better, or the same.

A compiler cannot crash if there is a bounds error in at(),
because the standard says exactly what it should do.

Re-quoting myself from above, "It's not like compilers are
somehow prevented from offering non-conforming features".

Yes, but we're not supposed to talk about them here:-).
Seriously, if conformance isn't an issue, the compiler might not
have the at() function anyway. (I seem to recall some very
early implementations which didn't.)

Consider for example the MSVC treatment of a "throw()"
specification...

In short, when you're in compiler-specific land, that's where
you are at. :-)

There are almost certainly programs which depend on it,

For those programs, don't ask the compiler to use non-compliant
crashing 'at'.

With separate programs, your example, it's simple.

However, with g++, how do you compile one part of the program
with checking behavior of [], when some other part is an
object file or lib compiled with non-checking []?

You don't. The results will core dump.

In general, you don't compile different parts of the program
with different options. Some options are harmless (e.g. warning
levels), but a number of them will break binary compatibility.
(Personally, I find this horrible, but that's the way it is.)

This is a rhetorical question. It's my intention that instead
of answering the question literally (involving source code
changes), you compare it to your own argument regarding 'at',
and note that for [] it's more serious.

I must be missing something, but I don't see how it's relevant
for either.

Hence, given the above consideration, combined with the apparent
complete lack of compilers that implement the crashing 'at' :-),
plus the fact that not all compilers in common use support a
range-checking [] (note that MSVC 7.1 does not), my
suggestion is to use an alternative notation, like some
indexing routine or macro.

Agreed. Fundamentally, you have the choice of using at(), and
getting a standard defined exception, or using [], doing your
own bounds checking beforehand, and getting anything you want.
When the exact exception that at() generates is the appropriate
behavior (which in my experience, is almost never the case), use
at(). In all other cases, use [] and your own checking.
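To make the choice concrete, here is a minimal sketch of the three options, assuming a std::vector<int>. The names viaAt, viaChecked, viaAsserted and IndexError are hypothetical, invented for illustration; they are not from anyone's code in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical application-specific error type (an assumption for
// this sketch, not part of any library).
struct IndexError : std::logic_error
{
    explicit IndexError( std::string const& what )
        : std::logic_error( what ) {}
};

// Option 1: at() reports a bad index with the standard-defined
// exception, std::out_of_range. No other reaction is possible.
inline int viaAt( std::vector<int> const& v, std::size_t i )
{
    return v.at( i );
}

// Option 2: check first, react however the application wants
// (here: an application-specific exception), then use [].
inline int viaChecked( std::vector<int> const& v, std::size_t i )
{
    if ( i >= v.size() ) {
        throw IndexError( "index out of range" );
    }
    return v[i];
}

// Option 3: check with assert, so a bad index aborts the program
// (unless NDEBUG is defined), then use [].
inline int viaAsserted( std::vector<int> const& v, std::size_t i )
{
    assert( i < v.size() );
    return v[i];
}
```

The only difference between the three is what happens on a bad index; with a valid index they all return the same element.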

(FWIW: in pre-standard days, when I was designing my own
containers, I actually started by designing a callback
mechanism. It aborted by default, but the user could set it to
do pretty much whatever he wanted: throw an application specific
exception, return a default value, etc. In the end, I dropped
it, because it made the interface too heavy---to be really
useful, you'd need different callbacks for different contexts.)
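A rough sketch of what such a callback mechanism might look like follows. This is a reconstruction for illustration only, not the original pre-standard code; every name in it (BoundsHandler, setBoundsHandler, checkedGet and the three handlers) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <vector>

// A handler is called with the offending index and the container size.
typedef void (*BoundsHandler)( std::size_t index, std::size_t size );

// Default reaction: abort the program.
inline void abortOnBoundsError( std::size_t, std::size_t )
{
    std::abort();
}

// Alternative reactions the user could install.
inline void throwOnBoundsError( std::size_t, std::size_t )
{
    throw std::out_of_range( "checkedGet: index out of range" );
}

inline void ignoreBoundsError( std::size_t, std::size_t )
{
    // Returning normally makes checkedGet yield its fallback value.
}

// Single process-wide handler, initialised to the aborting default.
inline BoundsHandler& currentHandler()
{
    static BoundsHandler handler = &abortOnBoundsError;
    return handler;
}

// Installs a new handler and returns the previous one.
inline BoundsHandler setBoundsHandler( BoundsHandler newHandler )
{
    BoundsHandler old = currentHandler();
    currentHandler() = newHandler;
    return old;
}

// Checked read access: on a bad index, call the handler; if the
// handler returns, hand back the caller-supplied fallback value.
template< typename T >
T checkedGet( std::vector<T> const& v, std::size_t i,
              T const& fallback = T() )
{
    if ( i >= v.size() ) {
        currentHandler()( i, v.size() );  // may abort or throw
        return fallback;
    }
    return v[i];
}
```

Note that the sketch has exactly the weakness described: one global handler serves every call site, so code that wants a default value in one context and an abort in another has to keep swapping handlers.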

<example>
// This program intentionally has Undefined Behavior:
// arbitrary result or e.g. a crash.
#include <iostream>
#include <string>
#include <stddef.h> // ptrdiff_t
#include <assert.h>

typedef ptrdiff_t Size;
typedef ptrdiff_t Index;

template< typename C >
Size nElements( C const& c ) { return c.size(); }

template< typename C >
typename C::value_type& operator^( C& c, Index i )
{
     assert( 0 <= i && i < nElements( c ) );
     return c[i];
}

template< typename C >
typename C::value_type const& operator^( C const& c, Index i )
{
     assert( 0 <= i && i < nElements( c ) );
     return c[i];
}

int main()
{
     using namespace std;
     string const s = "Blah blah...";
     cout << "'" << (s^43) << "'" << endl;
}

</example>

Hm, I'd prefer @, as (I think) it is in Smalltalk, but there's
no such operator in C++...

I'd prefer making it a wrapper class, and using []. But the
basic idea is sound.
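A minimal sketch of the wrapper-class variant, assuming it simply owns a std::vector and asserts before indexing. The class name CheckedVector and its interface are hypothetical, and only the members needed to show the idea are included.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: same [] notation as the raw container, but
// every access is checked with assert first.
template< typename T >
class CheckedVector
{
public:
    explicit CheckedVector( std::size_t n = 0, T const& init = T() )
        : myData( n, init )
    {
    }

    std::size_t size() const { return myData.size(); }

    T& operator[]( std::size_t i )
    {
        assert( i < myData.size() );
        return myData[i];
    }

    T const& operator[]( std::size_t i ) const
    {
        assert( i < myData.size() );
        return myData[i];
    }

private:
    std::vector<T> myData;
};
```

As written, the non-const operator[] would not compile for T = bool, since std::vector<bool>::operator[] returns a proxy object rather than a bool&. That is one instance of the value-producing [] problem.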

Also, the % operator has better precedence, but is less
mnemonic/readable. And there is the problem of a container
with a value-producing []. That's a thorny one, but it's late,
and I leave the thinking to you (there must surely be a
practical solution, if not TMP auto-magic then just
specialization).

But, anyway, for the novice I just recommend 'at', and I think
it's a disservice to them to recommend [] (even though it
might in practice be the better choice for the professional)
because it's tool specific and not necessarily available.

For the novice, I'd recommend getting a good implementation,
which crashes. Given that this is the case with the (free)
up-to-date implementations from Microsoft and g++, there's no
real reason not to *for learning*. (Professionally, we don't
always have a choice of compilers we're using. For learning,
I'd say that you should be using either g++ 4.0 and up or the latest
VC++. Or Comeau, with the Dinkumware library, which is the same
as the one used in VC++.)

Not just for reasons of having a [] which crashes. (I'm less
sure about VC++, but pre-4.0 g++ didn't have fully standard name
look-up.)

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34