Re: inconsistencies when compiling

From: James Kanze <james.kanze@gmail.com>
Newsgroups: comp.lang.c++
Date: Sun, 27 Jan 2008 04:30:42 -0800 (PST)
Message-ID: <6aec2eb4-8ab4-4e50-b94e-3194bf4433ed@e23g2000prf.googlegroups.com>

Jerry Coffin wrote:

In article <2aa28c68-4c15-429f-83ba-8dc900a0c691
@f47g2000hsd.googlegroups.com>, james.kanze@gmail.com says...

[ ... ]

Do you know, I rarely use either. Once in a blue moon, I'll use
the first to fill an std::vector< Line >, or some such, with
the >> operator for Line which does all the work, and I'll
occasionally use the second for a small set of elements. But
both cases are exceptional. For most simple cases, the input
is:

    while ( std::getline( in, line ) ) {
        ++ lineNo ;
        boost::smatch fields ;
        if ( ! boost::regex_match( line, fields, syntax ) ) {
            // error...
        } else {
            // process fields...
        }
    }

It looks to me like this could be turned into a form that would fit
something similar to the previous patterns quite easily:

struct matched_line {
// probably want a ctor that specifies the syntax and such...
    boost::smatch fields;
};

struct process_fields {
    void operator()(matched_line const &) {
    // ...
    }
};

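std::istream &operator>>(std::istream &is, matched_line &m) {
    std::string line;
    std::getline(is, line);
    // syntax: the boost::regex describing a valid line, assumed visible here.
    // Note that m.fields holds iterators into line, so real code would
    // have to keep the string alive (e.g. as a member of matched_line).
    if (!boost::regex_match(line, m.fields, syntax))
        is.setstate(std::ios::failbit);
    return is;
}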

std::transform(std::istream_iterator<matched_line>(in),
    std::istream_iterator<matched_line>(),
    std::inserter(output_sink, output_sink.end()),
    process_fields());

If you're summarizing the input, and producing basically a single output
at the end, you generally want to use std::accumulate instead of
std::transform.
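As a rough illustration of the std::accumulate variant, here is a minimal sketch. The Record type, its two fields, and the names add_value and sum_values are hypothetical, chosen only for the example; the >> operator reads one line per record and sets failbit on a malformed line, as in the snippets above.

```cpp
#include <iostream>
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>

// Hypothetical record: one "name value" pair per input line.
struct Record {
    std::string name;
    double value = 0.0;
};

// Reads exactly one line and parses it. A malformed line (missing field,
// bad number, or trailing text) sets failbit on the outer stream.
std::istream& operator>>(std::istream& is, Record& r) {
    std::string line;
    if (std::getline(is, line)) {
        std::istringstream data(line);
        Record tmp;
        std::string extra;
        if ((data >> tmp.name >> tmp.value) && !(data >> extra)) {
            r = tmp;
        } else {
            is.setstate(std::ios::failbit);
        }
    }
    return is;
}

// Binary operation for std::accumulate: folds one record into the total.
double add_value(double total, Record const& r) {
    return total + r.value;
}

// Sums the value field of every record, stopping at end of file or at the
// first malformed line (the iterator becomes the end iterator on failure).
double sum_values(std::istream& in) {
    return std::accumulate(std::istream_iterator<Record>(in),
                           std::istream_iterator<Record>(),
                           0.0, add_value);
}
```

Note that a bad line silently ends the summation here; distinguishing "bad line" from "end of file" afterwards means checking in.eof() on the stream.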

Certainly. You *can* hide all of the program in a few fancy >>
or << operators. Something like "aligneq" (see
http://kanze.james.neuf.fr/code-en.html, then navigate through
the sources in the Exec branch), where main() basically just
does:

    std::vector< Line > lines ;
    std::copy( std::istream_iterator< Line >( source ),
               std::istream_iterator< Line >(),
               std::back_inserter( lines ) ) ;
    std::copy( lines.begin(), lines.end(),
               std::ostream_iterator< Line >( std::cout ) ) ;

once it's parsed the options. (In this case, it's definitely
necessary to save all of the data in a vector, since the output
formatting depends on data collected over the input
data---things like the length of each field.)
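To make the point about needing the whole input before formatting concrete, here is a small sketch. It is not the actual aligneq code: the Line type below (text before and after the first '='), the align function, and the rule of rejecting lines without '=' are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical Line: the text before and after the first '='.
struct Line {
    std::string lhs;
    std::string rhs;
};

// Reads one line and splits it at the first '='; a line without '='
// sets failbit, which ends the istream_iterator range.
std::istream& operator>>(std::istream& is, Line& l) {
    std::string text;
    if (std::getline(is, text)) {
        std::string::size_type eq = text.find('=');
        if (eq == std::string::npos) {
            is.setstate(std::ios::failbit);
        } else {
            l.lhs = text.substr(0, eq);
            l.rhs = text.substr(eq + 1);
        }
    }
    return is;
}

// Copies every line into a vector first: the column width is only known
// once all of the input has been seen.
void align(std::istream& in, std::ostream& out) {
    std::vector<Line> lines;
    std::copy(std::istream_iterator<Line>(in),
              std::istream_iterator<Line>(),
              std::back_inserter(lines));

    std::size_t width = 0;
    for (Line const& l : lines) {
        width = std::max(width, l.lhs.size());
    }
    for (Line const& l : lines) {
        out << std::left << std::setw(static_cast<int>(width))
            << l.lhs << '=' << l.rhs << '\n';
    }
}
```

Given the two lines "a=1" and "longer=2", the first is padded so that both '=' signs end up in the same column.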

One thing at a time, though. And except for special cases, I'm
not sure that this isn't obfuscation. To tell the truth, I'm
not even sure that it isn't obfuscation here. But it was fun to
write, and I've found it quite easy to modify, adding additional
options as time goes on. But I'm not sure that it's a good
general solution---it works well here because the output is a
direct line by line mapping of the input. (And even here, the
Line class collects a lot of additional data during input, which
is used in output.)

or:

    while ( std::getline( in, line ) ) {
        std::istringstream data( line ) ;
        data >> field1 ... >> fieldLast >> std::ws ;
        if ( ! data || data.get() != EOF ) {
            // error...
        } else {
            // process fields ...
        }
    }

This looks much like the previous example to me

It is.

-- you seem to be
reading fields that make up a record, so I'd make that explicit:

struct record {
    type1 field1;
    // ...
    typeLast fieldLast;
// ...
};

std::istream &operator>>(std::istream &is, record & r) {
    std::string line;

    std::getline(is, line);
    std::istringstream data(line);
    data >> r.field1 >> /* field2, ... */ r.fieldLast >> std::ws;
    // check for errors, etc.
    return is;
}

Then the processing turns into something like:

std::transform(std::istream_iterator<record>(some_istream),
    std::istream_iterator<record>(),
    std::inserter(results, results.end()),
    process_fields());

As above, if you're (primarily) producing a summary of the
data, std::accumulate can make more sense than std::transform.

While it's true that these will often (usually?) be at least
marginally larger in terms of lines of code, it's also true
that most of the extra code is more or less boilerplate --
extra class and function definition "stuff" surrounding
roughly the same bits of real code.

I think it depends somewhat on the context. If it makes sense
for the parsed data to be a single class, then I'll go this way;
if it doesn't, then I probably won't. The choice of whether
there is a ParsedData class or not is made at a higher level,
according to the design of the application, and I rather think
introducing it only to be able to use istream_iterator is a bit
of obfuscation.

My own experience with this has been quite positive -- typing in the
code doesn't take significantly longer, and at least for me, debugging
is reduced substantially. Likewise, I find that the separation of
responsibility makes incomplete analysis more apparent. When the code
does need modification, the separation of responsibility generally makes
it much easier to find what to modify where, and the defined interfaces
between the pieces make it easier to isolate the modification to what's
really desired.

My own experience has been varied, as I said. Often,
introducing something along the lines of a ParsedLine class is a
good idea, and in such cases, using istream_iterator is more or
less natural. In other cases, it's more a case of forcing the
design to fit a pre-conceived implementation technique, which
generally isn't a good idea.

It is, at any rate, useful enough that it should be presented in
any presentation of iostream. But I'd still present the basic
loop construct first.
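For reference, a self-contained version of the basic loop might look like the sketch below. It uses std::regex (C++11) in place of boost::regex so that it compiles without Boost; the "key = value" syntax, the ParseResult type, and the parse function are hypothetical, standing in for whatever the application really needs.

```cpp
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type: the keys of the good lines, plus the
// numbers of the lines that did not match the syntax.
struct ParseResult {
    std::vector<std::string> keys;
    std::vector<int> badLines;
};

ParseResult parse(std::istream& in) {
    // Assumed syntax for the example: "key = value", optional white space.
    static std::regex const syntax(R"(\s*(\w+)\s*=\s*(\S+)\s*)");

    ParseResult result;
    std::string line;
    int lineNo = 0;
    while (std::getline(in, line)) {
        ++lineNo;
        std::smatch fields;
        if (!std::regex_match(line, fields, syntax)) {
            // error: remember where, and carry on with the next line
            result.badLines.push_back(lineNo);
        } else {
            // process fields: here, just keep the key
            result.keys.push_back(fields[1].str());
        }
    }
    return result;
}
```

Unlike the istream_iterator forms, the loop has the line number at hand for error messages and can continue after a bad line without having to clear the stream state.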

[ ... ]

But I was actually suggesting a description at a slightly lower
level, the basic loop, for example, but not what you do in it.

Well, I'll openly admit I was partly being humorous.
Nonetheless, there was a serious point: much of the time, you
can use a standard algorithm, so you don't need to write a
loop at all.

At any rate, it caused me to reflect some. To the point where
if I ever do find the time to write up such an article, I'll
certainly mention this possibility.