Re: What is the output of this program?

From:

James Kanze <kanze.james@neuf.fr>

Newsgroups:

comp.lang.c++.moderated

Date:

17 Jul 2006 15:39:55 -0400

Message-ID:

<e9e68h$50v$1@nntp.aioe.org>

Alf P. Steinbach wrote:

* James Kanze:

Alf P. Steinbach wrote:

* James Kanze:

Alf P. Steinbach wrote:

* kanze:

Earl Purple wrote:

Alf P. Steinbach wrote:

push_back can be a tad inefficient on strings, because
strings are usually short, and for short sequences the
reallocations, assuming scaling of buffer size, are more
frequent and thus more costly.

I don't think it's the allocations that make push_back
inefficient, (you can do reserve() ahead and std::string
will often allocate enough anyway). What makes it
inefficient is that push_back has to do a bounds-check on
every call (how else will it know whether to reallocate?).
Much more efficient is resize().
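
For readers following the thread, the two approaches being compared can be sketched roughly as below. This is a minimal illustration, not the program posted earlier in the thread (that one was snipped); the function names `build_with_push_back` and `build_with_resize` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Grow the string one character at a time. Each push_back call has to
// check whether the current capacity is still sufficient.
std::string build_with_push_back(std::size_t n, char c)
{
    std::string s;
    s.reserve(n);                  // one allocation up front
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back(c);
    }
    assert(s.size() == n);
    return s;
}

// Set the final size once, then overwrite the characters in place.
std::string build_with_resize(std::size_t n, char c)
{
    std::string s;
    s.resize(n);                   // n characters, each initialised to '\0'
    for (std::size_t i = 0; i < n; ++i) {
        s[i] = c;                  // operator[] does no bounds check
    }
    assert(s.size() == n);
    return s;
}
```

Both functions return the same string; the disagreement in the thread is only about how much the difference in cost matters.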

I doubt it. I also doubt that either make a significant
difference.

Possibly you're talking about something entirely different
than the following program?

[snip]

point where even if one version were ten times faster than the
other, I can't see it making significant difference in an actual
program.

Oh, come on. One order of magnitude is significant.

One order of magnitude in the execution time of the complete
application is significant. Whether it takes me a millisecond
or 10 milliseconds to format a message that I output only a
couple of times a minute (or less) is not significant. And in
an awful lot of programs, the only thing you're doing with
std::string is formatting output messages (which aren't that
frequent) or parsing the configuration file (once on start up).

So the earlier assertion, "I also doubt that either make a
significant difference", was not meant to state that there was
no significant difference, but really that in some context of
your choosing, there would be no significant difference on
some measurement of your choosing.

The assertion was that the choice would make no significant
difference in terms of total run-time of the application. If
that wasn't evident, apparently I didn't express myself well.

That's meaningless.

Whether something makes a significant difference in the
perceived runtime of an application is NOT meaningless. Whether
something makes a difference as to whether the application meets
its performance requirements or not is NOT meaningless.

IMHO, they are, in fact, the only effects of runtime which are
meaningful.

One such context is to put the code in a function that's never
called, ever, where the measurement is anything else than
lines of code.

I maintain, as I wrote, that "push_back can be a tad
inefficient on strings".

Did you? What I read was: "Possibly you're talking about
something entirely different."

If all you are saying is that push_back can, in certain
implementations, be less efficient than resize, fine. I would
certainly agree that if you have a performance problem, and the
profiler shows that it is in push_back, replacing it with resize
is an option worth considering. Or vice versa, for that matter.
Your measurements suggest that if you do have a performance
problem in push_back, at least in some implementations, it might
be a significant win to change to resize. I couldn't reproduce
them, but that doesn't mean much; I probably did my tests on a
completely different architecture (Sun Sparc). (And of course,
I DID do the first, instinctive optimization that one would try;
I used reserve.)
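
Since the results evidently differ between platforms, the comparison is worth repeating locally. The following is a rough timing sketch under stated assumptions, not the benchmark used by either poster: `Timing`, `time_build`, `with_push_back` and `with_resize` are hypothetical names, and a simple loop like this only gives a coarse indication.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

struct Timing {
    double milliseconds;      // elapsed wall-clock time for all repeats
    std::size_t total_chars;  // sum of the sizes of all strings built
};

// Call build(n) `repeats` times and measure the elapsed time.
// Returning the character total gives the caller something to check, so the
// compiler is less likely to discard the work being timed.
template <typename Build>
Timing time_build(Build build, std::size_t n, int repeats)
{
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    std::size_t total = 0;
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i) {
        total += build(n).size();
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return Timing{elapsed.count(), total};
}

// The two candidates, both filling a string with n copies of 'x'.
std::string with_push_back(std::size_t n)
{
    std::string s;
    s.reserve(n);
    for (std::size_t i = 0; i < n; ++i) s.push_back('x');
    return s;
}

std::string with_resize(std::size_t n)
{
    std::string s;
    s.resize(n);
    for (std::size_t i = 0; i < n; ++i) s[i] = 'x';
    return s;
}
```

Calling `time_build(with_push_back, 16, 1000000)` and `time_build(with_resize, 16, 1000000)` and comparing the `milliseconds` fields gives a first impression; whether that difference shows up in a real application is exactly what a profiler has to answer.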

And I submit my earlier little demonstration program as easily
reproducible evidence.

Evidence of what? That using resize instead of push_back will
make a significant difference in the runtime of my application?
You know better than that; you don't even know my application.

Also, when you can do something efficiently at about the same
cost in programmer time as inefficiently, it's not reasonable
to choose the inefficient way, then measure performance and so
on to determine whether it has to be changed to the efficient
way you could have used in the first place.

It's never about the same cost in programmer time. For whatever
reasons, push_back is the standard idiom for growing strings.
Using reserve up front is reasonable if you know the probable
length and you suspect that performance may be an issue (but I
suspect that reserve is used more often than necessary).
push_back is also more robust;
there's no risk of misguessing the final length. Anything else
is extra programmer effort.
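
The robustness point can be illustrated with a case where the final length is only guessed. This is a sketch under assumptions: the functions `field_with_reserve` and `field_with_resize` and the ';' delimiter are invented for the example and do not come from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copy characters up to (not including) the first ';'.
// reserve() is only a capacity hint, so a wrong guess cannot change the result.
std::string field_with_reserve(const std::string& input, std::size_t guess)
{
    std::string out;
    out.reserve(guess);
    for (char c : input) {
        if (c == ';') break;
        out.push_back(c);
    }
    return out;
}

// The resize() variant has to track the used length itself, grow by hand
// when the guess was too small, and trim when the guess was too large.
std::string field_with_resize(const std::string& input, std::size_t guess)
{
    std::string out;
    out.resize(guess);
    std::size_t used = 0;
    for (char c : input) {
        if (c == ';') break;
        if (used == out.size()) {
            out.resize(out.size() * 2 + 1);   // guess was too small
        }
        out[used++] = c;
    }
    assert(used <= out.size());
    out.resize(used);                         // drop the unused tail
    return out;
}
```

Forgetting either the growth check or the final trim in the second version gives out-of-bounds writes or trailing '\0' characters; the first version has no equivalent failure mode.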

If you just avoid the hugely inefficient operations, and also
refrain from quadratic time algorithms and the like (and
worse), then likely the application will be fast enough and
consume so little memory that the performance analysis &
associated detective work, changes & retesting becomes
unnecessary.

Refraining from quadratic algorithms when you know that the data
sets are going to be large is just good sense. Choosing one
standard function over another, even though it is less robust,
because it will probably gain you 20% (or more---but on my
systems, it's about 20%), in code that you don't know whether it
will be in the critical path or not, isn't.

--
James Kanze kanze.james@neuf.fr
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34
