Re: equivalent of realloc in C++
On Mar 28, 7:17 pm, Andrei Alexandrescu
<SeeWebsiteForEm...@erdani.org> wrote:
> There is a large amount of confirmation bias
> (http://www.sciencedaily.com/articles/c/confirmation_bias.htm) in this
> thread, and I've considered leaving the discussion alone more than once.
What you mistake for "confirmation bias" is simply healthy, scientific
skepticism. After all, the efficacy of realloc() as an optimization is
a question best answered by scientific - not statistical - methods. In
other words, realloc() will not become an optimization when the
percentage of people who believe that it is - crosses some particular
threshold.
As this thread demonstrates, there are many individuals who are
(justifiably) leery of benchmarks that purport to demonstrate
particular optimizations - but do so without any kind of supporting
theoretical framework to account for the alleged improvements. And
without such a theory, there is no way to quantify the magnitude of
the improvement expected (a quantification that then could be verified
by subsequent, independent tests.)
In this case, no one has enunciated any theory that would explain how
realloc() - which can add as much cost as it saves in individual cases
- would nonetheless save costs when applied across the board. Nor has
anyone quantified how much lower the cost of inserting an item into a
std::vector (which currently has a fixed, amortized cost) would become
- were a std::vector to adopt realloc().
> Howard Hinnant was even wiser and just didn't intervene because he has
> failed before with a formal proposal, so what's the chance of convincing
> anyone on the Usenet? What seems to be happening is that people start
> from the notion that in-place memory expansion would be unhelpful
Fortunately, scientific questions are not decided by majority vote.
The beauty (and power) of the scientific method is that a single
individual - equipped with a sound theory - is all that is needed to
change the majority of other people's minds.
Now, anyone who sets out to show that realloc() would provide a net
benefit does face a steep, uphill climb. For one, the cost of
inserting an item into a std::vector is currently a constant
(amortized) cost. Meaning that for each item inserted, a std::vector
will (eventually) have to perform at most one copy operation.
Therefore, to do better than a std::vector, a realloc-ing vector would
have to guarantee at least a diminishing, amortized cost. That is,
each item insertion would have to result in an ever-decreasing number
of eventual copy operations; say, 1.0 for the first item inserted,
0.95 for the twentieth, 0.9 for the fortieth and so forth. The problem
here is that realloc()'s efficacy diminishes as the vector grows.
Intuitively, the average amount of free memory adjacent to an
allocated block decreases in relative size as the size of the
allocated block - increases. So the question to answer is: how will
realloc() be able to lower the costs of insertion continually - even
as the amount of memory it can expand into becomes, on average, less
and less significant?
> "because if it were someone would have introduced it in C++ by now" (a
> form of trust in authority), and work back from there. Even after I and
> others pointed out the fallaciousness of such arguments, there was no
> shortage of suppositions pulled out of thin air describing entirely
> hypothetical scenarios, one less ingenious than the other.
Until someone describes the algorithm with which a std::vector would
use realloc() to improve its performance - there is nothing to test.
> So I did what I probably should have done earlier - sat down and wrote
> some code. This first little program only loads lines from a file:
>
> #include <iostream>
> #include <string>
> #include <vector>
> using namespace std;
>
> int main()
> {
>     vector<string> vec;
>     while ((vec.push_back(string()), getline(cin, vec.back())))
>     {
>     }
>     vec.pop_back();
> }
> Then I wrote a program that does the same task, except it uses realloc
> exclusively. I don't have access to an in-place expansion routine on
> Linux (is there one?), so I wrote an implementation that is suboptimal -
> it calls realloc *every single pass through the loop*. Pretty silly, eh?
Yes, it is silly to try to demonstrate that a resizing algorithm that
has a linear cost will somehow be cheaper than one with a fixed,
amortized cost - all else being equal (which is not the case here). In
fact, the only way that this program's realloc-ing vector could keep
pace with a std::vector would be if the cost of each individual "copy"
operation were much cheaper for the realloc-ing vector than they are
for the std::vector (and they are: block memory transfers are in
general vastly faster than per-object copying).
In contrast, if both programs were held to the same standard (that is,
if both vectors were required to preserve their contents after
resizing), then both implementations could use the same copy operation
and the comparison between the two would have at least some measure of
validity and usefulness.
> #include <cstdlib>   // for realloc
> #include <iostream>
> #include <new>       // for placement new
> #include <string>
> using namespace std;
>
> int main()
> {
>     string * vec = 0;
>     size_t vecSize = 0;
>     for (;; ++vecSize)
>     {
>         vec = static_cast<string*>
>             (realloc(vec, (vecSize + 1) * sizeof(string)));
>         if (!vec) return 1;
>         new(vec + vecSize) string;
>         if (!getline(cin, vec[vecSize])) break;
>     }
> }
> This code is not well-defined because at least in theory string could
> hold internal pointers, so realloc is not guaranteed to work. On Gnu it
> doesn't, so the code works for the sake of the measurements of interest
> (not that I'd recommend it).
>
> The test reveals a 6.5% relative improvement in average run time for the
> naive realloc-based implementation (I ignored the first run and averaged
> over the next 5 runs, which yield a 3.205s average runtime for realloc
> and a 3.430s average runtime for vector). Both programs were compiled
> with -O3.
I don't know many programmers who would welcome a 6.5% speedup in
vector re-sizing if it were to leave the contents of the vector 100%
unusable. In fact, if this tradeoff is reasonable, then why does the
std::vector program perform any I/O at all? Certainly, filling the
vector with empty strings would speed up the program considerably. Why
is one program held to a much stricter standard than the other?
In reality, there should not be any surprise that a content-destroying
vector would outperform a content-preserving vector. The only surprise
is how small the difference is. What keeps the realloc-ing vector from
performing better is the larger number of extra "copy" operations
that it must perform - extra work that is slightly offset by the
relatively low cost of the block memory transfers themselves. In any
event, this benchmark program does nothing to support the claim that
realloc() would provide any kind of benefit to a std::vector. About
the only conclusion to be drawn here is that std::vector's performance
could be dramatically improved - if only the vector were not required
to preserve its contents after resizing.
To even the playing field and to make a somewhat reasonable
comparison, both programs must work correctly - and both must preserve
the contents of each container, intact. Replacing the std::string with
a POD-struct would help:
#include <cstdlib>   // for realloc
#include <iostream>
#include <new>       // for placement new
using namespace std;

struct S
{
    char s[7];
};

int main()
{
    S * vec = 0;
    size_t vecSize = 0;
    for (;; ++vecSize)
    {
        vec = static_cast<S*>(realloc(vec, (vecSize + 1) * sizeof(S)));
        if (!vec) return 1;
        new(vec + vecSize) S;
        if (!cin.getline(&vec[vecSize].s[0],
                         sizeof(S().s), '\n') or !cin.gcount())
            break;
    }
}
and:

#include <iostream>
#include <vector>
using namespace std;

struct S
{
    char s[7];
};

int main()
{
    vector<S> vec;
    do
    {
        vec.push_back(S());
    } while (cin.getline((char *) vec[vec.size()-1].s,
                         sizeof(S().s), '\n') or cin.gcount());
    vec.pop_back();
}
and the best timings that I measured for each:
realloc:
real 0m0.573s
user 0m0.537s
sys 0m0.033s
std::vector:
real 0m0.511s
user 0m0.474s
sys 0m0.035s
Greg
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]