"Marco Manfredini" <ok_nospam_ok@phoyd.net> wrote in message
news:fj740l$cbl$1@aioe.org...
gast128@hotmail.com wrote:
Dear all,
from Austern's 'Generic Programming and the STL' (9.3.6), one can read
that the set range constructor performs in O(N log N), while for sorted
input it can become O(N). However, in a test the opposite seems true:
using sorted input is always slower:
Set1 insert unsorted 3.297000
Set2 insert sorted 3.500000 //uses a sorted vector as input
Set3 insert sorted 1.765000 //uses insert hint
Set4 insert sorted 3.625000 //uses a sorted vector as input, but
...
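For readers wondering what the "insert hint" variant (Set3) might look like, here is a minimal sketch; the poster's actual benchmark code is not shown, and the function name is my own. When the input is already sorted, each new element belongs at the end of the set, so passing end() as the hint allows a conforming implementation to insert in amortized constant time rather than O(log n):

```cpp
#include <set>
#include <vector>

// Sketch of hinted insertion from sorted input (hypothetical helper,
// not the poster's code). Each value goes just before the hint, so a
// conforming std::set can append in amortized constant time.
std::set<int> build_with_hint(const std::vector<int>& sorted_input)
{
    std::set<int> s;
    for (int v : sorted_input)
        s.insert(s.end(), v);  // hinted insert: amortized O(1) here
    return s;
}
```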
You may consider varying n in your example and observe how the run
times for sorted and unsorted construction *change*. Also:
I created a small test program where the size of the sets varies.
For Visual C++ 2005 Express in release mode (and _HAS_ITERATOR_DEBUGGING
set to 0), I get the following results:
filling 10000 sets of 1000 elements ... done in 5.031
filling 1000 sets of 10000 elements ... done in 6.078
filling 100 sets of 100000 elements ... done in 13.451
filling 10 sets of 1000000 elements ... done in 26.965
filling 1 set of 10000000 elements ... done in 36.917
This suggests that insertion of n sorted elements is not linear with this
implementation.
When I modify the program to insert the values one by one, I get the same
timing.
So it seems that this implementation does not conform to the standard.
Another implementation of set (compiled with the same options) gives:
filling 10000 sets of 1000 elements ... done in 4.889
filling 1000 sets of 10000 elements ... done in 4.874
filling 100 sets of 100000 elements ... done in 4.874
filling 10 sets of 1000000 elements ... done in 4.875
filling 1 set of 10000000 elements ... done in 4.875
Here is the function I used to test:
#include <cstddef>  // size_t
#include <ctime>    // clock, clock_t, CLOCKS_PER_SEC
#include <iostream>
#include <set>
#include <vector>

void settest()
{
    const size_t total_elements = 10000000;

    // initialize an array with numbers to insert into the sets
    std::vector<size_t> numbers;
    for (size_t i = 0; i < total_elements; i++) numbers.push_back(i);

    for (size_t size = 1000; size <= total_elements; size *= 10)
    {
        size_t n_sets = total_elements / size;
        std::vector< std::set<size_t> > sets(n_sets);
        std::cout << "filling " << n_sets
                  << (n_sets == 1 ? " set" : " sets") << " of " << size
                  << " elements ..." << std::flush;
        clock_t before = clock();
        for (size_t i = 0; i < n_sets; i++)
        {
            std::set<size_t> new_set(&numbers[0], &numbers[0] + size);
            new_set.swap(sets[i]); // keep the set alive so its destruction is not timed
        }
        clock_t after = clock();
        std::cout << " done in "
                  << double(after - before) / CLOCKS_PER_SEC << std::endl;
    }
}
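The "insert the values one by one" modification mentioned above is not shown; a sketch of what it might look like (the function name is my own) replaces the range constructor with individual unhinted inserts, each of which is O(log n):

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical one-by-one variant of the inner loop: build the set with
// individual unhinted inserts, O(log n) per call, instead of the range
// constructor.
std::set<std::size_t> fill_one_by_one(const std::vector<std::size_t>& numbers,
                                      std::size_t size)
{
    std::set<std::size_t> s;
    for (std::size_t i = 0; i < size; ++i)
        s.insert(numbers[i]);  // unhinted insert: O(log n) each
    return s;
}
```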
All the requirements in the Standard are about the number of 'calls', not
about wall-clock time. As the data grows, the memory requirements go up.
When the entire process can fit in level 1 cache the code will seem to fly;
when the code requires repeated paging to backing store you will wonder if
the system is actually still working. In addition there is the issue of
locality: if the allocated nodes are sufficiently spread out they can induce
paging even though there are not that many of them. How much this matters
depends on the test machine.
[ comp.lang.c++.moderated. First time posters: Do this! ]