"Marco Manfredini" <ok_nospam_ok@phoyd.net> wrote in message
news:fj740l$cbl$1@aioe.org...
gast128@hotmail.com wrote:
Dear all,
from Austern's 'Generic Programming and the STL' (9.3.6), one can read
that the set range constructor performs in O(N log N), while for sorted
input it can become O(N). However, in a test the opposite seems true:
using sorted input is always slower:
Set1 insert unsorted 3.297000
Set2 insert sorted 3.500000 //uses a sorted vector as input
Set3 insert sorted 1.765000 //uses insert hint
Set4 insert sorted 3.625000 //uses a sorted vector as input, but
...
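For readers wondering what the "insert hint" variant (Set3) might look like, here is a minimal sketch; the poster's actual benchmark code is not shown, and the function name is my own. When the input is already sorted, each new element belongs at the end of the set, so passing end() as the hint allows a conforming implementation to insert in amortized constant time rather than O(log n):

```cpp
#include <set>
#include <vector>

// Sketch of hinted insertion from sorted input (hypothetical helper,
// not the poster's code). Each value goes just before the hint, so a
// conforming std::set can append in amortized constant time.
std::set<int> build_with_hint(const std::vector<int>& sorted_input)
{
    std::set<int> s;
    for (int v : sorted_input)
        s.insert(s.end(), v);  // hinted insert: amortized O(1) here
    return s;
}
```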
You may consider varying n in your example and observe how the run
times for sorted and unsorted construction *change*. Also:
I created a small test program where the size of the sets varies.
For Visual C++ 2005 Express in release mode (and _HAS_ITERATOR_DEBUGGING
set to 0), I get the following results:
filling 10000 sets of 1000 elements ... done in 5.031
filling 1000 sets of 10000 elements ... done in 6.078
filling 100 sets of 100000 elements ... done in 13.451
filling 10 sets of 1000000 elements ... done in 26.965
filling 1 set of 10000000 elements ... done in 36.917
This suggests that insertion of n sorted elements is not linear with this
implementation.
When I modify the program to insert the values one by one, I get the same
timing.
So it seems that this implementation does not conform to the standard.
Another implementation of set (compiled with the same options) gives:
filling 10000 sets of 1000 elements ... done in 4.889
filling 1000 sets of 10000 elements ... done in 4.874
filling 100 sets of 100000 elements ... done in 4.874
filling 10 sets of 1000000 elements ... done in 4.875
filling 1 set of 10000000 elements ... done in 4.875
Here is the function I used to test:
#include <cstddef>  // size_t
#include <ctime>    // clock, clock_t, CLOCKS_PER_SEC
#include <iostream>
#include <set>
#include <vector>

void settest()
{
    const size_t total_elements = 10000000;

    // initialize an array with numbers to insert into the sets
    std::vector<size_t> numbers;
    for (size_t i = 0; i < total_elements; i++) numbers.push_back(i);

    for (size_t size = 1000; size <= total_elements; size *= 10)
    {
        size_t n_sets = total_elements / size;
        std::vector< std::set<size_t> > sets(n_sets);
        std::cout << "filling " << n_sets
                  << (n_sets == 1 ? " set" : " sets") << " of " << size
                  << " elements ..." << std::flush;
        clock_t before = clock();
        for (size_t i = 0; i < n_sets; i++)
        {
            std::set<size_t> new_set(&numbers[0], &numbers[0] + size);
            new_set.swap(sets[i]); // keep the set alive so its destruction is not timed
        }
        clock_t after = clock();
        std::cout << " done in "
                  << double(after - before) / CLOCKS_PER_SEC << std::endl;
    }
}
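The "insert the values one by one" modification mentioned above is not shown; a sketch of what it might look like (the function name is my own) replaces the range constructor with individual unhinted inserts, each of which is O(log n):

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical one-by-one variant of the inner loop: build the set with
// individual unhinted inserts, O(log n) per call, instead of the range
// constructor.
std::set<std::size_t> fill_one_by_one(const std::vector<std::size_t>& numbers,
                                      std::size_t size)
{
    std::set<std::size_t> s;
    for (std::size_t i = 0; i < size; ++i)
        s.insert(numbers[i]);  // unhinted insert: O(log n) each
    return s;
}
```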
All the requirements in the Standard are about the number of 'calls', not
about wall-clock time. As the data grows, the memory requirements go up.
When the entire process can fit in level 1 cache the code will seem to fly;
when the code requires repeated paging to backing store you will wonder if
the system is actually still working. In addition there is the issue of
locality: if the allocated nodes are sufficiently spread out they can induce
paging even though there are not that many of them. How much this matters
depends on the test machine.
[ comp.lang.c++.moderated. First time posters: Do this! ]