Re: factor 50.000 between std::list and std::set?

Erik Wikström <>
Tue, 26 Jun 2007 07:19:19 GMT
On 2007-06-26 00:30, desktop wrote:

Pete Becker wrote:

desktop wrote:

Erik Wikström wrote:

On 2007-06-25 22:21, desktop wrote:

If I have a sorted std::list with 1.000.000 elements it takes
1.000.000 operations to find the element with value = 1.000.000 (I need
to iterate through the whole list).

In comparison, if I have a std::set with 1.000.000 elements it will
only take approx lg 1.000.000 = 20 operations! Can it really be true
that the difference is a factor of 1.000.000/20 = 50.000 in this case?

In operations yes, but not necessarily in time. If an operation on the
list takes 1 time unit and an operation on the set takes 50,000 time
units, then they'll be equally fast. Of course that won't be the case
in any real implementation (the set will be significantly faster than
the list), but it shows that just because one container/algorithm has
a better asymptotic running time, it does not necessarily perform
better on a given input. All the asymptotic bound says is that for a
sufficiently large input, the algorithm with the better complexity
will perform better.

In practice you'll often find that using a vector for small data sets
will be faster than most other containers, even if you need to
traverse the whole vector.

Is it possible to make an exact measurement of the difference in time
for one operation on a set versus a list?

Yes, but that's not what asymptotic complexity is about. Asymptotic
complexity measures how well an algorithm scales when you increase the
amount of data. It answers questions like: it takes twenty seconds to
find all the records matching X in my database; if I double the number
of data elements, how long will it take?

I am not interested in the asymptotic difference, but in a measurement
of the difference in time for a single operation - that way it's
possible to get an idea of how many elements you need to operate on
before it makes sense to use a more complicated structure with better
asymptotic complexity.

The time one operation needs is usually not interesting, since you
rarely use only one operation; instead you use a mix of a few. So
what's interesting is how long your mix of operations takes. Consider
an application where you perform a number of operations on a
collection, let's say 10 operations of type A, 1 of type B and 100 of
type C. Given that mix it's quite uninteresting whether the B
operation is fast or not; what matters is, probably, how well C
performs, then A.

Erik Wikström
