Re: searching in set vs. vector
Mehturt@gmail.com wrote:
> I have found an article with 50 STL tips.. I am wondering about tip
> #23:
> -- snip --
> 23) Consider replacing associative containers with sorted vectors.
> Associative containers are implemented as balanced binary
> search trees, good for a mix of insertions, erasures and
> lookups. If insertions are mostly separate from lookups, you
> can use vectors which would take less memory, sort it and use
> lookups at a cost of log n.
> -- snip --
> I have made a test on GNU/Linux using g++ 4.0 comparing
> searches in std::vector<int> and std::set<int> and it seems
> searches in sorted vector are faster only for small sizes
> (about 10 elements), for anything bigger std::set<int> is much
> faster (the more elements, the greater the difference). I
> mean I'm using std::set<> and std::map<> not only to store
> sorted data, but only to have fast searches in the container..
> The memory usage is not really a case for me..
> So perhaps somebody can explain this tip a bit further.. I can
> post the code I did for testing - if that is relevant..
It might be, but then...
The results will obviously depend on the implementation. My own
benchmarks (done some time ago, on a Sun Sparc) show std::map
slightly faster for two data sets (one with 75 entries, the
other with 15047), a sorted vector faster for the other three
(1275, 94883 and 1000000 entries). But the difference was never
really very big -- generally not enough to worry about.
I suspect that not only the number of entries, but the data
types, could make a difference. In my case, all of my
comparisons involved strings -- std::string, with dynamically
allocated memory. If dynamically allocated memory is not
involved, the better locality of std::vector could make an
important difference. (The string implementation of the
compiler I used at the time did not use the short string
optimization. Given my data sets, that optimization would have
made a significant difference in the three larger data sets,
which has a lot of very short strings, and probably none in the
two smaller sets, which were sets of URL's I'd extracted from
various places.)
--
James Kanze kanze.james@neuf.fr
Conseils en informatique orient?e objet/
Beratung in objektorientierter Datenverarbeitung
9 place S?mard, 78210 St.-Cyr-l'?cole, France +33 (0)1 30 23 00 34
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]