Re: Benchmarks
s0suk3@gmail.com writes:
The task: Write a program that reads a set of words from standard
input and prints the number of distinct words.
I came across a website that listed a few programs to accomplish this
task: http://unthought.net/c++/c_vs_c++.html (ignore all the language
flaming :-), and thought that all of them did unnecessary operations,
so I wrote my own. But for some reason, my version turned out slower
that ALL of the versions in the website, even though it seems to
perform less operations (yes, I benchmarked them on my own computer).
According to the website, the slowest version is:
#include <set>
#include <string>
#include <iostream>
int main(int argc, char **argv)
{
// Declare and Initialize some variables
std::string word;
std::set<std::string> wordcount;
// Read words and insert in rb-tree
while (std::cin >> word) wordcount.insert(word);
// Print the result
std::cout << "Words: " << wordcount.size() << std::endl;
return 0;
}
My version is about 12 times slower than that. It uses lower-level
constructs. Here it is:
[snip]
// Inserts a word into the set if it isn't in the set.
// The passed string is expected to have been allocated with
// a memory allocation function, and it should be considered
// lost after passed to this function.
//
static void
InsertWord(char *aWord)
{
struct SetNode *node;
for (node = gSet; node; node = node->next) {
if (strcmp(node->word, aWord) == 0) {
free(aWord);
return;
}
}
You represent your set of words as a linked list. You compare each
new word to every word already in the set. The C++ solution uses a
std::set which, if I recall correctly, can do searches and insertions
in O(n log n).
If you re-write this to use a balanced binary tree, such as an AVL
tree, you should get performance similar to the C++ version.
node = (struct SetNode *) malloc(sizeof(struct SetNode));
Not incorrect, but
node = malloc(sizeof *node);
would be better.
if (!node) {
free(aWord);
return;
}
And if malloc fails, you quietly return without doing anything to
handle the error or report it to the user.
[...]
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"