Re: Generally, are the programs written by C++ slower than written by C 10% ?
Asger-P <junk@asger-p.dk> wrote in news:op.v1ehx8bs2juju3@ajwin7:
Hi Paavo
Modified the test a little to make sure the std::string
actually was created, it makes quite a difference:
The std::strings were all created before as well; no problem with that.
#include <iostream>
#include <string>
#include <cstring>
#include <time.h>
#include <stdlib.h>
#include <conio.h>
<conio.h> is not a standard header.
int _tmain(int argc, char* argv[])
_tmain is not a standard signature; standard C++ uses int main(int argc, char* argv[]).
{
clock_t tbeg;
const char *cStr = "123456789012345678901234567890";
const int L = strlen( cStr ) + 1;
tbeg = clock();
for (int i = 0; i < 10000000; i++)
{
std::string test(cStr);
if( test[5] == '0' )
std::cout << "error" << std::endl;
}
std::cout << "test 1 use " << clock() - tbeg << std::endl;
tbeg = clock();
for (int i = 0; i < 10000000; i++)
{
char* str = new char[L];
strcpy(str, cStr);
if( str[5] == '0' )
std::cout << "error" << std::endl;
delete [] str;
}
std::cout << "test 2 use " << clock() - tbeg << std::endl;
tbeg = clock();
for (int i = 0; i < 10000000; i++)
{
char* str = (char*)malloc(L);
strcpy(str, cStr);
if( str[5] == '0' )
std::cout << "error" << std::endl;
free(str);
}
std::cout << "test 3 use " << clock() - tbeg << std::endl;
getch();
getch() is not a standard function.
return 0;
}
test 1 use 1451
test 2 use 998
test 3 use 749
I get:
test 1 use 688
test 2 use 620
test 3 use 586
Are you sure you are using an optimized build?
Why is malloc faster than new?
new typically forwards to malloc behind the scenes, so it is malloc plus
something extra (at least one more function call). As this test code does
not do much else (strcpy() gets inlined by my compiler), I guess the
extra function call starts to show up.
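To illustrate the "malloc plus something extra" point, here is a rough sketch of what a typical operator new might look like. The names sketch_operator_new and sketch_operator_delete are made up for this post; this is not the source of any particular runtime library, just the general shape I would expect.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a library's ::operator new.
// Assumption: the real one forwards to malloc, as many implementations do.
void* sketch_operator_new(std::size_t size)
{
    if (size == 0)
        size = 1;                     // a zero-size request must still yield a non-null pointer

    for (;;) {
        if (void* p = std::malloc(size))
            return p;                 // common case: malloc plus one extra call layer

        // Allocation failed: give the installed new-handler a chance to free memory.
        std::new_handler handler = std::get_new_handler();
        if (!handler)
            throw std::bad_alloc();   // no handler installed, report failure
        handler();                    // handler returned, so retry the allocation
    }
}

// Hypothetical counterpart of ::operator delete; free(nullptr) is a no-op.
void sketch_operator_delete(void* p) noexcept
{
    std::free(p);
}
```

In the success path the only additions over a bare malloc are the size check, the null test and the call itself, which would be in line with the small difference between test 2 and test 3.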
std::string seems to be slower because std::string::assign() (called by
the inlined constructor) and the destructor call were not inlined by my
compiler. This probably means 2 non-inlined function calls more than the
"new" version.
In a real app, memory transfer and cache misses would probably kick in
when processing 10000000 strings. As main memory access is nowadays very
slow compared to the CPU, these cache misses would probably dominate the
overall speed, and things like data locality would start to matter. In
other words, this simple performance comparison is not very useful for
practical purposes, except for showing that the performance of all these
approaches is comparable.
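For anyone who wants to rerun the comparison without <conio.h>, _tmain or getch(), the pieces below are a portable sketch using only standard headers (C++11 assumed). The helper names time_ms, via_string, via_new and via_malloc are mine, not from the original test, and std::chrono::steady_clock replaces clock().

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: calls body(i) `iterations` times and returns
// the elapsed wall-clock time in milliseconds.
template <class Body>
long long time_ms(int iterations, Body body)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        body(i);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// One iteration of each variant. Each returns the character at index 5
// so that the caller can observe the copied data.
char via_string(const char* src)
{
    std::string s(src);
    return s[5];
}

char via_new(const char* src, std::size_t len)   // len includes the terminating '\0'
{
    char* p = new char[len];
    std::strcpy(p, src);
    const char c = p[5];
    delete[] p;
    return c;
}

char via_malloc(const char* src, std::size_t len) // len includes the terminating '\0'
{
    char* p = static_cast<char*>(std::malloc(len));
    if (!p)
        return '\0';                               // allocation failed
    std::strcpy(p, src);
    const char c = p[5];
    std::free(p);
    return c;
}
```

From a standard int main(), pass each variant to time_ms() in a lambda with the same iteration count and print the three results. As before, build with optimizations enabled, and do something with the returned character so the compiler cannot drop the loop body.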
Cheers
Paavo