Re: C++ vs C when it comes to speed...

From: Alan Johnson <awjcs@yahoo.com>
Newsgroups: comp.lang.c++
Date: Thu, 01 Mar 2007 14:51:36 -0800
Message-ID: <es7lhk$1in$1@aioe.org>

mast2as@yahoo.com wrote:

I am sure this topic has been discussed a thousand times and I read a
few things about it today on the net. I also want to say I am not trying
to start a polemic here; I am just curious and willing to learn and
improve the way I am approaching some coding issues that I have at the
moment. I use C++ programming for my work, but I am not a developer,
so please be patient & tolerant in your answers ;-)

Okay, for the last few days I have been struggling with mixing some C
and C++ code. I usually try to make my code C++ all the way through.
Recently, because I had to implement some old C specs, I decided for
some weird reason to mix C and C++ code. Not a good thing to do, I
know. Anyway, I really struggled with the idea for the last few days
and spent quite some time going back and forth between different
versions of the code; some had more C than C++, some were only C++. I
must say that in what I am doing, execution SPEED is important.

So today I decided to do this very simple test. I wrote the same
functionality in two versions, one in C++ and the other in C, and ran
these functions in a loop (10 000 times).

The C++ version takes 5 seconds to execute
The C version takes 1 second to execute

This is a big difference. I realised that the difference comes mostly
from the push_back function of the std::vector class. If I comment that
line out, the C++ code runs in 1 second. I used to find the STL
very convenient, but I never realised it had such an impact on
application performance. Here is the program that I used for the
test... Maybe I am doing something wrong, so I apologize in advance.

As I said in the preamble of the post, I am trying to be
constructive. The feedback I would like is more along these lines:
1/ Am I doing something wrong in the C++ implementation that would
slow it down?
2/ Is it a good idea to use C coding in a C++ app if speed is
an issue and I want the app to run as fast as it can?

Thanks everyone.

// 1. comparing speed C vs C++
// Running on Mac OS X, PowerPC G4, 1.5 GHz
// c++ -o ribparser ribparser.cpp

#include <stdlib.h>
#include <stdio.h>

#include <fstream>
#include <string>
#include <vector>

#include <ctime>

class RibParser
{
  std::string m_ribFile;
public:
  std::ifstream ifs;
  RibParser( std::string ribFile ) : m_ribFile( ribFile )
  {
    char rixm[512];
    try
    {
      ifs.open( ribFile.c_str() );
      if ( ifs.fail() )
      {
        sprintf( rixm, "%s: Can't open\n", ribFile.c_str() );
        throw( rixm );
      }
    }
    catch( char *rixm )
    {
      // Never pass a message as the format string itself
      printf( "%s", rixm );
      ifs.close();
      exit( EXIT_FAILURE );
    }
    int ch;
    std::vector<char> token;
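    // Stop as soon as get() reports end-of-file, so that the EOF
    // marker itself is not stored in the vector
    while( ( ch = ifs.get() ) != EOF )
    {
      // comment this line out and the app runs in 1 second
      token.push_back( ch );
    }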
  }
  ~RibParser()
  {
    ifs.close();
  }
};

static const size_t ARRAY_INCR = 8;

void RibParserC( const char *ribFile )
{
  char *token;
  size_t tokenByteSize = 0;
  size_t tokenArraySize = ARRAY_INCR;
  FILE *source;
  if ( ( source = fopen( ribFile, "r" ) ) == NULL )
  {
    // source is NULL here, so there is nothing to fclose()
    printf( "%s: Can't open\n", ribFile );
    exit( EXIT_FAILURE );
  }
  token = (char*)malloc( ARRAY_INCR );
  int ch;
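  // Test for EOF before storing, so the EOF marker is not written
  // into the buffer as if it were a character
  while ( ( ch = fgetc( source ) ) != EOF )
  {
    token[tokenByteSize] = ch;
    tokenByteSize++;
    // Buffer full: grow it by a fixed ARRAY_INCR bytes
    if ( tokenByteSize == tokenArraySize )
    {
      tokenArraySize += ARRAY_INCR;
      token = (char*)realloc( token, tokenArraySize );
    }
  }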
  fclose( source );
  free( token );
}

int main( int argc, char ** argv )
{
  time_t start, end;
  time( &start );
  for ( size_t i = 0; i < 10000; ++i )
  {
    RibParser ribParser( "/Users/jean-colas/Desktop/comment.rib" );
  }
  time( &end );
  double diff = difftime( end, start );
  // CLOCKS_PER_SEC has type clock_t, so cast it to match the format
  printf( "seconds %f %ld\n", diff, (long)CLOCKS_PER_SEC );

  time( &start );
  for ( size_t i = 0; i < 10000; ++i )
  {
    RibParserC( "/Users/jean-colas/Desktop/comment.rib" );
  }
  time( &end );
  diff = difftime( end, start );
  // CLOCKS_PER_SEC has type clock_t, so cast it to match the format
  printf( "seconds %f %ld\n", diff, (long)CLOCKS_PER_SEC );

  return 0;
}

/////

I couldn't reproduce your results because I didn't have
"/Users/jean-colas/Desktop/comment.rib". So instead I used a 50KB chunk
of random data from /dev/urandom.

Results:
seconds 28.000000 1000000
seconds 26.000000 1000000

After turning on optimization (g++ -O3 -o ribparser ribparser.cpp):
seconds 17.000000 1000000
seconds 23.000000 1000000

So I would claim that your implementations are competitive, at least
given the input I provided.

I have some guesses as to why you are seeing what you are seeing. Your
compile line has no optimization flags, and in an unoptimized build a
function call:
token.push_back( ch );

has a much larger overhead than an assignment:
token[tokenByteSize] = ch;

A good compiler will probably get rid of most/all of that overhead (by
inlining the call) when you turn on optimizations.

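Another observation is that your C implementation potentially has
quadratic time (depending on how realloc is implemented): it grows the
buffer by a fixed 8 bytes, so if realloc has to copy the data each time,
the total amount of copying grows with the square of the input size.
std::vector typically grows its capacity geometrically, so your C++
implementation has linear time. On a large enough input, I would expect
the C++ version to pull far ahead. This has nothing to do with the
language itself, though, but rather the quality of the algorithm you use.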

--
Alan Johnson