Re: template size and speed
 
{ please do not quote signatures. -mod }
James Kanze wrote:
charles.lobo@gmail.com wrote:
I have recently begun using templates in C++ and have found it to be
quite useful. However, hearing stories of code bloat and assorted
problems I decided to write a couple of small programs to check. What I
expected was that there would be minor code bloat and some speed
improvement when using templates. However...
I wrote a basic list container (using templates), and a list container
(using virtual derived classes). I also tried with the std lib vector
class. I ran 1000000 inserts and accesses to all these lists and got
the following results (compiling on windows with microsoft compiler):
Size:
22,528 mytpl.exe
36,864 nonstd.exe
40,960 std.exe
The first is my template list, the second my non-template list, and the
third is using the std vector.
Without seeing your actual code, it's impossible to know what
you are really testing.
The first surprise was that my template list was actually *smaller*
than the non-template version! Very surprising.
Not really, if you only instantiated the template for a single
type.  Calling a non-virtual function takes less space than
calling a virtual function, and the rest of the code should be
more or less identical.
Try using list<int>, list<long>, list<double>, etc. in a single
program.  As you increase the number of different
instantiations, the template versions will probably grow faster
than the non-template versions.  Probably, not certainly,
because there are well known techniques for factoring common
code out of the templates, and some compilers are also capable
of merging identical functions.
However, it's not all
good news. I then ran some timing tests.
Time:
875: running std.exe
1484: running nonstd.exe
1563: running mytpl.exe
As expected, the std vector is the fastest. However my template list
class is the *slowest*!!! This was definitly not expected.
Does anyone have any inputs on what's going on?
We'd have to see your code.  And also, how you did your timing
tests.  (And how you measured size as well; the size of the
executable file isn't necessarily very significant.)
--
James Kanze (GABI Software)             email:james.kanze@gmail.com
Conseils en informatique orientie objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Simard, 78210 St.-Cyr-l'Icole, France, +33 (0)1 30 23 00 34
Hi,
I tried using list<int>, list<double>, list<MyClass> and list<MyClass2>
in both types of lists.
Both the code sizes grow as shown below:
mytpl.exe:
    With 1 type of list:    22,016
    With 4 types of lists:  29,184
nonstd.exe
    With 1 type of list:    36,864
    With 4 types of lists:  45,056
I am measuring exe size so maybe it's not the best way. But the
template list is still smaller than the non-template version. Maybe
this is the overhead of the virtual function you mention.
I running the test cases like this:
--------------- mytpl.cpp ------------------------
void test1 () {
        MyTList<int>            one;
        ListNode<int> *         onenode;
        long                    sumone;
    for (int i = 0;i < 1000000;i++) {
        one.Insert (new int (i));
    }
    sumone  =   0;
    onenode =   one.GetNodes ();
    while (onenode) {
        sumone +=   *(onenode->uVal);
        onenode =   onenode->uNext;
    }
}
void test2 () {
    // ... identical to test1 but for double
}
void test3 () {
    // ... identical to test1 but for MyClass
}
void test4 () {
    // ... identical to test1 but for MyClass2
}
void main (int argc, char * argv[]) {
    test1 ();
    test2 ();
    test3 ();
    test4 ();
}
--------------- mytpl.cpp ends--------------------
--------------- nonstd.cpp -----------------------
void test1 () {
        MyList                  one;
        IntElem *               onenode;
        long                    sumone;
    for (int i = 0;i < 1000000;i++) {
        one.Insert (new IntElem (new int (i)));
    }
    sumone  =   0;
    onenode =   (IntElem*)one.GetNodes ();
    while (onenode) {
        sumone  +=  *onenode->uI;
        onenode =   (IntElem *)onenode->uNext;
    }
}
void test2 () {
    // ... identical to test1 but for double
}
void test3 () {
    // ... identical to test1 but for MyClass
}
void test4 () {
    // ... identical to test1 but for MyClass2
}
void main (int argc, char * argv[]) {
    test1 ();
    test2 ();
    test3 ();
    test4 ();
}
--------------nonstd.cpp ends----------------------
I measured timing by running a timer app on the generated exe's (again
maybe not the best way):
----------- timer.cpp -----------------------------
void run (char * app) {
        PROCESS_INFORMATION pi;
        STARTUPINFO         si;
    ZeroMemory (&si, sizeof (si));
    si.cb 			= 	sizeof (si);
    si.lpReserved   =   NULL;
    ZeroMemory (&pi, sizeof (pi));
    CreateProcess (NULL, app, NULL, NULL, FALSE, 0, NULL, NULL, &si,
&pi);
    WaitForSingleObject (pi.hProcess, INFINITE);
}
void main (int argc, char * argv[]) {
        DWORD   start;
        DWORD   diff;
    if (argc != 2) {
        printf("Usage: timer <program to time>\n");
        return;
    }
    start   =   GetTickCount ();
    run (argv[1]);
    diff    =   GetTickCount () - start;
    printf("%d: running %s\n", diff, argv [1]);
}
--------------- timer.cpp ends --------------------
These were the timing results:
running mytpl.exe
    With 1 type of list:    781
    With 4 types of lists:  3562
running nonstd.exe
    With 1 type of list:    766
    With 4 types of lists: 3156
The results seem consistent when using one list or four. The template
version is still smaller and (marginally) slower.
cheers,
/Charles
-- 
      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated.    First time posters: Do this! ]