Re: Efficient per-class allocator
On Oct 24, 2:02 pm, Davin Pearson <davin.pear...@gmail.com> wrote:
Consider the following g++ code:
#include "../../2006/nogc2/global.hh"
class Foo
{
    int data;
};
extern Foo* create();
extern void func();
extern void test1(int ILEN);
extern void test2(int ILEN);
int main()
{
    allegro_init_first_windowed(640,480);
    const int len = 1000 * 1000;
    retrace_count = 0;
    test1(len);
    int time_create = retrace_count;
    retrace_count = 0;
    test2(len);
    int time_none = retrace_count;
    textout(screen,font,(string() + "time1=" +
        time_create).const_char_star(),0,0,allegro_col_white);
    textout(screen,font,(string() + "time2=" +
        time_none).const_char_star(),0,8,allegro_col_white);
    readkey();
    return EXIT_SUCCESS;
}
END_OF_MAIN();
void func()
{
}
Foo* create()
{
    return new Foo();
}
void test1(const int ILEN)
{
    for (int i=0; i<ILEN; i++)
    {
        Foo* f = create();
        delete f;
    }
}
void test2(const int ILEN)
{
    for (int i=0; i<ILEN; i++)
    {
        func();
        func();
    }
}
Apologies for the lack of standard code in the main function. I am
using the Allegro graphics library and my own string class. The rest
of the code is portable, however. When I run the above code, it says
that the call to test1(1000 * 1000) takes 29 / 70 seconds and the call
to test2(1000 * 1000) takes 1 / 70 seconds. I would like to find a
per-class allocator that uses linked lists and brings test1 down to
roughly the time that test2 takes.
The answer is going to depend a lot on the answers to:
1) is create() called by more than one thread?
2) is delete called in the same thread that called create()?
3) do you need to trim excess capacity or not? (Recall the space/speed
tradeoff: any solution that is going to be faster is going to take up
more memory from the heap.)
4) how portable does this have to be?
5) do the members of Foo also allocate memory?
6) does Foo have a trivial destructor?
7) is this for shared memory, or process specific?
8) will it be used in an STL container?
I have a number of allocators that I use for these different
conditions. While getting the overhead down to the equivalent of two
empty function calls may be wishful thinking for a practical allocator
(i.e. one that is not just a stack allocator), something close is very
possible with more real-life choices.
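
To compare allocation cost against the "two empty function calls"
baseline without Allegro, the measurement can be sketched in portable
C++ with std::chrono. This is only an illustration: Item, noop,
time_loop_us, time_new_delete_us and time_empty_calls_us are made-up
names, and an optimizing compiler may still remove part of the work, so
treat the numbers as rough.

```cpp
#include <cassert>
#include <chrono>

struct Item { int data; };   // hypothetical stand-in for Foo

inline void noop() {}        // plays the role of func()

// Runs body() `iterations` times and returns the elapsed microseconds.
template <class Body>
long long time_loop_us(int iterations, Body body)
{
    assert(iterations >= 0);
    using clock = std::chrono::steady_clock;
    const clock::time_point start = clock::now();
    for (int i = 0; i < iterations; ++i)
        body();
    const clock::time_point stop = clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(
        stop - start).count();
}

// Equivalent of test1: one new/delete pair per iteration.
inline long long time_new_delete_us(int iterations)
{
    return time_loop_us(iterations, [] {
        // The volatile pointer discourages the compiler from
        // eliding the allocation.
        Item* volatile p = new Item();
        delete p;
    });
}

// Equivalent of test2: two empty calls per iteration.
inline long long time_empty_calls_us(int iterations)
{
    // Calling through a volatile function pointer forces real calls.
    void (*volatile call)() = &noop;
    return time_loop_us(iterations, [&call] { call(); call(); });
}
```

Calling time_new_delete_us(1000 * 1000) and
time_empty_calls_us(1000 * 1000) gives the two figures to compare.
steady_clock is used because it never goes backwards.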
In fact, today I am testing a version that does create on two threads
and delete in a third, does not trim excess capacity, and where the
members of Foo do not allocate but are nontrivial. It is lock-free
(except when it occasionally needs to get some extra capacity).
Allocate and deallocate are O(1) operations, and in fact the normal
case is just a few inlined pointer adjustments and assignments.
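
For the simplest combination of answers (a single thread, no trimming,
members of Foo that do not allocate), the linked-list idea can be
sketched as below. This is not the lock-free allocator described above,
only a minimal single-threaded illustration. Pool, Slot, grow and
kSlotsPerChunk are made-up names, and the chunk size of 1024 is an
arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical single-threaded pool: free slots form a singly linked
// list. Chunks are never handed back to the heap (no trimming).
template <class T>
class Pool
{
public:
    void* allocate()
    {
        if (free_ == nullptr)
            grow();                 // rare slow path
        Slot* s = free_;            // normal case: pop the list head
        free_ = s->next;
        return s;
    }

    void deallocate(void* p) noexcept
    {
        assert(p != nullptr);
        Slot* s = static_cast<Slot*>(p);
        s->next = free_;            // push the slot back on the list
        free_ = s;
    }

private:
    // A slot holds either a link (when free) or storage for one T.
    union Slot
    {
        Slot* next;
        alignas(T) unsigned char storage[sizeof(T)];
    };

    static constexpr std::size_t kSlotsPerChunk = 1024;  // assumption

    void grow()
    {
        Slot* chunk = static_cast<Slot*>(
            ::operator new(kSlotsPerChunk * sizeof(Slot)));
        for (std::size_t i = 0; i < kSlotsPerChunk; ++i)
        {
            chunk[i].next = free_;
            free_ = &chunk[i];
        }
    }

    Slot* free_ = nullptr;
};

class Foo
{
public:
    static void* operator new(std::size_t size)
    {
        // Requests of another size (e.g. from a derived class) go to
        // the global heap.
        if (size != sizeof(Foo))
            return ::operator new(size);
        return pool().allocate();
    }

    static void operator delete(void* p, std::size_t size) noexcept
    {
        if (p == nullptr)
            return;
        if (size != sizeof(Foo))
        {
            ::operator delete(p);
            return;
        }
        pool().deallocate(p);
    }

private:
    static Pool<Foo>& pool()
    {
        static Pool<Foo> instance;  // one pool per class
        return instance;
    }

    int data;
};
```

With this in place, "new Foo()" and "delete f" in create() and test1()
go through the pool without any change to the calling code. The price
is the space/speed tradeoff from question 3: memory released by delete
stays on the free list for later Foo objects instead of returning to
the heap.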
So please be more specific, and I'll be happy to show what I might do
to get the performance you need.
Lance
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]