Re: Do you use a garbage collector?
In article <4I2dnW-XHO6kbGPanZ2dnUVZ_sGvnZ2d@comcast.com>,
cristom@comcast.net says...
[ ... ]
A trivially simple design is for each thread to have its own heap, and
each heap to have one free list for every other thread in the system.
How does this efficently handle dynamic numbers of threads that can, and
will be created and joined with, or detached, during the lifetime of a given
process?
That depends -- if the number of threads is _extremely_ dynamic (i.e.
you have no idea of the maximum number of threads) it would probably
become a bit of a problem, though that's purely a guess, not the result
from testing.
Keep in mind that this design isn't (and never was) intended as a
polished product, but strictly an example of one idea about how a
reasonably scalable allocator _could_ work. Just as no battle plan
survives contact with the enemy, such an untested design almost
certainly wouldn't survive implementation unscathed. OTOH, the idea has
been voiced that cross-thread freeing is necessarily serialized, and I
think this is enough to show that's not the case.
It seems like you are describing a fully connected network. One
freelist per-thread, that holds N buckets, where N == current number of
threads at any one time. This implies that N is an unknown variable,
therefore, if user creates a number of threads, so on a high-end system with
many cores, they must all be interconnected. Each threads heap needs an
entry for a thread in the system, if a new thread gets added, it must auto
register, or register on "first" usage; think 'pthread_once()'.
Right. Mildly non-trivial, but definitely not rocket science either. The
design is almost certainly wasteful of memory, but the amount wasted
wouldn't become very large until/unless your system included thousands
of threads (or something on that general order).
This contention can normally be reduced to a single atomic exchange to
free memory and another to grab memory for coalescing. Specifically,
consider a structure something like:
struct free_block {
void *next;
size_t size;
size_t origin_threadID;
char payload[0];
};
Actually, the free-block only needs to be == to sizeof(void*):
Oh, I realize it doesn't need to be larger -- this wasn't an attempt at
an optimized implementation at all, but purely an attempt at
demonstrating the general idea in a minimum of code, so it includes no
"tricks" of any kind at all.
[ rest snipped -- I haven't had a chance to read through the cited
references yet... ]
--
Later,
Jerry.
The universe is a figment of its own imagination.