Re: C++ Memory Management Innovation: GC Allocator

From: "Chris Thomasson" <cristom@comcast.net>
Newsgroups: comp.lang.c++.moderated
Date: Thu, 24 Apr 2008 01:43:35 CST
Message-ID: <-dudnUtxBYpLl43VnZ2dnUVZ_oKhnZ2d@comcast.com>

"xushiwei" <xushiweizh@gmail.com> wrote in message
news:9be66f7e-e89e-4a16-b976-7da3f7565e82@y38g2000hsy.googlegroups.com...
[...]

The differences between StreamFlow and ScopeAlloc (or AutoFreeAlloc)
are:

1. StreamFlow uses thread local storage, while ScopeAlloc doesn't.

There are benefits for using TLS. Think of a simple contrived scenario like:

void function() {
    // I need to allocate from m_alloc...
    // How can I do this without adding any parameters?
}

void thread() {
    GenericAllocator m_alloc;
    function();
}

AFAICT, your allocator can use TLS as-is... Basically, something like:

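void function() {
    // pthread_getspecific returns void*, so C++ needs an explicit cast.
    GenericAllocator* const pm_alloc =
        static_cast<GenericAllocator*>(pthread_getspecific(...));
    // Now I can allocate from m_alloc through pm_alloc! :^D
}

void thread() {
    GenericAllocator m_alloc;
    pthread_setspecific(..., &m_alloc);
    function();
}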

Don't you think that your design could "possibly" benefit from using TLS?
IMVHO, it would increase its flexibility...
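
For what it's worth, in later C++ (C++11 onward) the same idea can be sketched with the `thread_local` keyword instead of a pthread key. This is only a minimal sketch: `GenericAllocator` here is a hypothetical stand-in that wraps `std::malloc`, and `tls_alloc`, `ScopedTlsAllocator`, `do_work` and `thread_body` are names made up for illustration, not part of anyone's library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the allocator under discussion.
struct GenericAllocator {
    std::size_t live = 0;  // blocks currently outstanding
    void* allocate(std::size_t sz) { ++live; return std::malloc(sz); }
    void deallocate(void* p) { if (p) { --live; std::free(p); } }
};

// One pointer per thread; plays the role of the pthread key.
thread_local GenericAllocator* tls_alloc = nullptr;

// RAII helper: publishes an allocator for the current thread and
// restores the previous pointer when the scope ends.
class ScopedTlsAllocator {
    GenericAllocator* prev_;
public:
    explicit ScopedTlsAllocator(GenericAllocator& a) : prev_(tls_alloc) {
        tls_alloc = &a;
    }
    ~ScopedTlsAllocator() { tls_alloc = prev_; }
    ScopedTlsAllocator(const ScopedTlsAllocator&) = delete;
    ScopedTlsAllocator& operator=(const ScopedTlsAllocator&) = delete;
};

// No allocator parameter needed: the function looks it up.
void* do_work() {
    assert(tls_alloc && "no allocator published for this thread");
    return tls_alloc->allocate(64);
}

// Body of a thread (or any scope) that owns an allocator.
// Returns the number of live blocks seen right after the allocation.
std::size_t thread_body() {
    GenericAllocator m_alloc;
    ScopedTlsAllocator bind(m_alloc);
    void* p = do_work();
    const std::size_t live_after_alloc = m_alloc.live;
    m_alloc.deallocate(p);
    return live_after_alloc;
}
```

The allocator itself stays a plain non-static instance; only the pointer to it is thread-local, so the scoped design is not given up.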

2. StreamFlow provides global allocation procedures, while ScopeAlloc
uses non-static allocator instances to allocate memory.

StreamFlow directs the global allocation functions directly to the
calling thread's heap... Something like:

void* malloc(size_t sz) {
    PerThreadHeap* const _thisheap =
        static_cast<PerThreadHeap*>(pthread_getspecific(...));
    return _thisheap->allocate(sz);
}

void free(void* ptr) {
    PerThreadHeap* const _thisheap =
        static_cast<PerThreadHeap*>(pthread_getspecific(...));
    _thisheap->deallocate(ptr);
}

Using global allocation functions does not limit performance in
any way, shape or form.
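
A compilable variant of the forwarding idea, as a rough sketch only: `PerThreadHeap` is a hypothetical placeholder that counts calls and wraps `std::malloc`, and the entry points are named `my_malloc`/`my_free` (made-up names) so that they do not replace the C library's functions. It uses a C++11 function-local `thread_local` object, which is an assumption about the toolchain and not how StreamFlow itself is written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical per-thread heap; a real one would manage its own pages.
struct PerThreadHeap {
    std::size_t allocs = 0;
    std::size_t frees = 0;
    void* allocate(std::size_t sz) { ++allocs; return std::malloc(sz); }
    void deallocate(void* p) {
        assert(p != nullptr);  // null is filtered out by my_free
        ++frees;
        std::free(p);
    }
};

// Constructed lazily on first use in each thread, destroyed at thread exit.
PerThreadHeap& this_thread_heap() {
    thread_local PerThreadHeap heap;
    return heap;
}

// Global entry points: no allocator instance is passed around.
void* my_malloc(std::size_t sz) {
    return this_thread_heap().allocate(sz);
}

void my_free(void* ptr) {
    if (ptr) this_thread_heap().deallocate(ptr);
}
```

Note that this sketch hands a block to the heap of whichever thread calls `my_free`; a block freed by a thread other than its origin needs the remote-deallocation handling discussed further down.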

3. ScopeAlloc forbids you to deallocate memory yourself.

IMVHO, many low-level programs "need" the flexibility to be able
to control exactly when a piece of memory should be returned to
the system.

GC Allocator doesn't forbid you to create in thread A and free in
thread B, but it is not recommended. However, GC Allocator forbids you
to deallocate memory manually. So, if you implement algorithms that
are related to t (that is, it is uncertain how much time the algorithm
will spend), GC Allocator doesn't fit you directly.

What does the algorithm look like for remote deallocations? Here is
how I do it in my vZOOM allocator:
______________________________________________________________
- create a thread-local instance of a user-defined single-threaded
allocator in every thread (e.g., ms heap w/ HEAP_NO_SERIALIZE).

- allocation requests are forwarded to this thread-local user allocator
directly.

- if a free request comes from the thread that allocated the block (i.e., the
origin thread), then the free request is forwarded to that thread-local user
allocator.

- if a free request comes from another thread, then the block is accumulated
in a per-thread stack-based freelist "belonging to the origin thread", using
a single atomic CAS.

- blocks from this freelist are actually reused/freed in batches using a
single atomic SWAP when the thread allocates/deallocates another block. For
instance, a thread that fails to allocate from its thread-local heap will do
a SWAP on the freelist and try to fulfill the allocation request from
there.
______________________________________________________________
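
The steps above can be sketched roughly as follows. This is not the vZOOM code, just a minimal illustration under stated assumptions: `ThreadHeap`, `SimpleHeap`, `Header`, `bind()` and `drain()` are all hypothetical names; every block carries a small header naming its origin heap; a heap must outlive its blocks; and drained blocks are handed back to the user allocator before a retry instead of being reused directly.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical single-threaded user allocator; anything with this shape fits.
struct SimpleHeap {
    void* allocate(std::size_t sz) { return std::malloc(sz); }
    void deallocate(void* p) { std::free(p); }
};

template <class UserAlloc>
class ThreadHeap {
    // Sits in front of every block: records the origin heap and doubles
    // as the link in the remote freelist.
    struct alignas(std::max_align_t) Header {
        ThreadHeap* owner;
        Header* next;
    };

    UserAlloc user_;                        // touched only by the owner thread
    std::atomic<Header*> remote_{nullptr};  // other threads push here

    static ThreadHeap*& current() {
        thread_local ThreadHeap* heap = nullptr;
        return heap;
    }

public:
    // Bind this heap to the calling thread (stands in for pthread_setspecific).
    void bind() { current() = this; }

    // Owner thread only.
    void* allocate(std::size_t sz) {
        void* raw = user_.allocate(sizeof(Header) + sz);
        if (!raw) {
            // Local heap failed: reclaim remotely freed blocks, then retry.
            if (drain() == 0) return nullptr;
            raw = user_.allocate(sizeof(Header) + sz);
            if (!raw) return nullptr;
        }
        Header* h = new (raw) Header{this, nullptr};
        return h + 1;  // payload starts right after the header
    }

    // May be called from any thread.
    static void deallocate(void* p) {
        if (!p) return;
        Header* h = static_cast<Header*>(p) - 1;
        ThreadHeap* origin = h->owner;
        if (origin == current()) {
            origin->user_.deallocate(h);  // local free: forward directly
            return;
        }
        // Remote free: lock-free push (CAS) onto the origin's stack.
        Header* head = origin->remote_.load(std::memory_order_relaxed);
        do {
            h->next = head;
        } while (!origin->remote_.compare_exchange_weak(
            head, h, std::memory_order_release, std::memory_order_relaxed));
    }

    // Owner thread only: take the whole remote list with one atomic SWAP
    // and return each block to the user allocator. Returns the block count.
    std::size_t drain() {
        assert(current() == this);
        Header* h = remote_.exchange(nullptr, std::memory_order_acquire);
        std::size_t n = 0;
        while (h) {
            Header* next = h->next;
            user_.deallocate(h);
            h = next;
            ++n;
        }
        return n;
    }
};
```

Because remote threads only ever push and the owner only ever takes the entire list at once, the usual ABA hazard of a lock-free stack pop does not come up in this arrangement. Each thread would create one `ThreadHeap<SimpleHeap>`, call `bind()` on it, and then use `allocate()` and the static `deallocate()` as usual.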

My algorithm works with basically any single-threaded allocator. AFAICT,
it would even work with your existing code. You would not need to change
anything. In fact, an end-user could plug your code into this algorithm,
and it would just work.

Any thoughts?

[...]
