Re: atomically thread-safe Meyers singleton impl (fixed)...
"Anthony Williams" <anthony.ajw@gmail.com> wrote in message
news:uhca74h7e.fsf@gmail.com...
"Chris M. Thomasson" <no@spam.invalid> writes:
"Anthony Williams" <anthony.ajw@gmail.com> wrote in message
news:u63qn63yk.fsf@gmail.com...
"Chris M. Thomasson" <no@spam.invalid> writes:
[...]
The algorithm used by boost::call_once on pthreads platforms is
described here:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2444.html
It doesn't use a
lock unless it has to and is portable across pthreads and win32
threads.
The code I posted does not use a lock unless it absolutely has to
because it attempts to efficiently take advantage of the double
checked locking pattern.
Oh yes, I realise that: the code for call_once is similar. However, it
attempts to avoid contention on the mutex by using thread-local
storage. If you have atomic ops, you can go even further in
eliminating the mutex, e.g. using compare_exchange and fetch_add.
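To make the "no mutex at all" idea concrete, here is a minimal sketch of a once-style initialiser built only on compare_exchange, assuming C++11 <atomic> is available. It is not the boost::call_once algorithm from N2444; the names atomic_once_flag, atomic_call_once and count_initialisations are hypothetical, and it assumes the initialiser does not throw. Threads that lose the race simply yield until the winner publishes completion.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical names throughout; an illustration, not the boost::call_once code.
constexpr int once_idle = 0;     // nobody has started initialising
constexpr int once_running = 1;  // one thread is running the initialiser
constexpr int once_done = 2;     // initialiser finished and published

struct atomic_once_flag {
    std::atomic<int> state{once_idle};
};

template <typename Fn>
void atomic_call_once(atomic_once_flag& flag, Fn fn) {
    // Fast path: a single acquire load once initialisation has completed.
    if (flag.state.load(std::memory_order_acquire) == once_done) return;

    int expected = once_idle;
    if (flag.state.compare_exchange_strong(expected, once_running,
                                           std::memory_order_acquire,
                                           std::memory_order_acquire)) {
        fn();  // this thread won the race; assumes fn does not throw
        flag.state.store(once_done, std::memory_order_release);
        return;
    }
    // Lost the race: wait until the winner publishes once_done.
    while (flag.state.load(std::memory_order_acquire) != once_done) {
        std::this_thread::yield();
    }
}

// Calls atomic_call_once from several threads and returns how many times
// the initialiser actually ran.
int count_initialisations(int thread_count) {
    atomic_once_flag flag;
    std::atomic<int> calls{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back([&] {
            atomic_call_once(flag, [&] {
                calls.fetch_add(1, std::memory_order_relaxed);
            });
        });
    }
    for (auto& t : threads) t.join();
    return calls.load();
}
```

Spinning with yield is the crudest possible way to wait; a real implementation would presumably block the losers on something cheaper than a busy loop.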
[...]
Before I reply to your entire post I should point out that:
http://groups.google.com/group/comp.lang.c++.moderated/msg/e39c7aff738f9102
the Boost mechanism is not 100% portable, but is elegant in
practice.
Yes. If you look at the whole thread, you'll see a comment by me there
where I admit as much.
Does the following line:
__thread fast_pthread_once_t _fast_pthread_once_per_thread_epoch;
explicitly set `_fast_pthread_once_per_thread_epoch' to zero? If so, is it
guaranteed?
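For comparison, the standard C++11 spelling of this is thread_local, and there an object with thread storage duration and no initialiser is zero-initialised. Whether a given compiler's __thread extension documents the same guarantee is a separate matter that the sketch below does not settle. The names epoch_t, per_thread_epoch, read_then_bump and first_value_in_new_thread are hypothetical stand-ins, not the identifiers from the paper.

```cpp
#include <cassert>
#include <thread>

typedef unsigned long epoch_t;  // hypothetical stand-in for fast_pthread_once_t

// No initialiser: C++11 zero-initialises each thread's copy.
thread_local epoch_t per_thread_epoch;

// Returns the calling thread's current value, then increments that copy.
epoch_t read_then_bump() {
    epoch_t seen = per_thread_epoch;
    per_thread_epoch = seen + 1;
    return seen;
}

// Returns the value a brand-new thread observes on its first read.
epoch_t first_value_in_new_thread() {
    epoch_t seen = 12345;  // sentinel, overwritten by the new thread
    std::thread t([&] { seen = read_then_bump(); });
    t.join();
    return seen;
}
```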
It uses a similar technique that a certain distributed
reference counting algorithm I created claims:
I wasn't aware that you were using something similar in vZOOM.
Humm, now that I think about it, it seems like I am totally mistaken. The
"most portable" version of vZOOM relies on the assumption that pointer
loads/stores are atomic, that unlocking a mutex executes at least a
release-barrier, and that loading a shared variable executes at least a
data-dependent load-barrier; very similar to RCU without the explicit
#LoadStore | #StoreStore before storing into a shared pointer location...
Something like:
____________________________________________________________________
struct foo {
    int a;
};

static foo* shared_f = NULL;

// single producer thread {
    foo* local_f = new foo;
    pthread_mutex_t* lock = get_per_thread_mutex();
    pthread_mutex_lock(lock);
    local_f->a = 666;
    pthread_mutex_unlock(lock);
    shared_f = local_f;
}

// single consumer thread {
    foo* local_f;
    while (! (local_f = shared_f)) {
        sched_yield();
    }
    assert(local_f->a == 666);
    delete local_f;
}
____________________________________________________________________
If the `pthread_mutex_unlock()' function does not execute at least a
release-barrier in the producer, or if the load of the shared variable does
not execute at least a data-dependent load-barrier in the consumer, the
"most portable" version of vZOOM will NOT work on that platform in any way,
shape or form; it will need a platform-dependent version. However, the only
platform I can think of where the intra-node memory visibility requirements
do not hold is the Alpha... For multi-node super-computers, inter-node
communication is adapted to using MPI.
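For readers with a C++11 compiler, the two assumptions above can be written down explicitly instead of being inherited from the mutex and the hardware. The following is only a sketch of the same single-producer/single-consumer hand-off, not vZOOM code: the release store stands in for the barrier that `pthread_mutex_unlock()' is assumed to provide, and the acquire load stands in for the data-dependent load-barrier. memory_order_acquire is stronger than a pure data-dependency barrier, so this should also hold on a machine such as the Alpha, at the cost of a real barrier there. The function names producer, consumer and run_handoff are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct foo {
    int a;
};

std::atomic<foo*> shared_f{nullptr};

// Single producer: initialise the object, then publish the pointer.
void producer() {
    foo* local_f = new foo;
    local_f->a = 666;
    // Release store: the write to local_f->a becomes visible no later
    // than the pointer itself.
    shared_f.store(local_f, std::memory_order_release);
}

// Single consumer: wait for the pointer, then read through it.
int consumer() {
    foo* local_f;
    // Acquire load: pairs with the release store in producer().
    while (!(local_f = shared_f.load(std::memory_order_acquire))) {
        std::this_thread::yield();
    }
    int seen = local_f->a;
    delete local_f;
    return seen;
}

// Runs one hand-off and returns the value the consumer observed.
int run_handoff() {
    shared_f.store(nullptr, std::memory_order_relaxed);
    int seen = 0;
    std::thread c([&] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```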