Re: atomically thread-safe Meyers singleton impl (fixed)...
"Anthony Williams" <anthony.ajw@gmail.com> wrote in message
news:ud4kv4fbp.fsf@gmail.com...
"Chris M. Thomasson" <no@spam.invalid> writes:
[...]
the Boost mechanism is not 100% portable, but is elegant in
practice.
Yes. If you look at the whole thread, you'll see a comment by me there
where I admit as much.
Does the following line:
__thread fast_pthread_once_t _fast_pthread_once_per_thread_epoch;
explicitly set `_fast_pthread_once_per_thread_epoch' to zero? If so,
is it guaranteed?
The algorithm assumes it does, but it depends on which compiler you
use. In the Boost implementation, the value is explicitly initialized
(to ~0; I found it worked better with exception handling to count
backwards).
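For illustration only (this is not the actual Boost source, and the
typedef is just a stand-in), the kind of explicit initialization I mean
looks something like this:

/* Illustrative sketch only, not the real Boost code: initialize the
   per-thread epoch explicitly instead of relying on the compiler to
   zero-initialize thread-local storage. */
typedef unsigned long fast_pthread_once_t; /* stand-in typedef */

__thread fast_pthread_once_t _fast_pthread_once_per_thread_epoch =
    ~(fast_pthread_once_t)0; /* start at ~0 and count backwards */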
It uses a technique similar to one that a certain distributed
reference counting algorithm I created relies on:
I wasn't aware that you were using something similar in vZOOM.
Humm, now that I think about it, it seems like I am totally
mistaken. The "most portable" version of vZOOM relies on the assumptions
that pointer loads/stores are atomic, that unlocking a mutex executes at
least a release-barrier, and that loading a shared variable executes at
least a data-dependent load-barrier; very similar to RCU without the
explicit #LoadStore | #StoreStore before storing into a shared pointer
location... Something like:
// single producer thread {
foo* local_f = new foo;
pthread_mutex_t* lock = get_per_thread_mutex();
pthread_mutex_lock(lock);
local_f->a = 666;
pthread_mutex_unlock(lock);
shared_f = local_f;
So you're using the lock just for the barrier properties. Interesting
idea.
Yes. Actually, I did not show the whole algorithm; STUPID ME!!! As
posted, the code above is busted because the store to shared_f can
legally be hoisted up above the unlock. Here is the whole picture...
Each thread has a special dedicated mutex which is locked from its
birth... Here is exactly how production of an object can occur:
static foo* volatile shared_f = NULL;
// single producer thread {
00: foo* local_f;
01: pthread_mutex_t* const mem_mutex = get_per_thread_mem_mutex();
02: local_f = new foo;
03: local_f->a = 666;
04: pthread_mutex_unlock(mem_mutex);
05: pthread_mutex_lock(mem_mutex);
06: shared_f = local_f;
}
Here are the production rules wrt POSIX:
1. Steps 02-03 CANNOT sink below step 04
2. Step 06 CANNOT rise above step 05
3. vZOOM assumes that step 04 has a release barrier
Those __two guarantees__ and that __single assumption__ ensure that the
ordering and visibility of the operations are correct. After that, the
consumer can do:
// single consumer thread {
00: foo* local_f;
01: while (! (local_f = shared_f)) {
02: sched_yield();
}
03: assert(local_f->a == 666);
04: delete local_f;
}
Consumption rules:
01: vZOOM assumes that the load from `shared_f' will have an implied
data-dependent load-barrier.
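To make that concrete, here is a minimal compilable sketch of the whole
pattern under the assumptions above. get_per_thread_mem_mutex() appears
in the code earlier in this post; the lazy-locking implementation shown
here is just one plausible way to get a per-thread mutex that behaves as
if it were locked from the thread's birth, not the actual vZOOM code:

// Minimal sketch only -- not the actual vZOOM code. It assumes pointer
// loads/stores are atomic, that pthread_mutex_unlock() gives at least a
// release-barrier, and that the dependent load on the consumer side gives
// at least a data-dependent load-barrier.
#include <pthread.h>
#include <sched.h>
#include <cassert>
#include <cstddef>

struct foo { int a; };

static foo* volatile shared_f = NULL;

// Hypothetical helper: each thread gets its own mutex, locked from "birth"
// (here: lazily, on first call) and held until the next publication cycle.
static pthread_mutex_t* get_per_thread_mem_mutex() {
  static __thread pthread_mutex_t mem_mutex = PTHREAD_MUTEX_INITIALIZER;
  static __thread bool locked_from_birth = false;
  if (! locked_from_birth) {
    pthread_mutex_lock(&mem_mutex);
    locked_from_birth = true;
  }
  return &mem_mutex;
}

static void* producer(void*) {
  pthread_mutex_t* const mem_mutex = get_per_thread_mem_mutex();
  foo* const local_f = new foo;
  local_f->a = 666;
  pthread_mutex_unlock(mem_mutex); // release: the writes above cannot sink below
  pthread_mutex_lock(mem_mutex);   // the store below cannot rise above this lock
  shared_f = local_f;              // assumed-atomic pointer store
  return NULL;
}

static void* consumer(void*) {
  foo* local_f;
  while (! (local_f = shared_f)) { // assumed-atomic pointer load; implied
    sched_yield();                 // data-dependent load-barrier assumed
  }
  assert(local_f->a == 666);
  delete local_f;
  return NULL;
}

int main() {
  pthread_t p, c;
  pthread_create(&c, NULL, consumer, NULL);
  pthread_create(&p, NULL, producer, NULL);
  pthread_join(p, NULL);
  pthread_join(c, NULL);
  return 0;
}

Note that the producer leaves its mutex locked at the end of the cycle,
which is what keeps the "locked from its birth" invariant intact for the
next publication.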
BTW, here is a brief outline of how the "most portable" version of vZOOM
distributed reference counting works with the above idea:
http://groups.google.ru/group/comp.programming.threads/msg/59e9b6e427b4a144
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/fe24fe99f742ce6e
(an __excellent__ question from Dmitriy...)
What do you think Anthony?