Re: Synchronization Algorithm Verificator for C++0x
"Dmitriy V'jukov" <dvyukov@gmail.com> wrote in message
news:8488efa1-9383-4232-b5a8-482852cee1f3@a1g2000hsb.googlegroups.com...
On 6 ???, 02:54, "Chris M. Thomasson" <n...@spam.invalid> wrote:
I would be interested to see how long it takes to detect the following
bug
you found in an older version of AppCore:
http://groups.google.com/group/comp.programming.threads/browse_frm/th...
Ok, let's see. I've just finished basic mutex and condvar
implementation.
Can you e-mail the new version to me please?
Here is eventcount:
class eventcount
{
[...]
void wait(unsigned cmp)
{
unsigned ec = count($).load(std::memory_order_seq_cst);
if (cmp == (ec & 0x7FFFFFFF))
{
guard.lock($);
ec = count($).load(std::memory_order_seq_cst);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I don't think that you need a memory barrier under the lock because the
only mutation which will make the following compare succeed is also
performed under the same lock. The signal-slow path is under the lock, and
the waiter slow-path is under the lock. However, I don't know if this
violates some rule in C++ 0x.
if (cmp == (ec & 0x7FFFFFFF))
{
waiters($) += 1;
cv.wait(guard, $);
}
guard.unlock($);
}
}
};
Here is the queue:
[...]
Here is the test:
[...]
It takes around 40 minutes.
Let's run it!
Here is output (note error detected on iteration 3):
struct spsc_queue_test
DEADLOCK
iteration: 3
Cool; it sure does detect the race-condition in the old AppCore! Very nice
indeed.
[...]
Operations on condition variables are not yet in execution history.
Let's replace 'ec.signal_relaxed()' with 'ec.signal()'. Here is the
output:
struct spsc_queue_test
[...]
iterations: 1000000
total time: 2422
throughput: 412881
Test passed. Throughput is around 400'000 test executions per second.
Beautiful. My eventcount algorihtm works!
=^D
Also test reveals some errors with memory fences.
In signaling function you are using acquire fence in cas, but relaxed
is enough here:
void signal_impl(unsigned cmp)
{
if (cmp & 0x80000000)
{
guard.lock($);
while (false == count($).compare_swap(cmp,
(cmp + 1) & 0x7FFFFFFF, std::memory_order_relaxed));
unsigned w = waiters($);
waiters($) = 0;
guard.unlock($);
if (w)
cv.notify_all($);
}
}
Right. The current public AppCore atomic API is not fine-grain enough to
allow for relaxed operations. I need to change that.
Also. Following naive approach doesn't work (at least in C++0x):
void signal()
{
std::atomic_thread_fence(std::memory_order_seq_cst);
signal_relaxed();
}
Good thing the MFENCE works in x86!
You have to execute sequentially consistent atomic RMW on 'ec'
variable here. Because sequentially consistent fence and sequentially
consistent atomic RMW operation doesn' t synchronize-with each other.
Hmmm... This is strange. But I can't find anything about this in
draft... Anthony Williams said that committee made some changes to
initial proposal on fences, but they are not yet published...
This is weird. a seq_cst fence does not work with seq_cst RMW!? Do you
happen to know the rational for this? IMVHO, it makes no sense at all... I
must be missing something.
:^/