Re: C++1x: lock-free algos and blocking
"Dmitriy Vyukov" <dvyukov@gmail.com> wrote in message
news:9ca49d0d-5704-4863-ac5a-0d872de5f0f2@k19g2000yqc.googlegroups.com...
On 18 ???, 21:03, "Chris M. Thomasson" <n...@spam.invalid> wrote:
C++1x provides a complete toolset for lock-based programming: mutex
for ensuring consistency and condition variable for blocking/
signaling.
However, as for lock-free (non-blocking) programming C++1x provides
only atomics for ensuring consistency. And what about thread blocking/
signaling?
[...]
Humm. Well, the "simplest" algorithm I know of is basically equal to the
original eventcount algorithm I posted here, except it uses different
waitbit value and increments the counter by an even number:
Should something like this be standardized? I am not so sure here simply
because one can rather easily use C++1x to create a 100% portable
implementation of an eventcount... Humm...
Chris, probably your are right in the sense that if one builds lock-
free algorithms then he must be able to build blocking/signaling logic
too. And std::condition_variable+std::mutex is a sufficient toolset
for that.
Since you've mentioned an eventcount, I know what you will hear next -
"write your proposal" :)
:^)
Btw, here is "a more C++" interface for an eventcount, I've designed
for a contribution to Intel TBB. It supports RAII waiting, and
signaling with arbitrary predicate, with which you may encode for
example "wakeup only thread that waits for exactly this
element" (useful for producer-consumer queues), or "wakeup all threads
that satisfy this condition but at least 1 thread" (useful for work-
stealing environments, condition means data affinity): such fine-
grained signaling avoids unnecessary thread wakeups and thundering
herd, the idea is that it's better to do some more work on producer
thread but avoid boatload of senseless thread wakeups/blocks:
Totally agreed. The ability to signal a single thread is definitely useful.
Furthermore, the ability to signal a subset of waiting threads which satisfy
some conditions is also very useful. You're implementation that is based on
TLS makes this easy. If you're fine with `SCHED_OTHER' then a complete naive
implementation with linked-list of waiters per-eventcount is fine. FWIW, you
can get single thread signaling with a tweaked form of the eventcount
algorithm I created. It goes something like:
_________________________________________________________
class eventcount
{
public:
typedef unsigned long key_type;
private:
mutable std::atomic<key_type> m_count;
unsigned int m_waiters;
std::mutex m_mutex;
std::condition_variable m_cond;
private:
void prv_signal(key_type key, bool broadcast)
{
if (key & 1)
{
m_mutex.lock();
unsigned int waiters = m_waiters;
key_type xchg;
do
{
xchg = key + 2;
if (waiters < 2 || broadcast)
{
xchg &= ~1;
}
}
while (! m_count($).compare_exchange_weak(
key,
xchg,
std::memory_order_seq_cst));
m_mutex.unlock();
if (waiters)
{
if (waiters == 1 || ! broadcast)
{
m_cond.notify_one();
}
else
{
m_cond.notify_all();
}
}
}
}
public:
eventcount()
: m_count(0),
m_waiters(0)
{
}
public:
key_type get() const
{
return m_count.fetch_or(1, std::memory_order_acquire);
}
void signal()
{
prv_signal(m_count.fetch_add(0, std::memory_order_seq_cst), false);
}
void signal_relaxed()
{
prv_signal(m_count.load(std::memory_order_relaxed), false);
}
void broadcast()
{
prv_signal(m_count.fetch_add(0, std::memory_order_seq_cst), true);
}
void broadcast_relaxed()
{
prv_signal(m_count.load(std::memory_order_relaxed), true);
}
void wait(key_type cmp)
{
m_mutex.lock();
key_type cur = m_count.load(std::memory_order_seq_cst);
if ((cur & ~1) == (cmp & ~1))
{
++m_waiters;
m_cond.wait(m_mutex);
--m_waiters;
}
m_mutex.unlock();
}
};
_________________________________________________________
Of course this is not sufficient for a conditional signaling technique.
class eventcount::wait_guard
{
public:
wait_guard(eventcount& ec, void* ctx = 0);
void commit_wait();
};
And predicate for notify must be of the form:
bool pred(void* ctx, size_t thread_count, size_t thread_index);
// ctx is a arbitrary values passed to prepare_wait (can be allocated
on a stack of a blocked thread)
// return value determines unblock thread or not
That's definitely useful!
:^)