Re: atomics and memory model: The release sequence and synchronizes-with
{ Please limit your text to fit within 80 columns, preferably around 70,
so that readers don't have to scroll horizontally to read each line.
This article has been reformatted manually by the moderator. -mod }
On Sunday, 17 February 2013 03:56:22 UTC, fmatthew5876 wrote:
I've been reading the memory model chapter of "C++ Concurrency in
Action" and there is one part of section 5.3.4 that I don't quite get.
Consider the following modified example from the book:
#include <atomic>
#include <thread>

SomeType data[20];
std::atomic<int> count;

void produce() {
    //Fill up data with some meaningful values
    count.store(20, std::memory_order_release);
}

void consume() {
    while(true) {
        int index;
        if((index = count.fetch_sub(1,std::memory_order_acquire)) <= 0) {
            wait_for_more_items();
            continue;
        }
        //Index must be unique, 2 threads cannot get the same index
        do_something_with_data(data[index-1]);
    }
}

int main() {
    std::thread a(produce);
    std::thread b(consume);
    std::thread c(consume);
    a.join();
    b.join();
    c.join();
}
So the idea here is that thread a will fill up the data array and
then set the number of elements in the array.
Threads b and c will spin until the count is set and then start consuming
unique items in parallel. We don't want threads b and c to ever try to
consume the same item.
Now to me this looks like a bug. The problem is with the line:
    count.fetch_sub(1,std::memory_order_acquire)
fetch_sub is a read-modify-write operation. The acquire assures us it
will synchronize with the initial store to 20 from thread a. However there
is no release on the store part of the fetch_sub, which to me looks like
the store of fetch_sub from thread b will not synchronize with the load of
fetch_sub from thread c, allowing a possible situation where both threads
b and c could read the same value for index.
This is part of the "magic" of RMW operations --- they guarantee that they
operate on the "latest" value of the variable. Two RMW ops A and B that
operate on the same variable are required to be ordered in some way by the
compiler/library/processor, so that either A is before B or B is before A.
This ordering only applies to operations on that variable unless memory
ordering constraints are used to confer ordering on operations on other
variables, or the RMW op is part of a release sequence.
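
The point about RMW ops always seeing the "latest" value can be checked
with a small sketch. It is not from the book or the original post:
all_indices_unique and its parameters are hypothetical names chosen for
illustration. Several threads claim indices with fetch_sub using only
memory_order_relaxed, and the function then verifies that no index was
handed out twice.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper (not from the post): n_threads threads claim indices
// from a shared counter with relaxed fetch_sub; returns true if every index
// in [0, n_items) was claimed exactly once.
bool all_indices_unique(int n_threads, int n_items) {
    assert(n_threads > 0 && n_items > 0);
    std::atomic<int> count(n_items);
    // Each thread records the indices it claimed in its own vector.
    std::vector<std::vector<int>> per_thread(n_threads);
    std::vector<std::thread> workers;

    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&count, &per_thread, t] {
            for (;;) {
                // Relaxed: no ordering for other memory, but the
                // read-modify-write itself is still atomic.
                int index = count.fetch_sub(1, std::memory_order_relaxed);
                if (index <= 0) break;          // nothing left to claim
                per_thread[t].push_back(index - 1);
            }
        });
    }
    for (auto& w : workers) w.join();

    // After join() it is safe to inspect what each thread recorded.
    std::vector<int> hits(n_items, 0);
    for (const auto& claimed : per_thread)
        for (int i : claimed) ++hits[i];
    for (int h : hits)
        if (h != 1) return false;               // duplicate or missing index
    return true;
}
```

Calling all_indices_unique(4, 100000) from main should return true even
though no acquire or release ordering is requested: uniqueness comes from
the atomicity of the RMW op, not from its memory ordering.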
If I were to write this, I would think to use
count.fetch_sub(1,std::memory_order_acq_rel).
However, the use of memory_order_acquire according to the book is in fact
correct because of something called the "release sequence."
From my limited understanding, a release sequence starts with an initial
store with release, acq_rel, or seq_cst and ends with a final load with
acquire, consume, or seq_cst. The book says that inside the release
sequence you can have any number of read-modify-write operations with
*any* memory ordering.
Can anyone elaborate on why this works? I think I sort of get it but it still
seems like voodoo.
Since the RMW ops on a given variable provide a total order, ordering
constraints can be passed "down the chain".
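
As a rough sketch of the chain described above (again hypothetical:
consume_all, its parameters, and the int payload are stand-ins, not the
book's code), the producer below publishes plain non-atomic data with a
release store, and each consumer claims items with fetch_sub using only
memory_order_acquire. A consumer whose fetch_sub reads a value written by
another consumer's fetch_sub still synchronizes with the producer's store,
because that earlier fetch_sub is part of the release sequence headed by
the store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the example in the post: returns the sum of all
// items read by the consumers, which should be 1 + 2 + ... + n_items.
long consume_all(int n_consumers, int n_items) {
    assert(n_consumers > 0 && n_items > 0);
    std::vector<int> data(n_items, 0);          // plain, non-atomic payload
    std::atomic<int> count(0);
    std::vector<long> partial(n_consumers, 0);  // one sum per consumer

    std::thread producer([&] {
        for (int i = 0; i < n_items; ++i) data[i] = i + 1;
        // Head of the release sequence: publishes the writes to data.
        count.store(n_items, std::memory_order_release);
    });

    std::vector<std::thread> consumers;
    for (int t = 0; t < n_consumers; ++t) {
        consumers.emplace_back([&, t] {
            // Relaxed spin: waits for the store but creates no
            // synchronization on its own.
            while (count.load(std::memory_order_relaxed) == 0)
                std::this_thread::yield();
            for (;;) {
                // Acquire RMW: reads the store or a later fetch_sub in its
                // release sequence, so the writes to data are visible.
                int index = count.fetch_sub(1, std::memory_order_acquire);
                if (index <= 0) break;
                partial[t] += data[index - 1];
            }
        });
    }

    producer.join();
    for (auto& c : consumers) c.join();

    long total = 0;
    for (long p : partial) total += p;
    return total;
}
```

With 3 consumers and 1000 items, consume_all(3, 1000) should return 500500.
The relaxed spin is only there so the sketch terminates; it does not
contribute to the ordering, which comes entirely from the release store and
the acquire fetch_sub calls.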
On x86 this is easy: all atomic RMW ops are memory_order_seq_cst at the
processor level (though if they are not tagged memory_order_seq_cst in
the source the compiler may reorder things around them).
On other architectures the compiler may need to issue fences or barriers
or other synchronization instructions, but it is required to work.
If RMW ops on a given variable didn't form a total order then they would
lose their usefulness --- there would be no benefit over separate
read/modify/store ops if you could get two fetch_subs reading the same
value.
Anthony