Re: atomic memory_order with command or with fence
I will give this one example, where I'm pretty sure I can demonstrate
what I mean. However, my question remains whether it's always so.
On May 29, 10:54 pm, Zoltan Juhasz <zoltan.juh...@gmail.com> wrote:
On Sunday, 27 May 2012 17:08:33 UTC-4, itaj sherman wrote:
I'll put this code in functions, to clarify the context of x and r:

template< typename T >
void my_store_release_1( std::atomic<T>& x, T r )
{
    x.store( r, memory_order_release );
}

template< typename T >
void my_store_release_2( std::atomic<T>& x, T r )
{
    std::atomic_thread_fence( memory_order_release );
    x.store( r, memory_order_relaxed );
}
Disclaimer: I am most certainly not an expert in this area, but based
on my current understanding of the topic, I believe these are not
the same. Hopefully someone who has more experience will clarify.
A fence or atomic store operation that is marked with
'memory_order_release' introduces an inter-thread happens-before
relationship for store operations that appear before the
'memory_order_release' fence or atomic store operation - given
it is paired with an acquire counterpart.
But in order to synchronize a release fence with an acquire fence,
you need an atomic variable and a store to it sequenced after the
release fence, whose value is read by a load that is sequenced
before the acquire fence.
standard 29.8-p2:
A release fence A synchronizes with an acquire fence B if there
exist atomic operations X and Y, both operating on some atomic
object M, such that A is sequenced before X, X modifies M, Y is
sequenced before B, and Y reads the value written by X or a value
written by any side effect in the hypothetical release sequence X
would head if it were a release operation.
Operations X and Y in my code were meant to be x.store and x.load,
and this is why I deliberately ordered them before or after the
fence inside the functions.
Conversely, it introduces no happens-before relationship for
operations that appear after the store / fence marked with
'memory_order_release', with regard to their visibility in another
thread. In this case the fence, marked with 'memory_order_release',
introduces no happens-before relationship for the store to x, with
regard to the visibility of that store in another thread, since the
store appears after the fence.
Right, it doesn't order x; I didn't mean for it to. The point was
for x to cause a synchronization (an optional one) of the fences,
so that stores that were sequenced before the release fence are
certainly visible to loads that happen after the acquire fence.
So I can show the following use example, in which I think 1 and 2
are equivalent. But I'm looking for an answer as to whether it is
always true.
std::atomic<int> atomic_data( 0 );
std::atomic<int> atomic_flag( 0 ); //change flag to 1 when data can be read

//thread#1
int data;
std::cin >> data;
atomic_data.store( data, memory_order_relaxed );
my_store_release_XXX( atomic_flag, 1 ); //XXX is one of the above versions

//thread#2
int const current_flag = my_load_acquire_XXX( atomic_flag );
int const current_data = atomic_data.load( memory_order_relaxed );
if( current_flag == 1 ) {
    //the atomic_flag store_release synchronizes with the load_acquire,
    //therefore the atomic_data store happens before the load
    std::cout << "data arrived " << current_data; //must be what came in on std::cin
} else {
    //no certain synchronization
    std::cout << "no flag for data arrived "; //data may be 0, may be already changed
}
So I expect we should agree, without further explanation, that when
using my_store_release_1/my_load_acquire_1 this example works as
expected (per standard 1.10).
Now, regarding 29.8-p2, I assert that using my versions
my_store_release_2/my_load_acquire_2 this should work just the same,
at least in this example, because the code would convert to:
//inlining the functions of versions 2:

//thread#1
int data;
std::cin >> data;
atomic_data.store( data, memory_order_relaxed );
std::atomic_thread_fence( memory_order_release ); // <-- fence A
atomic_flag.store( 1, memory_order_relaxed ); // <-- store operation X

//thread#2
int const current_flag = atomic_flag.load( memory_order_relaxed ); // <-- load operation Y
std::atomic_thread_fence( memory_order_acquire ); // <-- fence B
int const current_data = atomic_data.load( memory_order_relaxed );
if( current_flag == 1 ) {
    //in this case, the value of current_flag implies that fence A
    //synchronized with fence B per 29.8-p2
    std::cout << "data arrived " << current_data; //must be what came in on std::cin
} else {
    //no certain synchronization
    std::cout << "no flag for data arrived "; //data may be 0, may be already changed
}
I will also assert that it will also work (in this example) when
changing just one of the function versions, thus mixing
my_store_release_1/my_load_acquire_2 or
my_store_release_2/my_load_acquire_1.
But this example is just one case; I want to know whether they are
always equivalent.
I believe if you write:

template< typename T >
void my_store_release_3( std::atomic<T>& x, T r )
{
    x.store( r, memory_order_relaxed );
    std::atomic_thread_fence( memory_order_release );
}

then 1 and 3 are equivalent, as far as the introduced inter-thread
happens-before relationship is concerned.
The same I would ask about load and acquire:

template< typename T >
T my_load_acquire_1( std::atomic<T>& x )
{
    T const r = x.load( memory_order_acquire );
    return r;
}

template< typename T >
T my_load_acquire_2( std::atomic<T>& x )
{
    T const r = x.load( memory_order_relaxed );
    std::atomic_thread_fence( memory_order_acquire );
    return r;
}
The situation is similar here: the 'memory_order_acquire' fence does
not impose a happens-before relationship on the load of x, since the
load appears before the fence. The correct way is:

template< typename T >
T my_load_acquire_3( std::atomic<T>& x )
{
    std::atomic_thread_fence( memory_order_acquire );
    T const r = x.load( memory_order_relaxed );
    return r;
}
On the other hand, I don't see that it would work with your version 3.
It actually seems like a counter-example:

//inlining the functions of versions 3:

//thread#1
int data;
std::cin >> data;
atomic_data.store( data, memory_order_relaxed );
atomic_flag.store( 1, memory_order_relaxed ); // <-- store operation X
std::atomic_thread_fence( memory_order_release ); // <-- fence A

//thread#2
std::atomic_thread_fence( memory_order_acquire ); // <-- fence B
int const current_flag = atomic_flag.load( memory_order_relaxed ); // <-- load operation Y
int const current_data = atomic_data.load( memory_order_relaxed );
if( current_flag == 1 ) {
    //it might be possible to load the value of store operation X even
    //when fence A has not occurred yet; in such a case, it is uncertain
    //what value of atomic_data is loaded
    std::cout << "data arrived " << current_data;
} else {
    std::cout << "no flag for data arrived ";
}
itaj
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]