Re: C++0x: release sequence

From: Anthony Williams <anthony.ajw@gmail.com>
Newsgroups: comp.programming.threads,comp.lang.c++
Date: Mon, 16 Jun 2008 20:36:33 +0100
Message-ID: <uy755mata.fsf@gmail.com>

"Dmitriy V'jukov" <dvyukov@gmail.com> writes:

On Jun 16, 3:09 pm, Anthony Williams <anthony....@gmail.com> wrote:

Relaxed ordering is intended to be minimal overhead on all systems, so
it provides no ordering guarantees. On systems that always provide the
ordering guarantees, putting memory_order_acquire on the fetch_add is
probably minimal overhead. On systems that truly exhibit relaxed
ordering, requiring that the relaxed fetch_add participate in the
release sequence could add considerable overhead.

Consider my example above on a distributed system where the processors
are conceptually "a long way" apart, and data synchronization is
explicit.

With the current WP, processor 2 only needs to synchronize access to
y. If the relaxed op featured in the release sequence, it would need
to also handle the synchronization data for x, so that processor 3 got
the "right" values for x and y.

In your example, yes, one has to use a non-relaxed rmw. But consider the
following example:

struct object
{
    std::atomic<int> rc;
    int data;

    void acquire()
    {
        rc.fetch_add(1, std::memory_order_relaxed);
    }

    void release()
    {
        if (1 == rc.fetch_sub(1, std::memory_order_release))
        {
            std::atomic_fence(std::memory_order_acquire);
            data = 0;
            delete this;
        }
    }
};

object* g_obj;

void thread1();
void thread2();
void thread3();

int main()
{
    g_obj = new object;
    g_obj->data = 1;
    g_obj->rc = 3;

    thread th1 = start_thread(&thread1);
    thread th2 = start_thread(&thread2);
    thread th3 = start_thread(&thread3);

    join_thread(th1);
    join_thread(th2);
    join_thread(th3);
}

void thread1()
{
    volatile int data = g_obj->data;
    g_obj->release(); // T1-1
}

void thread2()
{
    g_obj->acquire(); // T2-1
    g_obj->release(); // T2-2
    g_obj->release(); // T2-3
}

void thread3()
{
    g_obj->release(); // T3-1
}

From the point of view of the current C++0x draft, this code contains a race
on g_obj->data. But I think this code is perfectly legal from the hardware
point of view.

I guess it depends on your hardware. The relaxed fetch_add says "I
don't care about ordering", yet your code blatantly does care about
the ordering. I can't help thinking it should be
fetch_add(1,memory_order_acquire).
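
As a minimal sketch of that suggestion, the reference count could look like the following. Several things here are assumptions and not part of the quoted code: the names counted_object and run_scenario are hypothetical, std::thread stands in for the start_thread/join_thread helpers, std::atomic_thread_fence is the spelling used in the published C++11 standard, and release() reports the last reference instead of calling "delete this" so that the sketch can run on a stack object.

```cpp
#include <atomic>
#include <thread>

// Hypothetical variant of the quoted `object`, with a non-relaxed increment.
struct counted_object {
    std::atomic<int> rc{0};
    int data = 0;

    void acquire() {
        // acquire instead of relaxed, so the increment is a non-relaxed rmw
        rc.fetch_add(1, std::memory_order_acquire);
    }

    // Returns true if this call dropped the last reference.
    bool release() {
        if (rc.fetch_sub(1, std::memory_order_release) == 1) {
            std::atomic_thread_fence(std::memory_order_acquire);
            data = 0;  // stands in for the cleanup done before `delete this`
            return true;
        }
        return false;
    }
};

// Runs the three-thread scenario from the quoted example and returns how many
// release() calls observed the last reference.
int run_scenario() {
    counted_object obj;
    obj.data = 1;
    obj.rc.store(3);
    std::atomic<int> last_seen{0};

    std::thread t1([&] {
        volatile int data = obj.data;  // read before dropping the reference
        (void)data;
        if (obj.release()) ++last_seen;  // T1-1
    });
    std::thread t2([&] {
        obj.acquire();                   // T2-1
        if (obj.release()) ++last_seen;  // T2-2
        if (obj.release()) ++last_seen;  // T2-3
    });
    std::thread t3([&] {
        if (obj.release()) ++last_seen;  // T3-1
    });

    t1.join();
    t2.join();
    t3.join();
    return last_seen.load();
}
```

Whatever the interleaving, exactly one release() call should see the count drop from 1 to 0, so run_scenario() is expected to return 1.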

Consider the following order of execution:
T1-1
T2-1 - here the release sequence is broken, because of the relaxed rmw
T2-2 - but here the release sequence is effectively "resurrected from the
dead", because the thread that executed the relaxed rmw now executes a
non-relaxed rmw
T2-3
T3-1
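
The counter values along that interleaving can be replayed on a single thread. This is only a sketch of the arithmetic, not of the memory-ordering behaviour, and replay_interleaving is a hypothetical helper name.

```cpp
#include <atomic>
#include <vector>

// Hypothetical helper: replays T1-1, T2-1, T2-2, T2-3, T3-1 in that order on
// one thread and records the value each read-modify-write observed.
std::vector<int> replay_interleaving() {
    std::atomic<int> rc{3};  // initial count set in main()
    std::vector<int> seen;
    seen.push_back(rc.fetch_sub(1, std::memory_order_release));  // T1-1
    seen.push_back(rc.fetch_add(1, std::memory_order_relaxed));  // T2-1
    seen.push_back(rc.fetch_sub(1, std::memory_order_release));  // T2-2
    seen.push_back(rc.fetch_sub(1, std::memory_order_release));  // T2-3
    seen.push_back(rc.fetch_sub(1, std::memory_order_release));  // T3-1
    return seen;
}
```

The observed values are 3, 2, 3, 2, 1. The last one is 1, so in this order it is T3-1 that takes the "1 == ..." branch and writes data, which is why that write has to be ordered after the read in thread1 that precedes T1-1.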

And there's the rub: I don't think this is sensible. You explicitly
broke the release sequence with the relaxed fetch_add, so you can't
resurrect it.

T2-1 is not ordered wrt the read of g_obj->data in thread1. If it
needs to be ordered, it should say so.

Anthony
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL