Re: Singleton_pattern and Thread Safety

From: Leigh Johnston <leigh@i42.co.uk>
Newsgroups: comp.lang.c++
Date: Sat, 11 Dec 2010 18:57:35 +0000
Message-ID: <aI-dnTRysdoEVJ7QnZ2dnUVZ7sSdnZ2d@giganews.com>

On 11/12/2010 18:47, Chris M. Thomasson wrote:

"Leigh Johnston"<leigh@i42.co.uk> wrote in message
news:m4adnexxGJ1aMZ7QnZ2dnUVZ8jOdnZ2d@giganews.com...

On 11/12/2010 16:05, Chris M. Thomasson wrote:

"Leigh Johnston"<leigh@i42.co.uk> wrote in message
news:kY-dnahdNL25H57QnZ2dnUVZ7vOdnZ2d@giganews.com...
[...]

Hmm, I think I see why I might need the first barrier: is it because loads
made from the singleton object before the pointer check could cause
problems for *clients* of the function? Any threading experts care to
explain?

http://lwn.net/Articles/5159

http://mirror.linux.org.au/linux-mandocs/2.6.4-cset-20040312_2111/read_barrier_depends.html

http://groups.google.com/group/comp.lang.c++.moderated/msg/e500c3b8b6254f35

Basically, the only architecture out there which requires a data-dependent
acquire barrier after the initial atomic load of the shared instance
pointer is a DEC Alpha...

[...]

Thanks, so in summary my version should work on my implementation
(IA-32/VC++) and probably would work on other implementations except DEC
Alpha for which an extra barrier would be required.
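
A minimal sketch of the load side, written with C++11 std::atomic rather
than the platform primitives discussed in this thread (that substitution is
an assumption, and the names Payload, publish, consume_when_ready and
run_demo are hypothetical). It uses an acquire load, which is stronger than
the data-dependent barrier Alpha needs, so it should cover that case too:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical payload type, used only for this illustration.
struct Payload {
    int value;
};

// Shared pointer through which the object is published.
std::atomic<Payload*> g_shared{nullptr};

// Producer side: the release store keeps earlier writes to *p from being
// reordered after the pointer becomes visible.
void publish(Payload* p) {
    g_shared.store(p, std::memory_order_release);
}

// Consumer side: the acquire load keeps the later read of p->value from
// being satisfied before the pointer itself was loaded.
int consume_when_ready() {
    Payload* p;
    while ((p = g_shared.load(std::memory_order_acquire)) == nullptr) {
        std::this_thread::yield();
    }
    return p->value;
}

// Runs one producer (the calling thread) and one consumer thread.
int run_demo() {
    static Payload payload{0};
    g_shared.store(nullptr, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&seen] { seen = consume_when_ready(); });
    payload.value = 42;   // write the object first...
    publish(&payload);    // ...then publish its address
    reader.join();
    return seen;
}
```

Calling run_demo() should return the value written before the pointer was
published.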

Are you referring to this one:

http://groups.google.com/group/comp.lang.c++/msg/547148077c2245e2

I believe I have kind of solved the problem. I definitely see what you are
doing here and agree that you can get a sort of "portable" acquire/release
memory barrier by using locks. However, the lock portion of a mutex only
has to contain an acquire barrier, which happens to be the _wrong_ type for
producing an object. I am referring to the following snippet of your code:

<Leigh Johnston thread-safe version of Meyers singleton>
___________________________________________________________
static T& instance()
{
00:     if (sInstancePtr != 0)
01:         return static_cast<T&>(*sInstancePtr);
02:     { // locked scope
03:         lib::lock lock1(sLock);
04:         static T sInstance;
05:         { // locked scope
06:             lib::lock lock2(sLock); // second lock should emit memory barrier here
07:             sInstancePtr = &sInstance;
08:         }
09:     }
10:     return static_cast<T&>(*sInstancePtr);
}
___________________________________________________________

Line `06' does not produce the correct memory barrier. Instead, you can try
something like this:

<pseudo-code and exception safety aside for a moment>
___________________________________________________________
struct thread
{
     pthread_mutex_t m_acquire;
     pthread_mutex_t m_release;

     virtual void user_thread_entry() = 0;

     static void thread_entry_stub(void* x)
     {
         thread* const self = static_cast<thread*>(x);
         // m_release is a non-static member, so it must be reached via self
         pthread_mutex_lock(&self->m_release);
         self->user_thread_entry();
         pthread_mutex_unlock(&self->m_release);
     }
};

template<typename T>
T& meyers_singleton()
{
    static T* g_global = NULL;
    T* local = ATOMIC_LOAD_DEPENDS(&g_global);

    if (! local)
    {
        static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
        thread* const self_thread =
            static_cast<thread*>(pthread_getspecific(...));

        pthread_mutex_lock(&g_mutex);
00:     static T g_instance;

        // simulated memory release barrier
01:     pthread_mutex_lock(&self_thread->m_acquire);
02:     pthread_mutex_unlock(&self_thread->m_release);
03:     pthread_mutex_lock(&self_thread->m_release);
04:     pthread_mutex_unlock(&self_thread->m_acquire);

        // atomically produce the object
05:     ATOMIC_STORE_NAKED(&g_global, &g_instance);
06:     local = &g_instance;

        pthread_mutex_unlock(&g_mutex);
    }

    // the function returns T&, so dereference the pointer
    return *local;
}
___________________________________________________________

The code "should work" under very many existing POSIX implementations. Here
is why...

- The implied release barrier contained in line `02' cannot rise above line
`01'.

- The implied release barrier in line `02' cannot sink below line `03'.

- Line `04' cannot rise above line `03'.

- Line `03' cannot sink below line `04'.

- Line `00' cannot sink below line `02'.

- Lines `05, 06' cannot rise above line `03'.

Therefore the implied release barrier contained in line `02' will always
execute _after_ line `00' and _before_ lines `05, 06'.
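
The ordering argued for above (construct, release barrier, then store the
pointer) can also be sketched with C++11 std::atomic and std::mutex instead
of the mutex dance. This is only an illustration under that assumption, not
the code from this thread; Widget and all_threads_see_same_instance are
hypothetical names, and a C++11 compiler already makes the function-local
static initialisation thread safe on its own:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical type that counts how often it is constructed.
struct Widget {
    static std::atomic<int> constructions;
    Widget() { constructions.fetch_add(1, std::memory_order_relaxed); }
};
std::atomic<int> Widget::constructions{0};

template <typename T>
T& meyers_singleton() {
    static std::atomic<T*> g_global{nullptr};
    static std::mutex g_mutex;

    // Fast path: acquire load pairs with the release store below.
    T* local = g_global.load(std::memory_order_acquire);
    if (!local) {
        std::lock_guard<std::mutex> guard(g_mutex);
        static T g_instance;          // constructed once (line `00')
        local = &g_instance;
        // The release store plays the role of the simulated release
        // barrier plus the naked store (lines `01' to `05').
        g_global.store(local, std::memory_order_release);
    }
    return *local;
}

// Calls the singleton from n threads and checks they all got one object.
bool all_threads_see_same_instance(int n) {
    std::vector<Widget*> seen(n, nullptr);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i) {
        threads.emplace_back(
            [&seen, i] { seen[i] = &meyers_singleton<Widget>(); });
    }
    for (auto& t : threads) t.join();
    for (Widget* w : seen) {
        if (w != seen[0]) return false;
    }
    return true;
}
```

Every thread should end up with the same address, and the Widget
constructor should run exactly once.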

Keep in mind that there are some fairly clever mutex implementations that do
not necessarily have to execute any memory barriers for a lock/unlock pair.
Think exotic asymmetric mutex impl. I have not seen any in POSIX
implementations yet, but I have seen them used for implementing internals of
a Java VM...

[...]

Thanks for the info. At the moment I am only concerned with the IA-32/VC++
implementation, which should be safe. I could add specific barriers to
my lock class when porting to other implementations (not something I
plan on doing any time soon).
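
A rough sketch of what such a lock class might look like, assuming C++11
std::mutex and std::atomic_thread_fence are available (fenced_lock and
count_with_fenced_lock are hypothetical names, not my actual lib::lock).
Constructing the lock issues a full fence, so it would also act as the
release barrier that a plain mutex lock is not required to provide:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a lock class with explicit barriers.
class fenced_lock {
public:
    explicit fenced_lock(std::mutex& m) : m_(m) {
        m_.lock();
        // Full fence after acquiring: earlier writes are ordered before
        // anything done inside the locked scope.
        std::atomic_thread_fence(std::memory_order_seq_cst);
    }
    ~fenced_lock() {
        // Full fence before releasing the mutex.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        m_.unlock();
    }
    fenced_lock(const fenced_lock&) = delete;
    fenced_lock& operator=(const fenced_lock&) = delete;

private:
    std::mutex& m_;
};

// Increments a shared counter from several threads under fenced_lock.
int count_with_fenced_lock(int thread_count, int iterations) {
    std::mutex m;
    int counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&m, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                fenced_lock guard(m);
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

The fences are redundant wherever the mutex already gives full ordering,
so the cost only matters on implementations that need them.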

/Leigh