Re: Am I or Alexandrescu wrong about singletons?

From: James Kanze <james.kanze@gmail.com>
Newsgroups: comp.lang.c++.moderated
Date: Wed, 31 Mar 2010 13:40:48 CST
Message-ID: <bbd4bca1-2c16-489b-b814-98db0aafb492@z4g2000yqa.googlegroups.com>
On 31 Mar, 04:14, "Leigh Johnston" <le...@i42.co.uk> wrote:

"James Kanze" <james.ka...@gmail.com> wrote in
message news:da63ca83-4d6e-416a-9825-c24deed3e49f@10g2000yqq.googlegroups.com...

<snip>

Double checked locking can be made to work if you introduce
inline assembler or use some other technique to insert a
fence or a membar instruction in the appropriate places.
But of course, then, the volatile becomes superfluous.
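
For concreteness, here is roughly how that looks. This is only a
sketch, assuming g++'s __sync_synchronize() builtin (a full fence)
and pthreads; the class and variable names are invented for the
example, and note that ourInstance is *not* volatile:

    #include <pthread.h>

    class Singleton
    {
    public:
        static Singleton* instance()
        {
            Singleton* tmp = ourInstance;
            __sync_synchronize();   // order the load of the pointer
                                    // before any read through it
            if (tmp == 0) {
                pthread_mutex_lock(&ourMutex);
                tmp = ourInstance;
                if (tmp == 0) {
                    tmp = new Singleton;
                    __sync_synchronize();   // publish the fully
                                            // constructed object
                                            // before the pointer
                    ourInstance = tmp;
                }
                pthread_mutex_unlock(&ourMutex);
            }
            return tmp;
        }
    private:
        Singleton() {}
        static Singleton*      ourInstance;
        static pthread_mutex_t ourMutex;
    };

    Singleton*      Singleton::ourInstance = 0;
    pthread_mutex_t Singleton::ourMutex
                        = PTHREAD_MUTEX_INITIALIZER;

With the fences in place, declaring ourInstance volatile adds
nothing.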


It is only superfluous if there is a compiler guarantee that a
load/store for a non-volatile variable is emitted in the
presence of a fence, which sounds like a dubious guarantee to
me. Which compilers stop performing optimizations in the
presence of a fence, and/or how does the compiler know which
variable accesses can be optimized in the presence of a
fence?


All of the compilers I know either treat inline assembler or an
external function call to a function written in assembler as a
worst case with regards to optimizing, and do not move code
across it, or they provide a means of specifying to the
compiler which variables, etc. are affected by the assembler.
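
With g++, for example, the usual idiom is an asm statement with a
"memory" clobber; the clobber is exactly the means of telling the
compiler that the assembler may read or write any variable, so
nothing can be cached across it. A minimal sketch, assuming x86:

    int ready = 0;          // deliberately not volatile

    void waitForReady()
    {
        while (ready == 0) {
            // The "memory" clobber tells g++ that this assembler
            // may touch arbitrary memory, so the load of 'ready'
            // cannot be hoisted out of the loop.  The mfence is
            // the hardware fence; "" ::: "memory" alone would be
            // only a compiler barrier.
            asm volatile ("mfence" ::: "memory");
        }
    }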

This is also the counter-example you are looking for; it
should work on some implementations.


It's certainly not an example of a sensible use of volatile,
since without the membar/fence, the algorithm doesn't work
(at least on most modern processors, which are multicore).
And with the membar/fence, the volatile is superfluous.


Read what I said above.


I have. But it doesn't hold water.

FWIW VC++ is clever enough to make the volatile redundant
for this example; however, adding volatile makes no
difference to the generated code (read: no performance
penalty), and I like making such things explicit, similar to
how one uses const (it doesn't affect the generated output
but documents the programmer's intentions).


The use of a fence or membar (or some system specific
"atomic" access) would make the intent explicit. The use of
volatile suggests something completely different (memory
mapped IO, or some such).
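
Something along these lines, say; just a sketch, assuming VC++
and Win32, with an invented status variable:

    #include <windows.h>

    extern int theStatus;   // shared with another thread;
                            // deliberately not volatile

    int snapshotStatus()
    {
        // The explicit barrier is what says "inter-thread
        // communication happens here"; VC++ will not move the
        // load below across it.
        MemoryBarrier();
        return theStatus;
    }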


Obviously we disagree on this point hence the reason for the
existence of this argument we are having.


Yes. Theoretically, I suppose, you could find a compiler which
documented that it would move code across a fence or a membar
instruction. In practice: either the compiler treats assembler
as a black box, and supposes that it might do anything, or it
analyses the assembler, and takes the assembler into account
when optimizing. In the first case, the compiler must
synchronize its view of the memory, because it must suppose
that the assembler reads and writes arbitrary values from
memory. And in the second (which is fairly rare), it recognizes
the fence, and adjusts its optimization accordingly.

Your argument is basically that the compiler writers are either
completely incompetent, or that they are intentionally out to
make your life difficult. In either case, there are a lot more
things that they can do to make your life difficult. I wouldn't
use such a compiler, because it would be, in effect, unusable.

<snip>

The only volatile in my entire codebase is for the "status" of
my "threadable" base class; I don't always acquire a lock
before checking this status, and I don't fully trust that the
optimizer won't cache it for all the cases that might crop up
as I develop code.


I'd have to see the exact code to be sure, but I'd guess that
without an mfence somewhere in there, the code won't work on a
multicore machine (which is just about everything today), and
with the mfence, the volatile isn't necessary.


The code does work on a multi-core machine and I am confident
it will continue to work when I write new code precisely
because I am using volatile and am therefore guaranteed that a
load will be emitted, not optimized away.


If you have the fence in the proper place, you're guaranteed
that it will work, even without volatile. If you don't, you're
not guaranteed anything.
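
To be concrete about "the proper place": the usual pattern is a
fence between writing the data and setting the status on the
writer's side, and a fence between reading the status and reading
the data on the reader's side. A sketch only, with invented names,
again using the g++ builtin:

    struct Results { /* ... */ };

    Results theResults;
    int     theDone = 0;        // not volatile

    // Writer thread: publish the results, then flip the flag.
    void finish(Results const& r)
    {
        theResults = r;
        __sync_synchronize();   // results visible before the flag
        theDone = 1;
    }

    // Reader thread: poll the flag without taking a lock.
    bool tryGet(Results& out)
    {
        if (theDone == 0)
            return false;
        __sync_synchronize();   // don't read the results early
        out = theResults;
        return true;
    }

Drop either fence and the reader can see the flag set but read
stale results, volatile or no; keep them and the volatile buys
nothing.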

Also, at least under Solaris, if there is no contention, the
execution time of pthread_mutex_lock is practically the same
as that of membar. Although I've never actually measured
it, I suspect that the same is true if you use
CriticalSection (and not Mutex) under Windows.
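
(A quick and dirty way to check, for anyone who cares to; this is
only a rough sketch, not a serious benchmark:)

    #include <pthread.h>
    #include <stdio.h>
    #include <time.h>

    int main()
    {
        pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
        long const      n = 10000000;

        clock_t t0 = clock();
        for (long i = 0; i < n; ++i) {
            pthread_mutex_lock(&m);     // never contended here
            pthread_mutex_unlock(&m);
        }
        clock_t t1 = clock();
        for (long i = 0; i < n; ++i) {
            __sync_synchronize();       // full hardware fence
        }
        clock_t t2 = clock();

        printf("lock/unlock: %g s, fence: %g s\n",
               (t1 - t0) / double(CLOCKS_PER_SEC),
               (t2 - t1) / double(CLOCKS_PER_SEC));
        return 0;
    }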


Critical sections are expensive when compared to a simple load
that is guaranteed by using volatile. It is not always
necessary to use a fence, as all a fence is doing is
guaranteeing order, so it all depends on the use-case.


I'm not sure I follow. Basically, the fence guarantees that the
hardware can't do certain optimizations: the same optimizations
that the software can't do in the case of volatile. If you
think you need volatile, then you certainly need a fence. (And
if you have the fence, you no longer need the volatile.)

--
James Kanze

      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated. First time posters: Do this! ]
