Re: Share .cpp and .h along projects
On Mon, 20 Aug 2007 09:57:14 -0500, "Ben Voigt [C++ MVP]"
<rbv@nospam.nospam> wrote:
"Doug Harrison [MVP]" <dsh@mvps.org> wrote in message
news:nefcc3hh5er85j8hdrklli59i553ve10tu@4ax.com...
On Fri, 17 Aug 2007 15:14:50 -0500, "Ben Voigt [C++ MVP]"
<rbv@nospam.nospam> wrote:
volatile std::vector* g_sharedVector;
...
{
    std::vector* pvector = NULL;
    // this is a spin-wait, efficient on SMP, but on a single processor
    // system a yield should be inserted
    while (pvector == NULL) {
        pvector = InterlockedExchangePointerAcquire(&g_sharedVector, NULL);
    }
    // use pvector in any way desired
With this sort of busy-waiting, it's more important than ever that "any way
desired" translate to "as short a time as possible".
Agreed. But that's true for any access to synchronized resources.
It's especially true when you present a method that uses 100% of the CPU
for an indefinite period of time.
    // with VC2005, can use "g_sharedVector = pvector;" instead
    InterlockedExchangePointerRelease(&g_sharedVector, pvector);
}
The sensible thing is to get rid of the pointers and use a CRITICAL_SECTION
along with the vector object instead of this clumsy, inefficient, obscure,
limited alternative to the way people have been doing things since the
beginning of Windows NT.
That's a higher level of abstraction for doing exactly the same thing.
Common sense dictates using the highest-level abstraction available unless
there's a good, specific reason to do otherwise. If, when someone says
"mutex", as I repeatedly did, you think "InterlockedXXX", well, it's just
hard to understand why.
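For concreteness, here is a minimal sketch of the CRITICAL_SECTION approach I
mean; the names and the int element type are mine, purely for illustration:

#include <windows.h>
#include <vector>

std::vector<int> g_shared;      // no volatile, no pointer juggling
CRITICAL_SECTION g_cs;          // call InitializeCriticalSection(&g_cs) once at startup

void AppendValue(int value)
{
    EnterCriticalSection(&g_cs);   // waits by blocking, not by spinning at 100% CPU
    g_shared.push_back(value);     // use the vector in any way desired
    LeaveCriticalSection(&g_cs);   // the next locker sees the update
}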
BTW, why _exactly_ did you use volatile in your declaration of
g_sharedVector? (Based on the declaration of
InterlockedExchangePointerAcquire, it shouldn't even compile.)
No answer? I really would like to hear what you think volatile accomplishes
here.
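For reference, here is the type mismatch I'm asking about, sketched out. The
<int> template argument is mine, added only so the declaration is well-formed,
and the SDK prototype is paraphrased from memory:

#include <vector>

// Paraphrased SDK declaration:
//   PVOID InterlockedExchangePointerAcquire(PVOID volatile *Target, PVOID Value);

volatile std::vector<int>* g_sharedVector;
// Here volatile qualifies the vector, not the pointer, so &g_sharedVector has
// type "volatile std::vector<int>**". The function expects "PVOID volatile*",
// a volatile pointer to void, and there is no implicit conversion between the
// two, so the call as originally written should be rejected without a cast.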
The compiler DOES apply those optimizations. If the code doesn't make
proper use of volatile and memory barriers to ensure that the correct data
is seen in other threads, then the code has a thread safety issue, not the
compiler.
I'll say it again:
<q>
You can't require people to use volatile on top of synchronization.
Synchronization needs to be sufficient all by itself, and it is in any
compiler useful for multithreaded programming. All you need is
synchronization.
</q>
As you've become fixated on memory barriers, I'll add that using
synchronization objects takes care of whatever memory barriers may be
needed. Most multithreaded programming is done in terms of mutexes and the
like, and thinking about memory barriers is not necessary when using
mutexes and the like.
Splitting functions into separate DLLs to prevent the optimizations is not
the right thing to do. It is fragile. For example, at one time simply
using two different compilation units within the same module would prevent
optimization. Now there is Link-Time Code Generation. Also, the .NET JIT
compiler does cross-assembly inlining and even native compilation can make
deductions and optimize across external calls using aliasing analysis.
As I've said a couple of times by now, "If the compiler could look into the
DLL, there would have to be some way to explicitly indicate that
lock/unlock are unsafe for this optimization." Do you understand I'm not
saying that the DLL approach is the be-all, end-all solution to the
problem? Do you understand what I meant when I said, "By putting
WaitForSingleObject, ReleaseMutex, and others in opaque system DLLs,
correct compiler behavior for MT programming WRT these operations
essentially comes for free." That means as long as the compiler doesn't
perform optimizations unsafe for multithreading around calls to these
functions, it does not need to define a way to mark their declarations
unsafe. It also means you don't have to use volatile, because the compiler
You keep making the same false claim. Here, let me show you a variant of
your code that is going to fail with an intelligent compiler, no peering
into the DLL necessary:
namespace {
    // assume the address of x is never taken
    int x;       // needs to be volatile
    mutex mx;
}

// None of these touch x.
void lock(mutex&);
void unlock(mutex&);
void g(int);

void f1()
{
    lock(mx);
    g(x);
    unlock(mx);
    lock(mx);
    g(x);
    unlock(mx);
}

void f2()
{
    lock(mx);
    ++x;
    unlock(mx);
}
As you stated "A compiler that can see into all these functions will observe
that none of lock, unlock, and g access x, so it can cache the value of x
across the mutex calls in f1." I tell you that, because x has file-local
linkage and the address of x is not taken, aliasing analysis in current
compilers proves that none of lock, unlock, or g access x -- without seeing
into the functions.
And I'll tell you again, you're not thinking this through. As I've
explained several times already, for a compiler useful for multithreading
to apply this optimization, it would have to prove there is no way x is
reachable from lock/unlock. This means proving there is no way f2 can be
called as a result of calling lock/unlock. The compiler cannot prove this
without being able to see into lock/unlock. This is the basis for what I've
said about opaque DLLs.
To a large extent, this is not even a multithreading issue. It also applies
to single-threaded code.
(My example does assume that f1 and f2 are called sometime, somewhere. If
one of them isn't, it's not very interesting.)
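To make the reachability point concrete, here is one way it can happen (a
sketch of my own, using a hypothetical callback-registration function exported
by the same opaque DLL) in which calling lock ends up calling f2, with no
second thread anywhere in sight:

// Continuing the example above. Hypothetical routine exported by the DLL that
// provides lock/unlock; it lets the program register a function the lock
// implementation may invoke (say, on contention or for diagnostics).
void register_lock_callback(void (*cb)());

void f2();                         // defined above; increments x under the mutex

void on_lock_event() { f2(); }     // x is now reachable from lock()

void setup()
{
    register_lock_callback(&on_lock_event);
}

// Once &on_lock_event has escaped into the DLL, the compiler cannot prove that
// lock(mx) in f1 never results in a call to f2, so it cannot cache x across
// the lock/unlock calls -- even in a single-threaded program.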
Using DLLs to inhibit optimization is broken, Broken, BROKEN!
Then stop bellowing and (a) Demonstrate that it doesn't work, and (b) Show
why the compiler doesn't perform unsafe optimizations around
WFSO/ReleaseMutex, EnterCriticalSection/LeaveCriticalSection, etc.
Adding "volatile" to the declaration of x fixes the problem.
Except that you cannot require people to use volatile on top of
synchronization.
must assume these functions can affect observable behavior involving the
objects you want to needlessly declare volatile, which as I've already
noted, is a huge performance killer plus completely impractical to use for
class objects.
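To put a concrete face on the "impractical for class objects" point, a minimal
sketch (mine):

#include <vector>

volatile std::vector<int> v;    // a volatile-qualified class object

void h()
{
    // Neither line below compiles: vector's member functions are not
    // volatile-qualified, so they cannot be called on a volatile object.
    // v.push_back(1);
    // v.size();
}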
It is *not* a performance killer when used correctly. Look at my original
example above and note that pvector is not declared volatile, only the
shared pointer is. Within the critical section all optimizations are
possible.
Before you make claims about your use of "volatile", answer the question I
posed last time:
BTW, why _exactly_ did you use volatile in your declaration of
g_sharedVector? (Based on the declaration of
InterlockedExchangePointerAcquire, it shouldn't even compile.)
This is an important question for you to answer in detail.
Using "volatile" is the only way to make code robust in the face of
improving optimizing compilers, and as a bonus, it is part of the C++
standard.
That's really quite funny. The C++ Standard does not address
multithreading, and it was recognized long ago that volatile is not the
answer or even a significant part of the answer. You might begin to
understand these things I've been talking about if you'd take the advice I
gave you a couple of messages ago:
<q>
You should google this group as well as comp.programming.threads for past
discussions on "volatile".
</q>
I've read several of those threads, some of which are in the hundreds of
responses.
Gee, I wonder why they get to be so long? :) FWIW, I'm not the only MVP who
has said things like "volatile is neither necessary nor sufficient" for
multithreaded programming using synchronization objects like mutexes. Of
course, we're getting this from people like Butenhof, who played a big part
in the pthreads standard.
Part of the problem is that MS has yet to publish a formal set of memory
visibility rules like POSIX did years ago, or I would have pointed you to
that. This leaves me to argue from experience writing MT code in VC++ and
also the fact that it would be a colossal blunder not to follow the POSIX
rules, which specifically do not require volatile on top of
synchronization. Also, I cannot recall ever hearing of a bug resulting from
not using volatile on variables consistently accessed under the protection
of a locking protocol involving CRITICAL_SECTIONs, kernel mutexes, and
other Windows synchronization objects. I cannot recall any MS documentation
that says volatile must be used on top of synchronization. The MFC library
doesn't use volatile, nor does the CRT use volatile for things it protects
with CRITICAL_SECTIONs, such as FILE objects.
For all these reasons, I think I'm on pretty safe ground (to put it mildly)
when I say that volatile is not required when using synchronization. If you
still disagree, I ask you to produce a counter-example.
I think I'm on equally safe ground WRT what I've said about DLLs. If you
still disagree, I ask you to produce a counter-example.
--
Doug Harrison
Visual C++ MVP