Christopher Merrill wrote:

One of the big issues is synchronization, especially
of memory reads and writes. For example, if we have

int shared_value = 0;
Mutex shared_value_mutex;

void thread_a() {
  shared_value += 10;
}

void thread_b() {
  shared_value += 20;
}

Maybe I'm misunderstanding the question, but if you define
shared_value as volatile int instead of just int, doesn't that
instruct the compiler to never cache shared_value in a

It instructs the compiler to take some implementation defined
precautions. In most of the compilers I use, it does exactly
what you say. And no more, which makes it pretty useless with
regard to thread safety (or much of anything else, for that
matter---volatile isn't sufficient even for memory mapped IO on
a Sparc, at least not as implemented by Sun CC or g++).

You shouldn't forget, either, that this is a simple example. In
real life, the shared_value might be a much more complex data
structure, and the update might involve many memory accesses.
It would be necessary for all of the accesses to be volatile
qualified. And volatile, implemented in a way that has meaning
in a multithreaded environment, has a very high cost,
multiplying access times by 5 or more; this would be
unacceptable (and unnecessary) for most applications.

Or is there another way this simple mutex scheme can be

With Posix synchronization methods (and I'm pretty sure the same
holds for Windows), you don't need volatile here. Posix
synchronization methods guarantee sufficient memory
synchronization for this to work.
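
To make that concrete, here is a minimal sketch of the example
using Posix mutexes, with shared_value left as a plain int. The
helper names locked_add and run_both_threads are my own invention
for illustration, and the placement of the lock and unlock calls
is an assumption about what the original scheme intended.

```cpp
#include <cassert>
#include <pthread.h>

// Plain int, no volatile: the mutex calls supply the memory synchronization.
static int shared_value = 0;
static pthread_mutex_t shared_value_mutex = PTHREAD_MUTEX_INITIALIZER;

// Hypothetical helper: adds `amount` to shared_value while holding the lock.
static void locked_add(int amount) {
    pthread_mutex_lock(&shared_value_mutex);
    shared_value += amount;
    pthread_mutex_unlock(&shared_value_mutex);
}

extern "C" void* thread_a(void*) { locked_add(10); return nullptr; }
extern "C" void* thread_b(void*) { locked_add(20); return nullptr; }

// Hypothetical driver: runs both threads to completion and returns the result.
int run_both_threads() {
    shared_value = 0;  // no other thread exists yet, so no lock is needed here
    pthread_t a, b;
    int rc = pthread_create(&a, nullptr, thread_a, nullptr);
    assert(rc == 0);
    rc = pthread_create(&b, nullptr, thread_b, nullptr);
    assert(rc == 0);
    (void)rc;  // silences the unused warning when NDEBUG disables assert
    pthread_join(a, nullptr);
    pthread_join(b, nullptr);
    return shared_value;  // pthread_join also synchronizes memory
}
```

Whichever thread gets the lock first, run_both_threads() returns
30, since each read-modify-write happens entirely inside the
locked region.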

The following might be of interest to the OP:

Not much help to me---they don't compile on my platform:-).

I'm not sure if the use of volatile in them is necessary; I
suspect that Microsoft would guarantee that their compiler
assumes access in embedded assembler, and so won't optimize
across it. The critical part which makes these functions work
(if they do work) is the synchronization guarantees of the lock
prefix in Intel's IA-32 architecture. On my own platform (Sun
sparc), I have one or two similar routines, which also use
special instructions (membar, on a Sparc) never generated by the
compiler. (Arguably, accessing an object through a volatile
qualifier should generate such instructions. It doesn't with
the compilers I have access to.)
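
For what it's worth, here is a sketch of how the same effect
could be requested portably, assuming a compiler that supports
C++11 (an assumption on my part; nothing above depends on it).
With std::atomic, the compiler itself is responsible for emitting
whatever locked instruction or barrier the platform requires, so
neither volatile nor embedded assembler is needed. The names
atomic_value, atomic_add and run_atomic_threads are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared counter; std::atomic stands in for volatile plus assembler.
static std::atomic<int> atomic_value(0);

// Atomically adds `amount` and returns the previous value.
// The default memory ordering is sequentially consistent.
int atomic_add(int amount) {
    return atomic_value.fetch_add(amount);
}

// Hypothetical driver: two threads each perform one atomic addition.
int run_atomic_threads() {
    atomic_value.store(0);
    std::thread a(atomic_add, 10);  // the return value of atomic_add is ignored
    std::thread b(atomic_add, 20);
    a.join();
    b.join();
    int result = atomic_value.load();
    assert(result == 30);  // neither update can be lost
    return result;
}
```

This only covers a single word, of course. For an update which
involves several memory accesses, you still need a mutex.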

