Re: C++ Threads, what's the status quo?

From: Zeljko Vrba <zvrba.nospam@gampen.ifi.uio.no>
Newsgroups: comp.lang.c++.moderated
Date: 13 Jan 2007 04:45:48 -0500
Message-ID: <slrneqh4ik.eo4.zvrba@gampen.ifi.uio.no>

On 2007-01-12, Seungbeom Kim <musiphil@bawi.org> wrote:

> One problem is that such a declaration is already a valid one with
> different meaning: it says the function returns a 'volatile int'.

Yes, this occurred to me shortly after I had sent the message. Barring
volatile code-blocks (btw, I forgot to write that the compiler would be
forbidden to reorder statements within volatile{}, not just across it) and
functions, what about

#pragma volatile(function_name)

This opens the question of "which function" if it is overloaded, so
the compiler would require the function name to be unambiguous in that
compilation unit. Anyway, to return to the topic.. my thoughts on
multithreading in *any* language:

The variety of different CPUs and OSes makes threading inherently
nonportable. Compared to hand-coded assembly for the target machine,
the compiler has two opportunities to mess up code that seems to be
correctly written:

1. reordering program statements: synchronization mechanisms critically
   depend on correct sequential execution of statements
2. unavailability of atomic register/memory and memory/memory operations on
   the target architecture (e.g. it would be convenient to be able to assume
   that *x += 3; will generate an atomic RMW instruction when available)
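
To make point 2 concrete, here is a minimal sketch of an update that is
requested as one indivisible RMW instead of hoping that a plain *x += 3;
compiles to one. It assumes a C++11-style std::atomic and std::thread,
which are my assumptions for illustration only; the function name
add_three_concurrently is hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: each thread performs `iterations` atomic "+= 3"
// updates on one shared counter, and the final value is returned.
int add_three_concurrently(int thread_count, int iterations) {
    std::atomic<int> x{0};  // assumption: std::atomic stands in for an atomic int
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&x, iterations] {
            for (int i = 0; i < iterations; ++i) {
                // One indivisible read-modify-write; no update can be lost.
                x.fetch_add(3, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // joining makes every increment visible to this thread
    }
    return x.load();
}
```

With a plain int and x += 3; in the loop, the same program would have a
data race and could lose updates; with the atomic RMW the result is always
thread_count * iterations * 3.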

"One size fits all" C and C++ standards have resulted in a heap of various
requirements that, when unfulfilled, result in undefined behaviour (a trivial
example is behaviour of signed integer overflow). My opinion is that it
would be a mistake to try to unify the above two concepts in a standard
specifying a number of operational details such as "what happens
when two threads concurrently read a variable", or "how is *x += 3; handled"
(I guess the answer would be "UB if the target architecture doesn't have
atomic RMW instructions"). I point out the latter two cases because, from
current posts, I have a feeling that this is the direction in which the
efforts are going..

The compiler needs to provide only two mechanisms:

1. A mechanism that lets the programmer specify that a sequence of statements
   shall be executed in the exact order in which these statements have been
   specified. Ambiguities such as f(g(), h()) shall be flagged as an error
   and the programmer shall be forced to use intermediate variables.

2. A mechanism to specify that an access to a variable shall be atomic.
   Thus, if a global variable x is to be used from two threads in an
   expression like x += b; then it would have to be declared as something
   like "atomic int x;". If the target architecture doesn't have an atomic
   RMW instruction, this shall be an error. Assumptions for the correct
   execution of the program (availability of appropriate atomic instructions)
   are violated, therefore the program has to be rewritten. Much better
   than declaring the program as having UB.
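
A rough sketch of how both mechanisms might look from the programmer's
side. Everything here is an assumption made for illustration: std::atomic
and the ATOMIC_INT_LOCK_FREE macro (both C++11) stand in for the proposed
"atomic int x;", and f, g, h, call_in_order and add_to_x are hypothetical
names. Note that the intermediate variables only pin down the order in
which g and h are evaluated; they are not a substitute for the proposed
volatile{} block.

```cpp
#include <atomic>
#include <string>

// Mechanism 2 (sketch): refuse to compile on a target where int has no
// always-lock-free atomic operations, instead of silently falling back.
static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "target lacks always-lock-free atomic int operations");

std::atomic<int> x{0};  // stands in for the proposed "atomic int x;"

// Hypothetical helpers that record the order in which they are called.
int g(std::string& log) { log += 'g'; return 1; }
int h(std::string& log) { log += 'h'; return 2; }
int f(int a, int b) { return a * 10 + b; }

// Mechanism 1 (sketch): in f(g(), h()) the order of g and h is unspecified;
// intermediate variables make it "g first, then h".
int call_in_order(std::string& log) {
    const int a = g(log);
    const int b = h(log);
    return f(a, b);
}

// The shared update from the text, "x += b;", performed as one atomic RMW.
int add_to_x(int b) {
    x += b;           // atomic read-modify-write on std::atomic<int>
    return x.load();  // current value, for inspection
}
```

On a target where the static_assert fails, the build stops with the given
message, which is the "error, not UB" behaviour argued for above.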

Once these two mechanisms are in place, everything else can be provided by
a library. Questions like "when is *x += 3; committed to memory" should
_not_ be addressed by the standard. If a memory barrier is needed, it can
be provided as a library function, and the code rewritten as

volatile { *x += 3; barrier(); }

The point is that it is the *programmer's* responsibility to decide whether
the barrier is necessary, not the compiler's.
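
As a sketch of barrier() as an ordinary library function, the following
assumes a C++11-style std::atomic_thread_fence is available to implement
it. That is my assumption for illustration, as are the names payload,
ready, writer, reader and run_once. The programmer, not the compiler,
decides where the fences go.

```cpp
#include <atomic>
#include <thread>

// Hypothetical library function standing in for barrier() in the text.
inline void barrier() {
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

int payload = 0;                 // plain data published by the writer
std::atomic<bool> ready{false};  // flag the reader polls

void writer() {
    payload += 3;  // the "*x += 3;" step
    barrier();     // programmer-chosen fence: payload is ordered before the flag
    ready.store(true, std::memory_order_relaxed);
}

int reader() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();  // wait until the writer raises the flag
    }
    barrier();       // pairs with the writer's fence
    return payload;  // guaranteed to observe the writer's update
}

// Runs one writer and one reader and returns what the reader saw.
int run_once() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread r([&seen] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return seen;
}
```

Without the two barrier() calls the relaxed flag would not order the
accesses to payload, and the reader could legally see a stale value.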

--
      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated. First time posters: Do this! ]