Re: Threading in new C++ standard

From: Szabolcs Ferenczi <szabolcs.ferenczi@gmail.com>
Newsgroups: comp.lang.c++,comp.soft-sys.ace
Date: Thu, 22 May 2008 09:32:05 -0700 (PDT)
Message-ID: <403ee25b-c6d3-4fd2-975b-98f148180de8@79g2000hsk.googlegroups.com>

On May 1, 8:07 am, Owen Jacobson <angrybald...@gmail.com> wrote:

On Apr 25, 5:04 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:

...
No. Several people have pointed out several times where the
language addresses concurrency issues.

Oh yes, you are one of those who continuously keep saying that C++0x
addresses concurrency issues at the language level but who fail to
name here even a single language element for it.

If the only "language element" your world-view will admit is a new
syntactic construct, then you're right and the new C++ standard does
not contain any language elements to support threading.

It is at least honest of you to admit that "the new C++ standard does
not contain any language elements to support threading."

However,
that's an extremely limiting definition, not shared by very many
people.

Do you mean hoi polloi or do you mean people who are experienced in
computer language design too?

A comprehensive memory model is *required* for correct threaded code,
and it's something C++ does not yet have.

Memory is orthogonal to a high-level programming language. A
programming language deals with abstractions called variables, which
in turn are mapped onto physical memory. But memory does not appear
directly at the language level, hence there is no need for any memory
model at the language level. The memory model is something the Java
folks started to talk about for some strange reason, and it became a
trendy topic. However, in a well-designed high-level language there is
no need for it at all.

At the most basic level, a
memory model is a set of rules dictating what writes are and are not
allowed to affect a given read.

So far it has nothing to do with concurrent activities.

In the following simple example:

struct foo {
short a;
short b;
};

foo a_foo;

a memory model provides hard and fast rules about whether or not a
read of a_foo.b is allowed to be affected by writes to a_foo.a, or to
a_foo itself.

Well, without any `memory model' the answer is obvious. No,
conceptually it is not allowed. When was it allowed? If the compiler
can figure out that it can optimise it, that must be transparent to
the language level. If it is not transparent, it is a bug in the
compiler.

I guess what you mean is an optimisation for the sequential execution
of programs. With respect to concurrent programs, on the other hand,
it is a premature optimisation.
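To make the disputed case concrete, here is a minimal sketch in which two threads write to the two adjacent members with no lock at all. It assumes C++11 `std::thread` (the facility C++0x eventually standardised); the names `write_a`, `write_b` and `run_foo_demo` are invented for this illustration. The worry behind the example is that a compiler could implement a store to one `short` as a read-modify-write of a wider word and thereby disturb its neighbour. Under the C++11 memory model, `a` and `b` are separate memory locations, so an implementation may not do that and the program below has no data race.

```cpp
#include <thread>

struct foo {
    short a;
    short b;
};

foo a_foo = {0, 0};

// Touches only a_foo.a; no lock is taken.
void write_a() {
    for (int i = 0; i < 1000; ++i) {
        a_foo.a = static_cast<short>(a_foo.a + 1);
    }
}

// Touches only a_foo.b; no lock is taken.
void write_b() {
    for (int i = 0; i < 1000; ++i) {
        a_foo.b = 5;
    }
}

// Hypothetical driver: runs both writers concurrently and waits for them.
void run_foo_demo() {
    a_foo.a = 0;
    a_foo.b = 0;
    std::thread ta(write_a);
    std::thread tb(write_b);
    ta.join();
    tb.join();
    // After both joins, a_foo.a is 1000 and a_foo.b is 5: neither
    // thread's stores may clobber the adjacent member.
}
```

Calling `run_foo_demo()` from `main` (and linking with the platform's thread library, e.g. `-pthread`) is enough to try it. A pre-C++11 compiler gave no such guarantee in writing, which is the gap the two sides of this thread are arguing about.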

Note that this doesn't necessarily need to involve
threads in the definition: the rules will hold under all (valid)
executions of the program possible for a conforming implementation,
including multithreaded executions.

Ok. So you say it has nothing special to do with threading.
Consequently, this does not make a language multi-threaded or
concurrent, does it?

If there is a comprehensive memory model that allows it (as with the
one in the upcoming C++ standard), *then* a library can provide the
threading primitives that are correct with respect to that memory
model.

Can you explain that in detail, with reference to the draft document
and to your example fragment below, please?

Without the rules a memory model provides, you can't state
that

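#include <iostream>
#include <pthread.h>

struct bar {
  short a;
  short b;

  pthread_mutex_t a_mtx;
  pthread_mutex_t b_mtx;

};

// Statically initialise both mutexes so they are valid before first use.
bar a_bar = { 0, 0, PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER };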

void thread_a () {
  pthread_mutex_lock (&a_bar.a_mtx);
  a_bar.a++;
  pthread_mutex_unlock (&a_bar.a_mtx);

}

void thread_b () {
  pthread_mutex_lock (&a_bar.b_mtx);
  a_bar.b = 5;
  std::cout << a_bar.b << std::endl;
  pthread_mutex_unlock (&a_bar.b_mtx);

}

is either correct *or* incorrect if thread_a and thread_b are called
on different threads, because nothing guarantees that the write to
a_bar.a will not affect a_bar.b.

The compiler should guarantee that it must be correct. If not, it is a
bug in the compiler.

Anyway, you did not inform the compiler in your example that you mean
bar.a or bar.b as shared data. You intend them to be shared but it
remains only your intention for which you apply some library stuff.
The library stuff is transparent to the compiler.

Besides, I wonder how the so-called `memory model' prevents it. Can
you explain it in detail on this example? Nevertheless, the same
underlying mechanism can be applied if you have concurrency support
properly at the language level, and that is at least a clear situation
for the application programmers. Clearer than a weird memory model.
For the language-level solution see the example below.

*That's* the nature of the language support being added.

Just above in your post you noted that this kind of support is not
specific to threading. "Note that this doesn't necessarily need to
involve threads ..." Now you stress that *that's* it.

It's not
about syntax: it's about semantics and rules.

You can have semantics for something that has a syntax as well. In
languages we talk about both syntax and semantics, not about just one
of them.

The tools for creating
and synchronizing threads are being added to the library because there
is no need to modify the language to support them,

There is a need, though hoi polloi do not want to realize it. No
memory model whatsoever lets you make the compiler check the most
important property that a concurrent program needs: namely, whether
the shared resources are accessed within Critical Regions only. You
would need language support for that. So there is a need, only you are
not aware of it.

and because
modifying the C++ language itself is fraught with peril.

That was exactly the situation when an object-oriented language was
created out of the procedural language C. Just ask Stroustrup:

<quote note="from earlier message" location=
"http://groups.google.com/group/comp.lang.c++/msg/27a4737cbcd8ddcc">
Once C++ was a success because it could add object-oriented
programming concepts to a procedural language. Stroustrup himself
claims that it did not seem such a straightforward idea to take up
object-oriented programming: "all sensible people "knew" that OOP
didn't work in the real world: It was too slow (by more than an order
of magnitude), far too difficult for programmers to use, didn't apply
to real-world problems, and couldn't interact with all the rest of the
code needed in a system."
http://ddj.com/cpp/207000124

The situation is very similar now with respect to concurrency at the
language level. All sensible people "know" that it is inefficient to
have it at the language level. All sensible people "know" that you
need a memory model and visibility concerns. You need a brave step for
success, though. Otherwise C++0x will be yet another library-based
language.
</quote>

The language
is being extended to provide rules that allow the library to be both
portable and correct.

-o

I appreciate that you at least tried to point out something.

Let me tell you that if the Critical Region were included at the
language level, the whole fragile memory model would not be needed.
The example of Java shows that it is a problem for application
programmers to understand an implicit issue like the so-called
memory model.

If you have Critical Region at the language level, it becomes so
simple:

struct bar {
  shared short a;
  shared short b;
};

struct bar a_bar;

void thread_a () {
  region(a_bar.a) {
    a_bar.a++;
  }
}

void thread_b () {
  region(a_bar.b) {
    a_bar.b = 5;
    std::cout << a_bar.b << std::endl;
  }
}

Now without any `memory model' the compiler can take care that
`a_bar.a' and `a_bar.b' cannot affect each other, since the compiler
can and must be aware that the fields are being used inside different
Critical Regions. In the library-based approach the compiler cannot
have any information about it, hence the need for the so-called
`memory model'.

Furthermore, in this language-level solution the compiler can make
sure that it will catch you if you try to access `a_bar.a' outside a
syntactic unit `region(a_bar.a) {...}'. This important check cannot be
achieved with the library currently proposed in C++0x.
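Since `shared' and `region' in the fragment above are proposed syntax and not valid C++, the following is a rough sketch of the intended semantics in standard C++11, for comparison only. The class `shared_var`, its member `region`, and the helpers `snapshot` and `run_region_demo` are hypothetical names made up for this illustration; none of them is part of any library. Each field is paired with its own mutex and is reachable only through a callable that runs under the lock.

```cpp
#include <iostream>
#include <mutex>
#include <thread>

// Hypothetical helper: a value bundled with its own mutex. The value is
// private, so it can only be reached through region().
template <typename T>
class shared_var {
public:
    explicit shared_var(T initial) : value_(initial) {}

    // Rough counterpart of `region(v) { ... }`: runs body with the mutex
    // held and passes it a reference to the protected value.
    template <typename Body>
    void region(Body body) {
        std::lock_guard<std::mutex> lock(mtx_);
        body(value_);
    }

private:
    T value_;
    std::mutex mtx_;
};

struct bar {
    shared_var<short> a{0};
    shared_var<short> b{0};
};

bar a_bar;

void thread_a() {
    a_bar.a.region([](short& a) { ++a; });
}

void thread_b() {
    a_bar.b.region([](short& b) {
        b = 5;
        std::cout << b << std::endl;
    });
}

// Hypothetical helper: copies the protected value out under the lock.
template <typename T>
T snapshot(shared_var<T>& v) {
    T copy{};
    v.region([&copy](T& x) { copy = x; });
    return copy;
}

// Hypothetical driver: runs both functions on separate threads.
void run_region_demo() {
    std::thread ta(thread_a);
    std::thread tb(thread_b);
    ta.join();
    tb.join();
}
```

Note the limit of this emulation: nothing stops a body from storing the address of the reference it receives and using it later outside the lock. The protection rests on encapsulation and programmer discipline, whereas the `region' construct is meant to let the compiler itself reject any access outside the region.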

Best Regards,
Szabolcs