Re: C++ Threads, what's the status quo?

From: "James Kanze" <james.kanze@gmail.com>
Newsgroups: comp.lang.c++.moderated
Date: 10 Jan 2007 11:04:49 -0500
Message-ID: <1168436570.088461.282130@i39g2000hsf.googlegroups.com>

Le Chaud Lapin wrote:

James Kanze wrote:

You mean, that the programmer is using a compiler, and doesn't
have any real guarantees with regards to what it does.

Yes I do. The compiler manual says, "This library is for
single-threaded applications. That library is for multi-threaded
applications." I use the library for single-threaded applications in
single-threaded code. I use the library for multi-threaded
applications in multi-threaded code.

And what about the compiler, and the code it generates? (And of
course, just saying that the library is for multi-threaded
applications doesn't say anything. But I assume you are using
this as a shorthand way of saying that the library defines a
contract for a multithreaded environment.)

But that's just my point: you need these guarantees: from the
hardware, from the OS, from the compiler, and from all of the
libraries you use. If any one is missing, you can't write
multithreaded code and be sure that it is correct.

The current situation, of course, is that you often do have a
set of guarantees. But the exact guarantees differ from one
combination of hardware, OS, compiler and libraries to the next,
so any multithreaded code is unportable. And it's often
difficult to find what exactly is guaranteed anyway: in the
Posix standard, the guarantees concerning memory synchronization
are buried away in some secondary text, and I've yet to find
them at all for Windows (but I haven't looked that hard, since
it's not a relevant platform for the type of things I develop).

Except if one of the writes migrates across a spin-lock.
(Spin-locks are only guaranteed if special hardware instructions
are used on a Sparc---or a PC. Hardware instructions that are
never generated by Sun CC or by g++---or by the current version
of VC++.)

Uh, yes they are.

Only using __asm__ under g++, or something similar with VC++.
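To make concrete what "special hardware instructions" means here, the following is a minimal sketch of a spin-lock. It assumes a C++11 or later compiler with <atomic> and <thread>, which did not exist when this was posted; with the compilers discussed above, the same effect required inline assembler or a vendor extension. The names SpinLock and count_with_spinlock are hypothetical, chosen only for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical SpinLock type, for illustration only (assumes C++11).
class SpinLock {
public:
    void lock() {
        // test_and_set is an atomic read-modify-write; acquire ordering
        // keeps later memory accesses from moving above the lock.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // give up the time slice while waiting
        }
    }
    void unlock() {
        // Release ordering keeps earlier writes from migrating past the unlock.
        flag_.clear(std::memory_order_release);
    }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Increments a shared counter from several threads under the lock.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();  // all workers finish before we return
    return counter;
}
```

The memory-order arguments are the part that matters for this discussion: without them (or an equivalent barrier in assembler), nothing stops a write from migrating across the lock.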

People who write device drivers use them all the
time.

People who write device drivers use assembler or very special
compiler extensions, at least for parts of their code.

(Also, I would hope that very few, if any, device drivers today
use spin-locks. Blocking the entire system until your IO
finished may have been acceptable under CP/M, but it certainly
isn't under Windows or Unix.)

In fact, they are used so often, Microsoft make special library
functions just for spin-locks. Microsoft uses spin-locks in its own
implementation of what they call critical sections.

So that a critical section will be slower than a mutex?

I just looked at the documentation. It spins (which isn't quite
the same thing as using a spin lock), and then, only in very
special cases. The documentation explicitly says that it will
NOT spin on a single processor machine, ever.

And of course, this sort of code is written (at least partially)
in assembler, so the fact that the compiler cannot generate the
necessary instructions is irrelevant.

First you have to define what that part is. Suppose that the
library writer says that you need to manually acquire a lock
before calling malloc (or operator new)?

I would ask the library writer why he did that when it does not make
sense. It is better to have a thread-safe new() which I use in all my
multi-threaded code.

Certainly. So in fact, you just "assume" that the library (and
the compiler) gives you the guarantees you want. Whereas I've
actually experienced the fact that different compilers and
libraries give different guarantees, and that there have been
compilers (not so long ago) which you really couldn't use
for multithreaded applications.

Don't forget that multithreading on mainstream machines is very
recent. (Of course, in many ways, it resembles what
multiprocessing was back when I started programming, when we
didn't have MMU's, virtual memory or even protection.)

What does "meant to be used in a multi-threaded application"
mean? For, say, tmpnam, or localtime? Or operator new, or
std::allocator?

When a person sits down to write a library, and it is after the year,
say, 1985, he should be thinking about whether that library will
operate in a single-threaded environment, a multi-threaded environment,
or both.

      [Just a nit, but I'd disagree about the year. In 1985, the
     Microsoft OS was MS-DOS, and you certainly didn't have to
     worry about multithreading there. G++ didn't support
     threading in its generated code until at least 3.0, which
     only appeared in mid-2001, and there's no point---nor any
     way---to make a library thread safe if the code the compiler
     generates isn't thread safe.]

And especially someone who writes a function that reads/writes
static global state. There is nothing inherently wrong with that, but
the library writer should not go around peddling it as being
thread-safe. Obviously it is not.

You're missing the point again. You need guarantees, from the
library and from the compiler. Obviously, most compilers today,
at least for mainstream Unix and Windows, do support
multithreading, and their libraries do as well. That is to say,
they define a contract with you with regards to multithreading:
you meet your end of the contract, and they will ensure that
your code will work.

The problem is that the exact terms of the contract vary from
one compiler to the next, even when the set of supported
system-level primitives is identical. I've had to fix programs
that worked with Sun CC, and failed with g++; had the original
programmer used g++, and its set of guarantees, the situation
would be the reverse.

The point of standardization is to give a common set of
guarantees, that you know to be available everywhere.
Individual compilers will probably still give additional
guarantees. If you're developing for a single
platform/compiler, you can take advantage of these guarantees
(in the same way a lot of programmers write code that assumes
that an int is 32 bits). If you want portable code, however,
you have a predefined set of guarantees that you know will work
everywhere.

The functions I cited weren't chosen at random. In the case of
the first two (tmpnam, localtime), Posix (and Sun CC and
g++---and, I'm almost sure, VC++ under Windows) says that you
cannot use them in a multithreaded program. They may be
standard C/C++, but the Posix contract doesn't extend to them.
And while the C++ standards committee hasn't gotten that far
yet, I rather suspect that it will follow existing practice, and
ban them as well. The functions may be part of the standard
library, but the usual guarantees for multithreading don't
extend to them.
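As an illustration of why localtime falls outside the contract: it returns a pointer to storage owned by the library, which another thread may overwrite. The sketch below wraps the reentrant replacements, localtime_r on Posix and Microsoft's localtime_s on Windows, behind one function. The name local_time_safe is hypothetical, and the _WIN32 test is an assumption about how the platform is detected.

```cpp
#include <time.h>

// Hypothetical wrapper: fills a caller-owned struct tm instead of
// returning a pointer into the library's shared static buffer.
bool local_time_safe(time_t when, tm& out) {
#if defined(_WIN32)
    // Microsoft's variant: destination first, returns 0 on success.
    return localtime_s(&out, &when) == 0;
#else
    // Posix variant: source first, returns a null pointer on failure.
    return localtime_r(&when, &out) != nullptr;
#endif
}
```

Each thread passes its own tm object, so no two calls share a result buffer. Note that the two underlying functions differ in argument order and return convention, which is itself a small example of the portability problem.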

The other two were chosen because the standard I normally use
for multithreading guarantees (Posix) doesn't mention them. At
present, I need a guarantee from the library, and in the case of
operator new, from the language as well. Guarantees I'd like to
see in the standard.

And since you mention new(), what do you think all of us Windows people
do when we need to call new() in a multi-threaded application?

You probably count on guarantees given by the compiler. Just
like us Unix people do.

That's my whole point. Just having a set of OS level primitives
is not sufficient. You need guarantees from the compiler (and
from all of the libraries you use).

Invoke and hope that the race condition does not manifest? No, we go to the IDE
options and change the drop-down menu from "Single Threaded Library" to
"Multi-Threaded Library". Recompile. All are happy.

      [Just another nit, but normally, the individual programmer
     doesn't make the decision as to how his programs are
     compiled. For better or worse, changing compiler options
     can result in incompatible binaries, so all of the compiler
     options are normally fixed once and for all by the
     integration team. I don't know the exact mechanisms for
     doing this under Windows (other than by making everyone use
     a Unix-like environment such as UWin or MSys), but it certainly
     doesn't involve individual programmers choosing in a menu.]

If the library implementation doesn't fulfil the contract, then
it is the library implementor's fault. But for that to be the
case, you first need the contract.

Agreed. But the issues are a lot more complicated than you seem
to think. I know, from experience.

The problems you put forth seem to indicate otherwise. No offense, but
the cache synchronization issue, which I was well-aware of and
purposely avoided broaching so as not to let this thread degenerate
into a discussion on computer hardware, is one of the few legitimate
realms of uncertainty.

Saying that a C++ class is not thread safe because objects of the class
all operate on a global variable...that's obvious.

Now if you are saying that the C++ committee needs to structure their
libraries so that they can be single-threaded and multi-threaded,
I agree with that, but that is the library, not the language proper.
And I have always said that this is a library issue more than anything.

What you started by saying is that all you need is a good set of
OS primitives. Which is just false. Both the language proper
and all of the libraries you use must offer a full set of
guarantees, and you must write your code to conform to your end
of the contract they define. Today, you simply cannot write
portable multi-threaded code, and the problem goes far beyond
wrapping the OS primitives in a portability layer. The set of
guarantees offered by the compiler differs from one compiler to
the next. Currently, most compilers (associated with their
standard libraries) do not allow the use of localtime in
multithreaded code, for example, even though it is a very widely
used function. Posix-based compilers say to replace it with
localtime_r; Windows-based ones say localtime_s (but don't
explicitly say that localtime is forbidden, so maybe it does
work). Currently, many compilers (but not all) require external
synchronization when first encountering a static local variable.
Currently, almost every compiler I know handles thread
cancellation differently. (OK, that's an easy one: everyone
does it differently, so don't use it.)
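For the static local variable case, one way to supply the external synchronization is to move the one-time construction behind pthread_once. This is a minimal sketch assuming a Posix platform; the names table_once, build_table and shared_table are hypothetical. (C++11 later required implementations to make initialization of local statics thread-safe, but that guarantee did not exist for the compilers discussed here.)

```cpp
#include <pthread.h>
#include <string>

// Hypothetical lazily built object and its once-control (Posix assumed).
static pthread_once_t table_once = PTHREAD_ONCE_INIT;
static std::string* table = nullptr;

// pthread_once takes a C function pointer, hence the C linkage.
extern "C" void build_table() {
    table = new std::string("initialized once");
}

// Safe to call from any thread: pthread_once runs build_table exactly
// once and makes its effects visible to every caller that returns.
const std::string& shared_table() {
    pthread_once(&table_once, build_table);
    return *table;
}
```

The object is deliberately never deleted, which sidesteps the separate question of destruction order at program exit.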

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                    Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
