Re: Threading issue in next standard
Earl Purple wrote:
Jiang wrote:
SuperKoko wrote:
I hope it'll be a library facility.
Languages having builtin threading tend to impose a single,
very particular means of doing multithreading, and it is easy
to reach the limits of those models.
Maybe this is true. But if this "single, very particular
means" helps, then do we really need the freedom for
threading?
For example, compared with
void foo()
{
scoped_lock lock( mutex ) ; // OK, RAII used here
// access the resource...
}
the following function bar
void synchronized bar()
{
// access the resource...
}
is much cleaner and better controlled, in my mind.
There are 3 locking issues here:
- A mutex. Covers all situations.
- A critical section. Only one thread is allowed to execute
this section at a time.
- An atomic section of code. This effectively means that all
other threads must wait while this block of code is completed.
With a genuine dual-processor there really are two threads
running at a time, so this may be harder to enforce.
Don't confuse names with principles. Mutex, critical section
and atomic section (as you define it---I've never heard the term
before) all mean more or less the same thing. (Normally, I
would use critical section for the area of code being protected,
and mutex for the mechanism protecting it. Historically, too,
there were other ways of ensuring exclusion in a critical
section---on the processors I started with, I'd disable
interrupts, for example.)
There are also atomic operations, which work without using
external protection.
With a mutex, you can have different mutexes, but the same
mutex can be locked in multiple parts of the code, which clash
with each other.
I'm not sure what you're trying to say here. But I think your
point is that two different functions need to access the same
data, and thus use the same mutex. In the Java model of
synchronized functions, this works for member data, since all
member functions synchronize on the same object.
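For concreteness, that model transposed to C++ might look something
like the following, using POSIX threads (the class and its members are
invented for the example; every member function locks the same
per-object mutex):

```cpp
#include <pthread.h>

// Sketch of a Java-style "synchronized" class in C++, using POSIX
// threads. Every member function locks the same per-object mutex, so
// all accesses to the member data are serialized, just as with Java's
// synchronized member functions. (Counter and its members are
// invented for the example.)
class Counter
{
public:
    Counter()
        : count( 0 )
    {
        pthread_mutex_init( &mutex, NULL ) ;
    }
    ~Counter()
    {
        pthread_mutex_destroy( &mutex ) ;
    }
    void increment()
    {
        pthread_mutex_lock( &mutex ) ;
        ++ count ;
        pthread_mutex_unlock( &mutex ) ;
    }
    int value()
    {
        pthread_mutex_lock( &mutex ) ;
        int result = count ;
        pthread_mutex_unlock( &mutex ) ;
        return result ;
    }
private:
    pthread_mutex_t mutex ;
    int count ;
} ;
```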
Of course, most of the time you use external synchronization,
you are updating several different objects, and need to ensure
that no other thread interrupts between the updates. Which
means synchronizing with the same mutex in functions of
different objects. (In at least one case, I ran the application
as if threads weren't preemptive. The application had exactly
one mutex, which was only released when a thread explicitly
wanted to allow other threads through---typically, when it
needed to wait for IO or something like that. It made thread
safety a lot, lot easier, and on a single processor machine,
total throughput was considerably higher than it would have been
with finer grained locking.)
Here programmers do not have to remember RAII (well, if RAII
is part of the language). Compared with lock/unlock,
constructor/destructor is much more reliable, but why not take
a further step?
It's not a matter of programmers "remembering RAII". RAII is
there so that programmers don't have to remember to release
resources.
In the Java model, locks have to be handled by the language,
precisely because there is no RAII. A lock is a resource where
RAII is usually an important simplification.
I'm not really sure about Jiang's point here. Locks at the
function level don't work well, because the granularity of
locking rarely corresponds to the granularity of a
function---you end up either holding the lock too long, or
twisting your design to make the functions fit the required
locking. And the difference between block level locking and
RAII seems very slim to me:
block locking:
void f()
{
// ...
synchronize mutexObject {
// locked section...
}
// ...
}
RAII:
void f()
{
// ...
{
scoped_lock l( mutexObject ) ;
// locked section...
}
// ...
}
Block locking has the advantage that you don't need to invent a
name for the scoped_lock object. On the other hand, in the RAII
solution, you frequently don't need the inner block, and in
some, very simple cases, you can even write things like:
scoped_lock( mutexObject ), whatever() ;
(This works as long as whatever is a simple expression. Whether
it is a good idea on readability grounds is another question. I
can't say that I really like it, but the possibility does
exist.)
More generally, the RAII idiom also allows things like:
std::auto_ptr< scoped_lock >
f()
{
std::auto_ptr< scoped_lock > l( new scoped_lock( mutexObject ) ) ;
// ...
return l ;
}
Possession of the lock is not tied to function scope.
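A compilable sketch of that idiom follows. std::auto_ptr has since
been removed from the language, so this uses std::unique_ptr, but the
idea is identical: f() acquires the lock and transfers ownership of it
to its caller. The scoped_lock here is a toy class written for the
example, not any particular library's.

```cpp
#include <memory>
#include <pthread.h>

// Toy RAII lock over a POSIX mutex, invented for this example.
class scoped_lock
{
public:
    explicit scoped_lock( pthread_mutex_t& m )
        : myMutex( m )
    {
        pthread_mutex_lock( &myMutex ) ;
    }
    ~scoped_lock()
    {
        pthread_mutex_unlock( &myMutex ) ;
    }
private:
    pthread_mutex_t& myMutex ;
} ;

pthread_mutex_t globalMutex = PTHREAD_MUTEX_INITIALIZER ;

// f() acquires the lock and hands ownership back to its caller:
// possession of the lock is not tied to f()'s scope.
std::unique_ptr< scoped_lock >
f()
{
    std::unique_ptr< scoped_lock > l( new scoped_lock( globalMutex ) ) ;
    // ... work under the lock ...
    return l ;
}

void
g()
{
    std::unique_ptr< scoped_lock > l( f() ) ;
    // still holding the lock acquired in f()...
}   // lock released here, in l's destructor
```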
The problem is, even we have a thread library, lots of
low level details must be handled by our programmers.
Low-level details would be handled by the writers of the
standard libraries rather than by the compiler manufacturers,
who may be different. I would prefer it if my code were
compiled directly into machine code, rather than the compiler
having to keep compiling a standard library of templates time
and again.
Well, everything ultimately compiles to machine code. I'm not
quite sure what your point is here---it sounds like you are
arguing against the library solution (since in the library
solution, the compiler would have to keep compiling a standard
library again and again).
Also, if the library were to be written on a UNIX system to
wrap pthreads then would it use a header file <cpthread> to
avoid name-clashing or would the entire pthread library be
automatically "included" into your source when you weren't
actually using it directly. Similarly on Windows with their
standard library and on any other platform that has
multi-threading.
I don't get this. Why would the compiler have to automatically
include anything, any more than it does for e.g. new or typeid
(both of which depend on "library" functions or classes)?
Just consider this, the read_write_mutex in boost.thread
was removed due to problems with deadlocks.
No matter how good a threads library you write, it is up to
the application-level programmer to ensure that there are no
deadlocks. The purpose of a library is to aid good
programming, not to prevent bad programming. Of course you do
put in some protection so that programmers won't do the wrong
thing.
The potential problem of read-write locks is writer
starvation. If there are always readers about, the writer may
never get a chance to write.
That depends on how they are implemented. Normally, I would
expect that as soon as a writer is waiting, further read
requests suspend until it has finished.
This can be prevented by adding an additional mutex: both the
reader and the writer must acquire the mutex before acquiring
the read-write lock, but there is a difference. The reader
releases the mutex immediately on acquiring it; the writer
holds onto the mutex until it gets the read-write lock. That
means the mutex remains locked if there is a writer waiting to
write, and new readers cannot read, although the existing ones
may continue reading. In practice this is not a bad thing, as
although the readers may wait a bit, they will eventually read
the most updated information.
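On top of POSIX primitives, that gate-mutex scheme might be sketched
as follows (the type and function names are invented for the
illustration):

```cpp
#include <pthread.h>

// Gate-mutex scheme: both readers and writers pass through "gate"
// before taking the read-write lock. A reader releases the gate
// immediately; a writer holds it until it has the write lock, so a
// waiting writer blocks new readers while existing readers finish.
struct GatedRWLock
{
    pthread_mutex_t gate ;
    pthread_rwlock_t rwlock ;
} ;

void initGatedRWLock( GatedRWLock* l )
{
    pthread_mutex_init( &l->gate, NULL ) ;
    pthread_rwlock_init( &l->rwlock, NULL ) ;
}

void readLock( GatedRWLock* l )
{
    pthread_mutex_lock( &l->gate ) ;
    pthread_mutex_unlock( &l->gate ) ;      // released at once
    pthread_rwlock_rdlock( &l->rwlock ) ;
}

void readUnlock( GatedRWLock* l )
{
    pthread_rwlock_unlock( &l->rwlock ) ;
}

void writeLock( GatedRWLock* l )
{
    pthread_mutex_lock( &l->gate ) ;        // held while waiting...
    pthread_rwlock_wrlock( &l->rwlock ) ;
    pthread_mutex_unlock( &l->gate ) ;      // ...until we have the lock
}

void writeUnlock( GatedRWLock* l )
{
    pthread_rwlock_unlock( &l->rwlock ) ;
}
```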
There's no reason to be that complicated. If an implementation
doesn't provide read/write locks, they can easily be simulated
with a single condition variable, regardless of the desired
locking policy.
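For instance, a writer-priority read/write lock simulated with one
mutex and one condition variable might look like this (a sketch: the
class and member names are invented, and the counters are left public
only to keep the example short):

```cpp
#include <pthread.h>

// Read/write lock simulated with a single condition variable, with
// writer priority: new readers wait while a writer is active or
// waiting; a writer waits until no readers or writers are active.
class SimRWLock
{
public:
    SimRWLock()
        : activeReaders( 0 ), activeWriters( 0 ), waitingWriters( 0 )
    {
        pthread_mutex_init( &mutex, NULL ) ;
        pthread_cond_init( &cond, NULL ) ;
    }
    void readLock()
    {
        pthread_mutex_lock( &mutex ) ;
        // Writer priority: a merely *waiting* writer blocks new readers.
        while ( activeWriters > 0 || waitingWriters > 0 )
            pthread_cond_wait( &cond, &mutex ) ;
        ++ activeReaders ;
        pthread_mutex_unlock( &mutex ) ;
    }
    void readUnlock()
    {
        pthread_mutex_lock( &mutex ) ;
        if ( -- activeReaders == 0 )
            pthread_cond_broadcast( &cond ) ;
        pthread_mutex_unlock( &mutex ) ;
    }
    void writeLock()
    {
        pthread_mutex_lock( &mutex ) ;
        ++ waitingWriters ;
        while ( activeReaders > 0 || activeWriters > 0 )
            pthread_cond_wait( &cond, &mutex ) ;
        -- waitingWriters ;
        activeWriters = 1 ;
        pthread_mutex_unlock( &mutex ) ;
    }
    void writeUnlock()
    {
        pthread_mutex_lock( &mutex ) ;
        activeWriters = 0 ;
        pthread_cond_broadcast( &cond ) ;
        pthread_mutex_unlock( &mutex ) ;
    }

    // State public only to keep the sketch short.
    int activeReaders ;
    int activeWriters ;
    int waitingWriters ;
private:
    pthread_mutex_t mutex ;
    pthread_cond_t cond ;
} ;
```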
Now if you have to implement a library on top of what you
already have, then on a POSIX system you would probably
implement read-write locks on top of pthread_rwlock_init etc.
There is, as far as I'm aware, no defined behaviour as to
whether writers get priority in such a library, so to be safe
you may well add it to your own.
The defined behavior depends partially on the implementation. If
the Thread Execution Scheduling option is supported, read-write
locks are required to work correctly; otherwise, it is
implementation defined. (Solaris documents them as working
correctly, and I suspect that this is true on most
implementations---pthread_rwlock_rdlock will block if there is a
thread waiting to write, even if it doesn't have the lock.)
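As an illustration of how implementation-specific this is: glibc (and
only glibc; this is a non-portable extension, not POSIX) lets you
request writer preference explicitly through
pthread_rwlockattr_setkind_np. The wrapper function name below is
invented:

```cpp
#include <pthread.h>

int makeWriterPreferringRWLock( pthread_rwlock_t* lock )
{
    pthread_rwlockattr_t attr ;
    pthread_rwlockattr_init( &attr ) ;
    // glibc extension (note the _np suffix): ask that waiting writers
    // take precedence over new readers.
    pthread_rwlockattr_setkind_np(
        &attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP ) ;
    int result = pthread_rwlock_init( lock, &attr ) ;
    pthread_rwlockattr_destroy( &attr ) ;
    return result ;
}
```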
If even the best experts cannot handle the low level issues
correctly (although that's a rare case, of course), IMHO, I
would rather use the single, particular, but controllable one.
Freedom here does not benefit me much.
The issue with boost is that it is trying to write a library
for all systems, but different systems do things differently.
On Windows there is no concept of an rwlock; you have to
implement the whole thing yourself using just mutexes,
semaphores, etc.
Which shouldn't be that difficult, since they have implemented
condition variables (which is what you need to implement a
read-write lock).
That would be the same for any "standard" library for threads.
A language feature would mean that your code would be directly
compiled to the relevant machine code.
The distinction isn't that black and white, consider typeid and
std::type_info. I don't know what C++ threading will look like
in its final version, but I do know that it will have some
language support, and that there will be library parts as well.
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]