Re: std::string bad design????

From:

"Le Chaud Lapin" <jaibuduvin@gmail.com>

Newsgroups:

comp.lang.c++.moderated

Date:

8 Jan 2007 15:47:22 -0500

Message-ID:

<1168286783.492654.120540@s34g2000cwa.googlegroups.com>

James Kanze wrote:

The problem is that you don't know what the classes do. You use
an std::map, for example; how do you know that it doesn't use
static memory (e.g. in its allocators) in the implementation.
Unless std::map is thread-safe, you cannot use it in a
multithreaded application.

There is a fundamental difference in expectations here. I do not
expect any state, not even a simple int, to be thread-safe unless I
make it thread-safe;

int x; // not thread-safe

Note that _any_ data structure is "thread-unsafe". That is to be
expected. If you have a global variable, and N threads operate against
it, and that variable is not protected, there will be contention.

If you write map<> so that it uses global variables, and you try to use
map<>'s in a multi-threaded application, you already know that there
will be problems. I have my own equivalent of map<>. In fact, I have
several of them, and none of them use global variables. Whether it was
necessary to use a global variable in the implementation of map<> is
another discussion, but this is about common sense. I do not think
anyone who has been writing multi-threaded applications, knowing what a
global variable is, will have any expectations otherwise.

You might want to check
http://www.sgi.com/tech/stl/thread_safety.html; I'm pretty sure
that this corresponds to the thinking of the committee with
regards to how thread safety will be defined in the library.
(Except some special cases, like maybe std::cerr, and almost
certainly std::exit().)

I will take a look, but there is no need. I can write a class Foo,
right now, make it so that it uses a global variable, run multiple
threads against it, and watch my program crash. I will not be
surprised in the least. I can also write a function Bar, make it so
that it uses a static local (global) variable, run multiple-threads
against it, and watch my program crash. I will not be surprised in the
least.

Note that the implementation of std::basic_string in g++ does
not give this guarantee, although I think that the developpers
consider this an error (The combination of circumstances
necessary to encounter a problem is extremely unlikely to occur
in actual practice. But at least some of the developpers of g++
feel like I do: unless the probability is 0, it's an error.)

I definitely agree that engineering should be deliberate and
predictable. I cannot imagine that the people who designed the
standard library did not have multi-threading in the back of their
minds while making the library.

Note that in g++ pre 3.0, every single constructor of
std::basic_string modified global state. As did the
implementation of throw. Presumably, if you created the
necessary mutex, and never created a std::basic_string (nor
called a function which might do so---and do you know for sure which
functions in the standard library create temporary strings)
without first acquiring the mutex, and wrapped every throw/catch
with a mutex, your code might work. (Then again, it might not,
because there was no guarantee that nothing else used static
data.)

Ah...how refreshing. I see convergence in our thinking forthcoming. :)

Microsoft's approach to this was to provide two libraries: one for
single-threaded applications. one for multi-threaded applications. It
is the same approach I take. Fortunately, 98%+ of my classes are
already "thread-safe", meaning, they do not use global variables as
part of their implementation unless they have to. I have found that
most classes can fit this model, except for things like random number
generators, or classes that require massive global state to help it,
like Integer::is_prime() which works fastest if it is allowed to
maintain a global static array of small primes, say those less than
60,000. But that array is declared const and never changes, so it is
immune to requiring a critical section (spin-lock with failover to
mutex).

But for your std::basic_string example, note that, if the
implementation of std::basic_string used a global variable, I would
never place the burden of supplying mutexes on the user of that
component. Again, I would follow Microsoft's approach, and provide a
library for single-threaded applications, and one for multi-threaded
applications. The single-threaded application library would not have
protection. The multi-threaded one would. This works very well today.

Unless a library is thread safe, it cannot be used in
multithreaded code. Note that just grabbing a mutex at the
start of every function, and releasing it at the end, is neither
necessary nor sufficient to make a library thread safe. To make
a library thread-safe, you must document the guarantees that are
given: thus, for example SGI (and everyone else) guarantees that
you can create two separate instances of std::vector in two
different threads, and use them, without external
synchronization.

I *definitely* agree with the spirit of this paragraph. No programmer
should ever be burden with throwing in mutexes to protect against
exclusivity issues that might or might not be imminent. I think where
we might disagree is in answer the question,

"What should be done about mutual exclusion issues?"

I feel that the double-library approach is optimal. If I did not state
this explicitly in my earlier posts, my apologies. I just presumed that
everyone who was writing multi-threaded applications was doing this, as
you cannot write them under Windows without doing this.

The double-library approach would not burden single-threaded
applications with unnecessary critical sections (not to mention extra
global state), and would still work for multi-threaded applications.

And of course, the compiler must also make similar guarantees.
If you're working on Intel architecture, for example, the
compiler only has a very few registers to work with, and will
spill to memory if the expression is sufficiently complicated.
Every compiler I know spills to stack, but there is nothing in
the C++ standard to require this, and if the compiler spills to
static memory, you can end up with code which may or may not
work, depending on the level of optimization.

I wrote in another response to your bringing up this issue that I would
find it very odd if a compiler writer solved the register overflow
problem by spilling to static memory. It is simply not necessary when
the stack is available. Furthermore, in general, it would break a
guarantee that we do have: support for recursion.

If F calls G, G calls H, and H calls F, then spilling to static memory
in F would result in a dead end. The stack has to be used.

Why don't you read what the experts are saying, instead of
arguing against strawmen of your own creation.

I am. The most I have seen so far, aside from the cache coherency
problem (which has been around for a long time), is that if two threads
operate against an unprotected global variable, there is a potential
for corruption. I did not say this. Others did.

Is it expectation that C++ will somehow develop some universally
applicable lock-free mechanism?

What, may I ask, gives you this idea? I've not even seen it
suggested.

I am trying to figure out what are peoples expectations of C++ beyond
providing a thread-safe library.

-Le Chaud Lapin-

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]