Re: History of and support for std::basic_string::back()

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Wed, 7 Aug 2013 06:23:49 -0700 (PDT)

Message-ID:

<554aefd5-ff4e-4126-a037-cdeb651fd176@googlegroups.com>

On Tuesday, 6 August 2013 19:35:50 UTC+1, Paavo Helde wrote:

Öö Tiib <ootiib@hot.ee> wrote in
news:8b8b55e9-4346-4a2c-b099-8d4ee7595826@googlegroups.com:

On Monday, 5 August 2013 23:23:20 UTC+3, Paavo Helde wrote:

Öö Tiib <ootiib@hot.ee> wrote in

[...]

Ok, I see your point now. You are saying that the C++ standard library
can customize itself according to the hardware (while compiled or
installed or loaded into a process). An interesting idea, but seems very
tricky in practice. Do you have any links to any C++ library
implementations actually doing this?

It's not dynamic customization, and it isn't only according to
the hardware. Different programs use strings in different ways.
CoW can be an important optimization for many of them. An
implementation has to "guess" what it thinks is the best
solution for what it thinks are the most typical utilisations of
the class. CoW is certainly the right choice for some uses, and
if the library authors think that those uses represent the
majority of its clients, they will (or should) use CoW.

The problem is that the string header is typically header-only, which
means it is compiled into application code. In libstdc++, the atomic
operations on refcounter are directly in the header file. On the other
hand, this issue affects ABI, so all code loaded in a process must agree
whether it uses CoW or not.

All libraries must agree. You can break both g++ and VC++
simply by changing a few options in your compiler. Which is
a shame, but that's the current situation.

FWIW, LLVM seems to have given up CoW strings. From
http://libcxx.llvm.org/ : "For example, it is generally accepted that
building std::string using the "short string optimization" instead of
using Copy On Write (COW) is a superior approach for multicore machines
(particularly in C++11, which has rvalue references).

Saying something is "generally accepted" is often an excuse for
not doing it right. (I'm not saying this is the case here; LLVM
might feel that their customers are best supported by short
string optimization.) But globally, short string optimization
only applies when the client code uses a lot of short strings.
(And even then, it depends on how he uses them---at least as
implemented in VC++, it requires an if for every access, which
means that code which does a lot of indexing into strings will
run slower.)
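
To make the "if for every access" point concrete, here is a minimal
sketch of a short-string-optimized class in which the characters live
either in an inline buffer or on the heap. The class name SsoString,
the 15-character inline capacity, and the uses_heap() helper are
assumptions made for illustration; this is not the layout of VC++ or
any other real library, only the general shape of the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical toy class: illustrates the branch, not a real library layout.
class SsoString {
    static constexpr std::size_t kInlineCapacity = 15;  // assumed value
    std::size_t size_;
    union {
        char inline_[kInlineCapacity + 1];  // active when the string is short
        char* heap_;                        // active when the string is long
    };

    bool is_inline() const { return size_ <= kInlineCapacity; }

    // This is the test that sits in front of every element access.
    const char* data() const { return is_inline() ? inline_ : heap_; }

public:
    explicit SsoString(const char* s) : size_(std::strlen(s)), inline_{} {
        if (is_inline()) {
            std::memcpy(inline_, s, size_ + 1);
        } else {
            heap_ = new char[size_ + 1];  // switches the active union member
            std::memcpy(heap_, s, size_ + 1);
        }
    }
    ~SsoString() { if (!is_inline()) delete[] heap_; }

    // Copying is left out to keep the sketch short.
    SsoString(const SsoString&) = delete;
    SsoString& operator=(const SsoString&) = delete;

    std::size_t size() const { return size_; }
    bool uses_heap() const { return !is_inline(); }
    char operator[](std::size_t i) const { return data()[i]; }
};
```

Constructing SsoString("hello") stays in the inline buffer, while a
string longer than 15 characters allocates; in both cases operator[]
goes through the test in data(). An implementation can avoid that
test by always keeping a pointer to the current buffer, at the cost
of a larger object.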

With regards to the assertion: it is generally accepted that
implementing CoW correctly using atomic counters (rather than
mutexes) requires a great deal of skill, and that it is easy to
get wrong. It has nothing to do with what is better for the
users of the library, and everything to do with the fact that
implementing lock free algorithms requires special skills, which
often aren't (or at least weren't) present in the teams writing
libraries. The argument against CoW is that it is too easy to
get wrong. (The g++ implementation has one small bug, for
example, although I'm willing to bet that no one has ever
actually encountered it.)
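
For readers who have not seen one, a minimal sketch of a CoW string
with an atomic reference count follows. The names CowString, Rep,
make_unique() and shares_with() are hypothetical, and the memory
orderings are one conservative choice rather than the only correct
one. The sketch deliberately sidesteps the hard part: it offers only
get() and set(), so no reference or iterator into the shared buffer
ever escapes. A real std::string has to hand those out, and that is
where the subtle cases come from.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy class, not the libstdc++ implementation.
class CowString {
    struct Rep {
        std::atomic<std::size_t> refs{1};  // number of CowString owners
        std::string chars;                 // the shared character buffer
        explicit Rep(const std::string& s) : chars(s) {}
    };
    Rep* rep_;

    // Drop one reference; the last owner frees the representation.
    void release() {
        if (rep_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
            delete rep_;
    }

    // Called before every write: take a private copy if the buffer is shared.
    void make_unique() {
        if (rep_->refs.load(std::memory_order_acquire) > 1) {
            Rep* copy = new Rep(rep_->chars);
            release();
            rep_ = copy;
        }
    }

public:
    explicit CowString(const std::string& s) : rep_(new Rep(s)) {}

    // Copying only bumps the counter; no characters are copied.
    CowString(const CowString& other) : rep_(other.rep_) {
        rep_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    CowString& operator=(const CowString& other) {
        // Increment first so that self-assignment is safe.
        other.rep_->refs.fetch_add(1, std::memory_order_relaxed);
        release();
        rep_ = other.rep_;
        return *this;
    }
    ~CowString() { release(); }

    char get(std::size_t i) const { return rep_->chars[i]; }
    void set(std::size_t i, char c) { make_unique(); rep_->chars[i] = c; }
    bool shares_with(const CowString& o) const { return rep_ == o.rep_; }
};
```

After CowString b(a), both objects share one buffer; the first
b.set(...) gives b its own copy and leaves a untouched. As with
std::string, this assumes that each individual CowString object is
used by one thread at a time; only the shared representation is
protected by the atomic counter.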

Breaking ABI compatibility with old versions of the library
was determined to be critical to achieving the performance
goals of libc++."

A lesson taught by experience, no doubt.

Now when CoW is forbidden even on cases when it is most efficient then
developers are encouraged to deal with it. They will. For example they
port CString of Microsoft ATL and use it instead of using std::string.
Or something else like that for single threaded embedded platform.
Then standard is broken like James said because ... come on ... it is
platform-specific optimization of string of characters. Such mundane
things should not be concern of programmers to care.

The C++ standard library is supposed to provide a good general purpose
implementation of all its features. There are always specific corner
cases where it does not work, there is no silver bullet for everything.
It seems however that massive multithreading will become the new norm,
not a corner case. And a CoW string class looks exactly like an extra
custom library for a specific usage case (large strings, copied often),
probably needing extra care in thread passing.

Actually, it is the small string optimization which seems to be
the corner case. For most applications I've seen, it is
a pessimization.

The C++ standard is supposed to provide implementers enough
liberty to implement what they think best for their user
community. If some implementer thinks that systematic deep copy
(with or without SSO) is best for their community, perhaps
because there are no cheap atomic counters, then they should be
free to do so. If they think CoW is best, they should have that
liberty as well.

--
James