Re: Thread-safe reference counts.

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Thu, 3 Apr 2008 03:02:24 -0700 (PDT)

Message-ID:

<ef09c3e0-b8a1-4634-8eec-e0e713c199db@a70g2000hsh.googlegroups.com>

Chris Thomasson wrote:

"James Kanze" <james.kanze@gmail.com> wrote in message
news:d294f005-51b1-4c40-bf5b-cb091909bb90@u69g2000hse.googlegroups.com...
On Apr 2, 6:49 am, "Chris Thomasson" <cris...@comcast.net> wrote:

"James Kanze" <james.ka...@gmail.com> wrote in message

news:1e8d3d15-c2e1-4968-9dff-e505d4ae77fe@m36g2000hse.googlegroups.com...

On Apr 1, 5:58 am, "Chris Thomasson" <cris...@comcast.net> wrote:

"James Kanze" <james.ka...@gmail.com> wrote in message

news:1d6f13f6-f217-4609-8cf8-1d0226aea1d1@s50g2000hsb.googlegroups.com...

On Mar 31, 4:16 am, "Chris Thomasson" <cris...@comcast.net> wrote:

[...]

My only point is that, IMVHO of course, GC can be a wonderful
tool for all sorts of lifetime management schemes. This due to
the fact that it can inform the management protocol when an
object is in a quiescent state.

But you still haven't explained what you mean by "an object in a
quiescent state".

[...]

http://dictionary.reference.com/browse/quiescent%20

An object is at rest and has nothing to do.

Including being terminated?

Calling the object's destructor is fine in this state.

That's not what I asked.

I think we're talking at cross purposes. I am using the word
terminated intentionally, to try to break the link with the C++
concept of destructor. An object which has indefinite lifetime
doesn't need termination, so there's no question there that the
object has nothing to do. If an object has definite lifetime,
however, terminating its lifetime will normally result in it
doing something.

There's also a sense that most objects I deal with are at rest
and have nothing to do most of the time; they only wake up and
have something to do in response to some specific external
event.

(I.e. termination is a no-op for the object.) If
termination is not a no-op, then it still has something to
do.

When the counter has dropped to zero, the object can be
destroyed, reused, cached, etc.

You're missing the point. If the object has a determinate
lifetime, and something occurs to make that lifetime end, then
it must be terminated. Whether it is still accessible or not.
Later accesses are a programming error, but delaying termination
would also be an error. Depending on other design
considerations, this may be handled by putting the object in a
specific, invalid state, and signaling the error on next use, or
by some form of the observer pattern, notifying all clients of
the object's demise.
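
The observer variant mentioned above could be sketched roughly as follows. This is only an illustration: the class name Session and the members onTermination() and terminate() are hypothetical, not taken from any code in this thread.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical design-level object with a determinate lifetime.
class Session {
public:
    using Observer = std::function<void(const Session&)>;

    // Clients that hold a pointer to the session register here so that
    // they hear about its demise.
    void onTermination(Observer o) { observers_.push_back(std::move(o)); }

    // Called by whatever external event ends the lifetime. It takes
    // effect immediately, whether or not the object is still reachable.
    void terminate() {
        if (terminated_) return;          // terminating twice is a no-op
        terminated_ = true;
        std::vector<Observer> pending;
        pending.swap(observers_);         // tolerate registration from a callback
        for (const auto& notify : pending) notify(*this);
    }

    bool terminated() const { return terminated_; }

private:
    std::vector<Observer> observers_;
    bool terminated_ = false;
};
```

Each client is told once, at the moment of termination, and can drop its pointer at that time; nothing here depends on when (or whether) the memory is reclaimed.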

Otherwise, the question isn't so much whether it has
something to do or not, but whether it can still be used by
other objects or not.

No dynamic object should be able to acquire a reference to an
object whose reference count is zero.

And if I don't use reference counting?

An object may or may not be reachable. If the object has an
indefinite lifetime, the fact that it is no longer reachable
means that we can reuse its memory. If the object has a
definite lifetime, and that lifetime has not ended, the fact
that it is no longer reachable is an error in the program; if
that lifetime has ended, the fact that it is reachable is
irrelevant, since it cannot be used.

A Proxy GC will call an in-quiescent state callback function
for an object when it determines that said object has quiesced
(e.g., unreachable). This is analogous to a reference counting
algorithm dropping the count to zero and subsequently
notifying the application via a callback. Imagine if
shared_ptr did not call dtor, but called a function that
allowed an application to decide what to do.

Boost::shared_ptr does. I use it to release locks, for example.

Of course, in such cases, there are other pointers to the object
as well.
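
A minimal sketch of the kind of use I mean, written with std::shared_ptr on the assumption that it behaves like boost::shared_ptr where custom deleters are concerned. TracedMutex and lockShared() are hypothetical names invented for the example; the wrapper only exists so that the effect can be observed.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical wrapper around std::mutex that records whether it is held.
class TracedMutex {
public:
    void lock()   { m_.lock(); held_ = true; }
    void unlock() { held_ = false; m_.unlock(); }
    bool held() const { return held_; }
private:
    std::mutex m_;
    std::atomic<bool> held_{false};
};

// Acquires the lock and returns a counted handle on it. The "deleter"
// destroys nothing: when the last handle goes away it releases the lock.
std::shared_ptr<TracedMutex> lockShared(TracedMutex& m) {
    m.lock();
    return std::shared_ptr<TracedMutex>(&m, [](TracedMutex* p) { p->unlock(); });
}
```

The handles can be copied freely, and the lock is released when the last copy disappears. The mutex itself is still referred to by its owner, so a count of zero here says nothing about reachability.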

It can call the object's dtor, or cache it, or immediately
reuse it, whatever, the object is quiescent.

But those are all implementation details. Generally speaking,
if we use garbage collection:

-- if the object has an indeterminate lifetime, garbage
    collection handles it perfectly---we don't need any
    additional code, and

-- if the object has a determinate lifetime, garbage collection
    allows protection against errors:

     o since the memory won't be reused as long as it is
        reachable, we can set a flag when the object is
        terminated, and test the flag in each use of the object,
        triggering an error in case the flag is set, and

     o if the garbage collector supports finalization (i.e.
        calls a function when the object ceases to be reachable,
        before recycling the memory), we can verify that the
        object has been correctly terminated before it became
        unreachable.
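
The two checks in the second case might look something like the sketch below. Standard C++ has no garbage collector, so finalizeCheck() is only a stand-in for whatever hook a collector with finalization would provide; Connection and all of its members are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical object with a determinate lifetime.
class Connection {
public:
    explicit Connection(std::string peer) : peer_(std::move(peer)) {}

    // Ends the logical lifetime. Under garbage collection the memory
    // stays valid as long as anything still points to the object.
    void terminate() { terminated_ = true; }

    bool terminated() const { return terminated_; }

    // Every operation tests the flag first, so a late access is
    // reported as an error instead of touching recycled memory.
    const std::string& peer() const {
        checkAlive();
        return peer_;
    }

private:
    void checkAlive() const {
        if (terminated_) throw std::logic_error("use of terminated Connection");
    }

    std::string peer_;
    bool terminated_ = false;
};

// Stand-in for a finalizer: reports whether the object had been
// correctly terminated by the time it became unreachable.
bool finalizeCheck(const Connection& c) { return c.terminated(); }
```

A real finalizer would log or abort when the check fails, since an object that becomes unreachable without having been terminated indicates a bug elsewhere in the program.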

If not, then it probably needs some sort of explicit
termination, in order to inform the other objects that it is
no longer usable (and of course, whatever event made it
unusable should trigger termination, and this notification).
If so, then it lives on forever. Conceptually, at
least---when no other object can reach it, its memory can
be recycled for other uses.

It can be reused, cached, the dtor can be called, etc.

Nominally, except for memory management issues, there's no
reason for such an object to have a destructor. If we consider
conceptually infinite memory, and that allocation and
deallocation are not observable behavior, then the fact that an
object with indeterminate lifetime is no longer reachable has no
logical effect on the program. The object just "disappears".

When an object reaches that point in its lifetime it can
decide to safely destroy itself and/or safely reuse/cache
itself for later resurrection and re-initialization;
whatever....

Why does the object have to decide?

The programmer who creates the logic can decide. E.g.:

void object_quiescent(object* const _this) {
  // you can call dtor; delete _this
  // you can cache; object_cache_push(_this);
  // the object is unreachable indeed.
}

The whole point is that for some objects, such lifetime issues
are irrelevant, at least from the design point of view, and
doing anything is extra work for the programmer. And for other
objects, such lifetime issues are very relevant, but they are
totally independent of reachability; the object (or some owning
object) is reacting to an external consideration which
determines the lifetime, and the object's lifetime must be
terminated, immediately, regardless of reachability. The end of
its lifetime is part of the observable behavior of the object.

Or perhaps more to the point: why does the object have
nothing more to do: because it has reached a state from
which it can do nothing more (and so probably requires
explicit termination), or because it normally only has
something to do as a result of requests from another object,
and no other object can reach it? In the latter case, of
course, that state is irrelevant to the object; it's
exterior to the object, and the object (normally) has no way
of knowing, nor should it.

The fact that the GC or reference-count can inform the program
logic that an object is not able to be reached is valuable
information to any lifetime management scheme which deals with
dynamic objects.

I disagree (but the problem may be with regards to what we mean
by "lifetime"). If lifetime is relevant, then its termination
is observable behavior, which must occur at a specific instant,
independently of whether the object can be reached or not
(except that if it cannot be reached, there is no way to inform
it that it must terminate). If lifetime is not relevant (i.e.
termination has no observable behavior), then there's no point
in informing the object about it.

In the first case, garbage collection will not reap the
object if there are any remaining pointers to it, even if
its lifetime has ended; this allows some additional error
checking. In the second case, garbage collection can be
said to play an enabling role; without garbage collection,
somehow, the fact that the object has become unreachable
must be determined manually, so that the object can be
freed. (In many cases, some form of smart pointer will do
the job adequately. In a few, however, it is more
complicated.)

A quiescent-state is like a GC determining that an object can
be reaped, but informing the application and letting it decide
what to do. It can call the dtor, or reuse, etc.

But I really don't see the need, conceptually, except in some
very special cases.

In RCU speak an object is in a quiescent-state after it's
rendered unreachable and has been successfully deferred
through the callback system. All concurrently accessing
threads within the epoch will go through a quiescent-state.
The information backing an epoch can be as fine-grain as an
embedded per-object proxy reference count, or it can be
coarse-grain, per-cpu and/or per-thread; whatever. When an
epoch goes quiescent, all objects contained within it have
also quiesced.

I'm not familiar with this vocabulary, so I'll pass on it.

Check this out:

http://en.wikipedia.org/wiki/Read-copy-update

Does that make any sense?

I did enough web search to determine that it means
read-copy-update. The concept seems related to some compiler
optimization techniques I've seen, in which the compiler
considers "values", rather than the variables that they contain,
but I didn't (and don't) have time now to study it in detail.
My first impression, however, is that we really are talking
about different things when we talk about object lifetime---I'm
talking about the actual design level objects, and you're
talking about specific "generations" or "values" of those
objects---a much lower level concept. That feedback from the
garbage collector can help in such low level optimization, I
have no doubt, but I tend to view such things as happening "under
the hood" for the application level programmer; he just sees his
design object.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34