Re: c++ question regarding exception safety

From: James Kanze <james.kanze@gmail.com>
Newsgroups: comp.lang.c++
Date: Sun, 9 Mar 2008 05:09:22 -0700 (PDT)
Message-ID: <e6d05b5c-a9c9-4fff-a661-7e265a05c7c8@m34g2000hsc.googlegroups.com>

On 9 mar, 00:08, "Alf P. Steinbach" <al...@start.no> wrote:

* James Kanze:

On 8 mar, 16:34, "Alf P. Steinbach" <al...@start.no> wrote:

A good approach might be to study the Java example I linked
to. And I mean really study it. For example, try to answer
these questions:

  * What is the class invariant in that example?

  * Exactly how does the lack of deterministic destruction, in
    that example, influence the choice of class invariant?

OK. But the first thing I see is:

    public DbConnection () {
        //build a connection and assign it to a field
        //elided.. fConnection = ConnectionPool.getInstance().getConnection();
    }

That's the constructor, and the interesting part---the essential
part, in fact---has been elided. If the constructor throws an
exception when it cannot establish the invariants, then there can
be no zombie.

The class invariant may be a bit easier to see by examining
the following (Java) code snippet from the article:

   public void destroy() throws SQLException {
     if (fIsDestroyed) {
        return;
     }
     else{
       if (fConnection != null) fConnection.close();
       fConnection = null;
       //flag that destroy has been called, and that
       //no further calls on this object are valid
       fIsDestroyed = true;
     }
   }

Which doesn't really say anything about the class invariant.
But that's not really the point, is it? We can guess about the
class invariant, but the real question in relationship to
zombies is what the constructor does if it cannot establish it.
If the class invariant is that the class maintains an open
connection (which is probably not an acceptable class invariant
in this particular case, since it can become invalidated during
the life of the object, even if the client code rigorously
respects the contract---but for purposes of demonstration, I'll
accept it), then what does the constructor do if it cannot
establish this invariant?

Implied by that code:

   boolean invariantHolds()
   {
       return
           fIsDestroyed ||
           (fConnection == null || isValidConnection( fConnection ));
   }

The possibility of (fConnection == null) just complicates the
picture.

Agreed. I'd probably have merged this and the bool
fIsDestroyed. Again, however, without seeing the constructor:
if the invariant doesn't allow it, then the constructor should
throw if it can't create a valid connection. If the invariant
does allow for it, then you have to take it into consideration.

Personally, in the context of this example (i.e. ignoring the
fact that the connection can become invalidated during the
lifetime of the object), I'd consider the class invariant:
    ! fIsDestroyed
    && fConnection != NULL
    && isValidConnection( fConnection )
Of course, fIsDestroyed is just a debug device, and isn't
conceptually part of the object to begin with. And in C++, you
might eliminate the pointer, and make the connection a member
object. In which case, the real invariant is just
    isValidConnection( myConnection )

It seems to be due to the author not being sure whether
getConnection() signals failure by returning null or throwing
an exception.

It seems due to the fact that the author hasn't really decided
what his invariants are, and so doesn't know whether to treat
something as an invariant error (i.e. a fatal software
error---something which would trigger an assertion failure in
C++), or as a normal state of the object. Until we know this,
we can't really talk much about whether we might have a zombie
or not.

What is certain, of course, is that if the author decides that
having a valid connection is part of the class invariant, in
Java and in modern C++, he can terminate the constructor with an
exception, and the client code can never access an object which
doesn't meet the invariant.
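
A minimal sketch of that point in C++, assuming the connection is a member object. The Connection class, the tryConnect() helper and the rule that an empty URL means "cannot connect" are all hypothetical, invented for illustration; nothing here comes from the article.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a real database handle.
class Connection {
public:
    explicit Connection(std::string const& url)
        : myUrl(url)
    {
        if (url.empty()) {
            // Assumed failure rule: an empty URL cannot be opened.
            throw std::runtime_error("cannot open connection");
        }
    }
    bool isValid() const { return !myUrl.empty(); }
private:
    std::string myUrl;
};

class DbConnection {
public:
    explicit DbConnection(std::string const& url)
        : myConnection(url)   // throws if the connection cannot be made
    {
        // If we get here, the invariant has been established.
        assert(invariantHolds());
    }
    bool invariantHolds() const { return myConnection.isValid(); }
private:
    Connection myConnection;  // member object, so no null state to consider
};

// Returns true if construction succeeded and the invariant holds,
// false if the constructor threw. In the second case no DbConnection
// object ever became accessible to the caller.
bool tryConnect(std::string const& url)
{
    try {
        DbConnection db(url);
        return db.invariantHolds();
    } catch (std::runtime_error const&) {
        return false;
    }
}
```

The name db is only in scope inside the try block, after the constructor has returned normally, so client code has no way to reach a half-constructed object.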

Another possibility might be that the author envisions some
closeConnection() method in addition to destroy().

So you're coming around to my point of view: the author has
withheld critical information from us---information we need to
really discuss the issue further.

Bad design is bad design. Not defining the exact class
invariants before writing a single line of code is bad design.
(In this case, of course, the author "elided" many things, so we
don't know whether this is bad design, or simply elided
information.)

If we assume that getConnection() throws on failure, and
there's no additional closeConnection() method, then things
can become more clear.

OK. For purposes of demonstration, I'll accept the idea that
"has a valid connection" is part of the class invariant (if
you'll agree to pretend that the connection can't become
invalid prematurely---we're creating a somewhat artificial
example for purposes of demonstration; I think we both agree
that in real life, this particular case would present some
additional complications which we are sweeping under the rug).

For in that case fConnection can't be null and the class
invariant reduces to

   boolean invariantHolds()
   {
       return
           fIsDestroyed || isValidConnection( fConnection );
   }

which can be rewritten, for clarity, as

   boolean isZombie() { return fIsDestroyed; }

   boolean nonZombieInvariantHolds() { return isValidConnection( fConnection ); }

   boolean invariantHolds() { return isZombie() || nonZombieInvariantHolds(); }

I hope you're with me so far in this analysis,

The problem here is that you've slipped the term "zombie" in
with no explanation.

I somewhat suspected that part of our problem might be with
definitions, rather than the underlying principles. My
interpretation of the "demo" program was that the intent was for
the destroy() function to terminate object lifetime. In which
case, all of the logic around fIsDestroyed is debug logic; *if*
the client code conforms to the contract, then it will never
call a member function with fIsDestroyed true, and when
terminate() is called by the system, fIsDestroyed will be true.
IMHO, the correct way of handling such debug code is with
assert(), i.e. in case of an error, you bring the system down
(because you no longer have confidence in the program). Java
doesn't support such, however, so you do what you can.

Note that in that case, of course, your "isZombie()" function
above always returns false.
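
As a rough sketch of what I mean by treating the flag as debug logic, here is a C++ version in which every member function asserts that the object has not been destroyed. The query() function and its counter are hypothetical placeholders for real work. Note one deliberate difference from the Java original: a second call to destroy() is treated here as a contract violation rather than silently ignored.

```cpp
#include <cassert>

class DbConnection {
public:
    DbConnection() : myIsDestroyed(false), myQueryCount(0) {}

    // Ends the logical lifetime of the object.
    void destroy()
    {
        assert(!myIsDestroyed);   // destroying twice is a client error
        myIsDestroyed = true;
    }

    // Placeholder for real work; returns how many queries have run.
    int query()
    {
        assert(!myIsDestroyed);   // use after destroy: bring the program down
        return ++myQueryCount;
    }

private:
    bool myIsDestroyed;  // debug device only, not conceptual object state
    int myQueryCount;
};

// Client code that conforms to the contract: the assertions never fire.
int useCorrectly()
{
    DbConnection db;
    db.query();
    int count = db.query();
    db.destroy();
    return count;
}
```

With correct client code the flag is never observed to be true by any member function, which is why I don't count it as part of the class invariant.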

IMHO, this is an important distinction. When I speak of a
zombie, it's something which may occur even when the client code
is correct, and conforms to the contract. For example, if the
constructor of this object doesn't throw if it cannot establish
the connection. It's a state correct client code has to deal
with (with the emphasis on *correct*).

What you seem to be getting at is something I've always called a
dangling pointer. Using a dangling pointer is an error in
client code.

I think that there are actually three distinct issues involved
here, and I find it clearer to give each a separate name:

-- If the constructor is unable to establish the invariant, but
    still leaves an accessible object, then we have a zombie.
    The solution to this is exceptions.

-- If the lifetime of the object ends (regardless of how, for
    the moment), but the object is still accessible, then we
    have a dangling pointer (or lvalue expression---but in
    practice, the problem will only occur with pointers or
    references). The solution to this is "don't do it".
    Seriously, the solution is that all concerned parties must
    be notified---there is no general solution in either
    language (although some types of smart pointers may help in
    specific cases).

    Note that Java and C++ are exactly the same in this regard
    (although some Java advocates like to pretend that dangling
    pointers can't exist in the language). With the one proviso
    that you *can* implement serious runtime checking for this
    in Java, but not in C++ (not even in C++ with garbage
    collection, since the dangling pointer can be to an object
    with automatic lifetime). It's a weak proviso, however,
    because in practice, Java programmers don't implement such
    checking, and Java doesn't provide anything you can
    reasonably do (e.g. like abort()) if you detect the error.

-- Some objects have very deterministic lifetimes, which must
    be terminated at a very specific instant (or as soon as
    possible---but I find the "very specific instant" to be more
    prevalent in my code). C++ provides an "official" language
    mechanism for this: the destructor---even better, when you
    can arrange for this deterministic lifetime to correspond to
    a scope, C++ will call the destructor automatically for
    you---you don't have to depend on the client code not
    forgetting. In Java, all you have is an ad hoc mechanism,
    and it's up to the client code to conform to the contract;
    in the case where lifetime corresponds to a scope, Java does
    have try/finally, which somewhat simplifies the client code,
    but it is still far from the convenience of C++-like
    destructors. When the lifetime doesn't correspond to an
    automatic scope, of course, you must terminate it explicitly
    in both languages (although C++ has the slight advantage of
    having a language sanctified "official" syntax for this; in
    Java, you never know whether you have to call dispose(), or
    destroy(), or what).

    In practice, if you go back some years, you'll find that
    most of the times an object needed automatic lifetime, it
    was only for memory management or for handling locking.
    Java added language based mechanisms to handle those. As
    we've evolved using C++, however, we (or at least I) have
    found that the basic principle can be very useful in a lot
    of other cases---I probably use it more often for
    transaction management (in the largest sense) than for
    either of the original uses. (But then, I use garbage
    collection---otherwise, I suspect that memory management
    would still predominate. And of course, I also use it for
    handling locks, but those uses are generally isolated in a
    very few higher level mechanisms, like a message queue.
    Whereas I use transaction semantics a lot---an object which
    "undoes" everything if the function "commit()" hasn't been
    called on it before the destructor is called.)

Anyhow, three separate issues, with three different names.
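
To illustrate the transaction use mentioned in the third point, here is a minimal sketch of an object which "undoes" everything unless commit() has been called before its destructor runs. The Transaction class, the appendAll() function and the choice of a std::vector<int> as the thing being modified are assumptions made up for the example; a real transaction object would wrap whatever state needs rolling back.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Restores the target vector to its initial contents on destruction,
// unless commit() was called first.
class Transaction {
public:
    explicit Transaction(std::vector<int>& target)
        : myTarget(target), mySnapshot(target), myCommitted(false) {}

    ~Transaction()
    {
        if (!myCommitted) {
            myTarget.swap(mySnapshot);  // swap cannot throw: safe in a destructor
        }
    }

    void commit() { myCommitted = true; }

private:
    Transaction(Transaction const&);             // not copyable
    Transaction& operator=(Transaction const&);

    std::vector<int>& myTarget;
    std::vector<int> mySnapshot;
    bool myCommitted;
};

// Appends all of the values, or none of them. Negative values are
// rejected (an arbitrary rule, just to have a failure path).
void appendAll(std::vector<int>& target, std::vector<int> const& values)
{
    Transaction guard(target);
    for (std::size_t i = 0; i != values.size(); ++i) {
        if (values[i] < 0) {
            throw std::invalid_argument("negative value");
        }
        target.push_back(values[i]);
    }
    guard.commit();  // reached only if every append succeeded
}
```

Because the lifetime of guard corresponds to the scope of appendAll(), the rollback happens on every exit path, including the exceptional one, without the client code having to remember anything.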

because there's no point going further without agreeing on the
above conclusion. Namely, that we have a constructor that
signals failure (not able to establish class invariant) by
throwing, that we have something that can reasonably be called
a class invariant that holds for any constructed object (I
find it more clear to refer to that something as a /meta/
class invariant, and reserve plain "class invariant" for what
the function nonZombieInvariantHolds() checks), and yet we
have a zombie.

Would you also call it a zombie in C++ if someone did:

    DbConnection* p = new DbConnection(...) ;
    // ...
    delete p ;
    p->...

If so, then I think we'll just have to agree to disagree on the
terminology. If not, what's the difference between this, and:

    DbConnection p = new DbConnection( ... ) ;
    // ...
    p.destroy() ;
    p. ...

in Java? The only difference I see is syntax.
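
As an example of a smart pointer helping in one specific case, here is a sketch using std::shared_ptr and std::weak_ptr. It assumes a C++11 library (boost::shared_ptr and boost::weak_ptr offer the same interface), and the query() function returning 42 and the queryIfAlive() helper are hypothetical. It only works if every owner of the object holds it through a shared_ptr, so it is not a general solution.

```cpp
#include <cassert>
#include <memory>

class DbConnection {
public:
    int query() { return 42; }  // placeholder for real work
};

// Returns the query result, or -1 if the object's lifetime has ended.
int queryIfAlive(std::weak_ptr<DbConnection> const& observer)
{
    // lock() yields an empty shared_ptr once the object is gone, so the
    // "dangling" case is detected instead of being undefined behavior.
    if (std::shared_ptr<DbConnection> p = observer.lock()) {
        return p->query();
    }
    return -1;
}
```

A typical use would hold the owning shared_ptr in one place and hand weak_ptr observers to everyone else; after the owner calls reset(), queryIfAlive() returns -1 rather than touching a dead object.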

Point of possible contention: where
fIsDestroyed might be set to true, and why.

Hint: it's not in the constructor.

The real point of contention is whether fIsDestroyed is
conceptually part of the object state, or whether it is simply
debugging code. IMHO, after p.destroy(), you don't have a
zombie object, you have a dangling pointer. I think you're
being misled by the fact that Java doesn't have direct language
support for managing object lifetime, and C++ does. Whereas I
don't see that as being a real difference---objects have
lifetimes, and some objects have very deterministic lifetimes,
regardless of what the language says about them. And a pointer
or a reference to an object which is no longer alive is a
dangling pointer---regardless of whether the memory which once
held that object is available for re-allocation or not.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34