Re: Are throwing default constructors bad style, and if so, why?

From: Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org>
Newsgroups: comp.lang.c++.moderated
Date: Sun, 28 Sep 2008 15:41:15 CST
Message-ID: <K7wtsq.1AF8@beaver.cs.washington.edu>

Dave Harris wrote:

SeeWebsiteForEmail@erdani.org (Andrei Alexandrescu) wrote (abridged):

But I hope you'll also agree that an object holding a scarce
resource should have a no-throw "enter disposed state" operation
that is performed during destruction and after resource disposal?

No. What's the point? After destruction the object can't be accessed
anyway.

Of course it can, via a dangling pointer. That is the problem.

The object's type is not well-defined; its vtable is zapped.

And that doesn't help either!

If the object has a dispose() function, then it may be convenient to
implement the destructor in terms of it, but that is entirely a private
matter and it may be more efficient to implement the destructor more
directly.

Did you write "destruction" when you meant "disposal" again?

I meant both. The thing is, if you want to take advantage of GC and
deterministic resolution, there is that darn extra state that must exist.

(Also, although it would be nice if dispose() was always no-throw, I
think there are probably situations where the act of correctly freeing a
resource can fail.)

That's fine. All destructors should be able to throw anyway.

Do you also agree that that state should be checked for in all
of that object's member functions?

No.

I think it is OK if, for example, attempting to read from a closed file
yields undefined behaviour. In debug builds it should assert, but I don't
think that's the kind of check you had in mind. In release builds it
ought to do something reproducible, but the caller should not rely on it.
Using a disposed object is generally a bug.

I agree it is a bug. I just want to make the bug not trash the program
arbitrarily.
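
One way to keep such a bug from trashing the program is to record the disposed state explicitly and test it on entry to each member function. What follows is only a minimal sketch, not code from this thread: CheckedFile and its members are hypothetical names, and throwing std::logic_error is just one possible well-defined reaction (an assert in debug builds would be another).

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>

// Hypothetical wrapper used only for illustration.
class CheckedFile {
public:
    // Takes ownership of an already opened stream; a null stream counts as disposed.
    explicit CheckedFile(std::FILE* f) : handle_(f) {}
    ~CheckedFile() { dispose(); }
    CheckedFile(const CheckedFile&) = delete;
    CheckedFile& operator=(const CheckedFile&) = delete;

    // Releases the resource and enters the disposed state. Safe to call twice.
    void dispose() noexcept {
        if (handle_) {
            std::fclose(handle_);
            handle_ = nullptr;
        }
        assert(is_disposed());
    }

    bool is_disposed() const noexcept { return handle_ == nullptr; }

    // Every operation checks the state first, so misuse is reported
    // in a reproducible way instead of being undefined.
    int read_char() {
        if (is_disposed()) throw std::logic_error("read from disposed file");
        return std::fgetc(handle_);
    }

private:
    std::FILE* handle_;
};
```

Reading after dispose() is still a bug in the caller, but it now surfaces as an exception rather than as a crash somewhere else.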

One thing that I'm not really getting from your posts is a strong sense
of the difference between validity at the language level and validity at
the application level, and I think this is an important distinction to
make. A disposed object has well-defined semantics at the language level,
but not necessarily from the application level. We use well-definedness
at the language level to detect bugs which lead to contractually
undefined behaviour at the application level.

Exactly. If the language provides well-definedness in all or most or
many cases, we can debug incorrect applications more easily. If our incorrect
applications also become undefined at the language level, then we can't
count on anything in debugging the program.

Here is the thought that is at the back of a lot of my thinking. In a
non-GC environment, when you have finished with an object you must call
delete in order to avoid memory leaks.

Yah.

In a GC environment, there is an
argument for using dispose() instead, in order to get consistent
behaviour if the object is mistakenly accessed subsequently.

I am not so sure about that. "Subsequently" after what? In a GC program,
the programming model has it that all values live forever. There is no
"subsequently". If you want to put the object in some "empty" state at
some point, fine.

In GC-agnostic code, it might be reasonable to test whether GC is actually
present to decide which to use, and to encapsulate this so the client
doesn't know whether they are disposing or deleting. We might have a
shared_ptr<> that either deletes or disposes, for example.
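
A smart pointer that either deletes or disposes could be approximated with a custom deleter. This is a sketch under stated assumptions: gc_in_action() is modeled by a plain global flag (g_gc_enabled, a hypothetical name), Resource and make_managed are made up for illustration, and no real collector is involved.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins: a real program would query its collector instead.
bool g_gc_enabled = false;
bool gc_in_action() { return g_gc_enabled; }

int g_destroyed = 0;  // counts destructor calls, for illustration only

struct Resource {
    bool disposed = false;
    void dispose() noexcept { disposed = true; }  // release resources, keep memory
    ~Resource() { ++g_destroyed; }
};

// Deleter that disposes when a collector owns the memory and deletes otherwise.
template <class T>
struct DisposeOrDelete {
    void operator()(T* p) const noexcept {
        if (!p) return;
        if (gc_in_action()) p->dispose();
        else delete p;
    }
};

// The client never learns which of the two actions will be taken.
template <class T>
std::shared_ptr<T> make_managed(T* raw) {
    return std::shared_ptr<T>(raw, DisposeOrDelete<T>());
}
```

The client just drops the last shared_ptr; whether that disposes or deletes is decided inside the deleter.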

Why not make delete do dispose in a GC environment? (Not a rhetorical
question.)

So from this client's point of view, dispose and delete are similar and
almost interchangeable. Dispose is just a version of delete which takes
advantage of GC to get consistent behaviour in the face of bugs, but
might be silently mapped to delete (by shared_ptr) so we can't do
anything with a disposed object.

The only extra thing you need to do is to write your destructor so as to
put the object in a defined state post destruction.

File::~File() {
     if (!haendel) return;                // already disposed: nothing to release
     fclose(haendel);
     if (gc_in_action()) haendel = NULL;  // under GC, leave a defined state behind
}

This issue of how GC and non-GC code should interact is quite thorny.
When you advocate doing more with the post-dispose state, would you agree
this reflects a view which is GC-aware rather than GC-agnostic?

I do. People on the standardization reflector mailing list have
expressed quite a few solid concerns about GC-agnosticity.

I tried to simplify things and make that state identical with the
default-constructed state. It turns out that that is a bad idea.

I am not really up to date on current move-semantics thinking, but there
seems to be a strong analogy between the disposed state and the
moved-from state:

o Moved-from objects can, at minimum, be destroyed. This is also
   the minimum we need for disposed objects.

o If an object is assignable, then it remains assignable even after
   it has been moved-from. (Many applications of move semantics rely
   on this, eg for swap().) If we are providing dispose() as a client
   utility rather than just the GC equivalent of delete, then we need
   this for disposed objects too so "assignable" objects are uniform.
   Assignment is thus a way to explicitly resurrect a disposed object.

o And that's pretty much it for moved-from objects: their state is so
   indeterminate that the only things you can do with them are delete
   them or assign to them a whole new state. Disposed objects are
   similar; you can't do much with them as they don't have the
   resources they need for full functioning.
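
The reliance of swap() on assigning to moved-from objects can be seen in a few lines. This is a sketch in C++0x-style syntax; move_swap and move_swap_example are hypothetical names, and std::swap is normally what you would call.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Swap written with moves only.
template <class T>
void move_swap(T& a, T& b) {
    T tmp(std::move(a));  // a is now in a moved-from state
    a = std::move(b);     // assignment must work on the moved-from a
    b = std::move(tmp);   // and likewise on the moved-from b
}

// Small usage example.
inline void move_swap_example() {
    std::string x = "left", y = "right";
    move_swap(x, y);
    assert(x == "right" && y == "left");
}
```

If assignment to a moved-from (or disposed) object were not allowed, the second and third lines of move_swap would be invalid.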

Ironically, disposed objects are slightly better off because we /know/
they are impoverished. We know they don't own resources; a moved-from
object can still own. This could make disposed objects a bit more
predictable; it makes sense to have an is_disposed() method, for example,
where it doesn't make sense to have an is_moved(). However, although such
could be useful for debugging I don't know if I'd make it part of a
"Disposable" concept.

On the face of it, x.dispose() is more or less equivalent to
move-constructing a temporary from x and then destroying it, leaving x in
the pilfered moved-from state. If you want to unify language features,
maybe this is an area to look at.

Yah, there is quite a lot of similarity. There is one other difference:
the destructor of a moved-from object will still be called. A zombie
will not have its destructor called.
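
The first half of that difference is easy to observe without any GC. In the sketch below (Tracked and the two counters are hypothetical names) the destructor runs for the moved-from object too, so it has to cope with the pilfered state. The zombie half cannot be shown without an actual collector.

```cpp
#include <cassert>
#include <utility>

int g_dtor_calls = 0;  // how many destructors ran
int g_releases = 0;    // how many of them actually released something

// Hypothetical type that logs what its destructor does.
struct Tracked {
    bool owns = true;
    Tracked() = default;
    Tracked(Tracked&& other) noexcept : owns(other.owns) { other.owns = false; }
    ~Tracked() {
        ++g_dtor_calls;          // runs for moved-from objects as well
        if (owns) ++g_releases;  // but only the owner releases
    }
};

// Small usage example: two objects, two destructor calls, one release.
inline void tracked_example() {
    g_dtor_calls = 0;
    g_releases = 0;
    {
        Tracked a;
        Tracked b(std::move(a));
        assert(!a.owns && b.owns);
    }
    assert(g_dtor_calls == 2 && g_releases == 1);
}
```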

Andrei
