Re: A few questions on C++
Kai-Uwe Bux wrote:
James Kanze wrote:
Kai-Uwe Bux wrote:
James Kanze wrote:
On Sep 21, 10:58 am, Kai-Uwe Bux <jkherci...@gmx.net> wrote:
James Kanze wrote:
On Sep 19, 3:10 pm, "Phlip" <phlip...@yahoo.com> wrote:
D. Susman wrote:
[snip]
2) Should one check a pointer for NULL before deleting it?
No, you should use a smart pointer that wraps all such checks
up for you.
Why? What does a smart pointer buy you, if all it
does is an unnecessary test?
Don't forget, too, that most deletes are in fact
"delete this". And "this" cannot be a smart pointer.
Are you serious?
Yes. Most (not all) objects are either values or entity
objects. Value objects aren't normally allocated
dynamically, so the question doesn't arise. And entity
objects usually (but not always) manage their own
lifetime.
Most of my dynamically allocated objects are used to
implement container-like classes (like a matrix class),
wrappers like tr1::function, or other classes providing
value semantics on the outside, but where the value is
encoded in something like a decorated graph.
The internally allocated nodes do not manage their own
lifetime: they are owned by the ambient
container/wrapper/graph.
That is one of the cases where "delete this" would not be used.
But it accounts for how many deletes, in all? (Of course, in a
numerics application, there might not be any "entity" objects,
in the classical sense, and these would be the only deletes,
even if they aren't very numerous.)
It's not just numerics. But numerics applications are
definitely a very good example of what I had in mind. I think
that a lot of scientific computing looks like this.
Yes. I think the difference is that you're using the computer
to calculate. I've worked in a number of different domains,
but in all cases, while there was some calculation, the computer
was mainly being used to process large data sets in some
systematic and logical way. The calculations were only a very
small part of the application.
[...]
As you can see, it's just a trivial filter; and all the real
code is in the library. That, in turn, is templated for
flexibility. E.g., the matrix class is supposed to work just
as nicely with infinite precision integers, and an algorithm
picking out the maximal elements (with respect to some partial
order) from a sequence should be generic.
Presumably, too, the matrix class is very, very stable.
In at least some such uses, maintenance isn't that important
either; once you've gotten the results from a program, you don't
use it any more. (That has been the case in the few somewhat
distant contacts I've had with such software; the basic library
is a constant---maintained, but highly stable---and the
individual applications usually run just once or twice. But my
contacts with this type of software are few enough that I doubt
they have any statistical significance.)
As you have figured, it is somewhat like number crunching
(except that I am dealing more with topological and
combinatorial algorithms, so enumerating all objects of a
given size and type is a typical thing that happens in my
code).
Now, with respect to huge applications, I see that templates
are an issue. On the other hand, I thought, that is what
nightly builds are for: You have a bug to fix, you locate it,
you add a unit test for the failing component that displays
the bug without using all the unrelated crap from the huge
ambient application; and then you work on that component until
it passes all tests. After a commit to the code base, the huge
application is rebuilt over night and all automatic tests are
run. Working on your component in isolation, you still have
short edit-compile-test cycles.
Nightly builds suppose that you can compile the entire
application, on all target platforms, overnight. That's not
necessarily the case.
Most of the places I've worked at do try to do a weekly build
over the weekend, but I've worked on projects large enough that
even that required some optimization (linking on the servers,
compiling in parallel on the hundreds of workstations connected
to the network, etc.).
Where the problem really hits is the individual developer, who
needs to run systematic unit tests for every small modification.
Touching a header which contains a template which he uses (in a
library) may trigger a recompilation of an hour or so, rather
than just a couple of minutes.
[...]
More to the point, I'm thinking of commercial applications. I
sort of think you may be right with regard to numerical
applications.
By "commercial", do you mean "software for sale" or "software
used in the sales department" :-)
Software used for commercial applications. Not just the sales
department, but yes, software dealing with external entities
such as customers, products or employees.
I agree that programs that act in complex environments and
have to respond to a thousand different kinds of events will use
objects to model the world they operate in (I am thinking of
transactions between banks, simulations, GUI, games, etc). On
the other hand, programs that perform highly complicated
transformations in batch mode are likely to be different.
Exactly. Except that in the commercial world, the programs
which perform transformations in batch mode are still usually
written in Cobol, not in C++:-). And even in batch mode, it's
often relevant to think in terms of the behavior of specific
entities.
This is, of course, what I mean when I speak of an object having
an explicit lifetime. A "CustomerOrder" doesn't belong to any
other entity in the application; it has an explicit lifetime,
based on external events.
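Something along these lines, say (the class and the event are
hypothetical, just to illustrate the idiom):

    // Hypothetical sketch: an entity object whose lifetime is
    // driven by external events, not by any owning object.
    class CustomerOrder
    {
    public:
        static CustomerOrder* create()
        {
            return new CustomerOrder;
        }

        // called when the external "order completed" event arrives
        void onCompleted()
        {
            // ... final processing, notifications ...
            delete this;    // end of the object's logical lifetime
        }

    private:
        CustomerOrder() {}
        ~CustomerOrder() {} // private destructor: forbids both stack
                            // allocation and an external delete
    };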
That would go for most of number crunching, scientific
programming, compilers, symbolic computation, combinatorial
optimization, and so on. I expect the code for those to be
more similar to mine than to yours. There are programs for
sale in all these categories (but I would not expect a typical
sales department to make heavy use of a PDE solver).
The closest I've gotten to that is when I wrote a compiler.
Thinking back on it... It was long enough ago to be in C, but
even in C++, I think that yes, things like a parse tree would
have a lifetime which was managed by some external object or
condition; logically, perhaps, the parse tree might even have an
"automatic" lifetime, but since its size and structure are
very, very dynamic, of course, dynamic allocation would have to
be used.
I think the difference is not commercial versus
non-commercial, but more whether your application is
event-driven or has the classical (ancient?)
parse_input....write_output format.
I think you've hit on it. There are doubtless exceptions;
classical batch applications which do some sort of event
simulation in their processing, or commercial batch applications
which implement business logic over business entities, for
example. But by and large, your characterization probably
holds.
Think of a smart pointer that does not interfere with
lifetime but helps with the typical problems when pointers are
used for navigation. E.g., you can wrap the observer pattern
into a smart pointer so that all those objects that have a
handle to a potentially suicidal one get notified just before
it jumps off the cliff.
In theory. In practice, it tends to be more complicated; the
smart pointer isn't sufficient, and once you've implemented the
additional stuff, it isn't necessary. Thus, for example, in the
observer pattern, the observable normally doesn't have a
"pointer" to the observer, but a container of pointers. And
when one of the observers commits suicide, not removing the
pointer from the container results in a dangling pointer.
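Concretely, the hand-written version ends up looking something
like this (the names are mine, and the notification loop still has
to be hardened if an observer can die during an update):

    #include <set>

    class Observer;

    class Observable
    {
    public:
        void attach( Observer* o ) { observers.insert( o ); }
        void detach( Observer* o ) { observers.erase( o ); }
        void notifyAll();

    private:
        std::set< Observer* > observers;
    };

    class Observer
    {
    public:
        explicit Observer( Observable& subject )
            : subject( subject )
        {
            subject.attach( this );
        }
        virtual ~Observer()
        {
            // without this, a suicidal observer leaves a dangling
            // pointer in the observable's container
            subject.detach( this );
        }
        virtual void update() = 0;

    private:
        Observable& subject;
    };

    inline void Observable::notifyAll()
    {
        for ( std::set< Observer* >::iterator it = observers.begin();
                it != observers.end(); ++ it ) {
            (*it)->update();    // unsafe if update() can modify the
                                // set; that is the hard part
        }
    }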
Some fifteen years ago, when I started C++, there was a lot of
discussion (at least where I was working) about relationship
management, and a lot of effort was expended trying to find a
good generic solution, so that you didn't have to write so much
code by hand, each time around. As far as I know, however, no
good generic solution was ever found.
[...]
Obviously, something like
std::vector<> won't use delete this for the memory it manages.
Something that primitive probably won't use a classical smart
pointer, either, but I guess more complex containers might.
I don't really like smart pointers there either.
It's not that I don't like them; when they are appropriate, I
don't hesitate using them. I don't like them being presented as
a silver bullet, as they so often are. Nor do I like the fact
that many people are suggesting that you should never use raw
pointers, or that there is one magical smart pointer
(boost::shared_ptr) that will solve all (or even most) of your
problems.
However, they are really handy in getting a prototype up and
running, which is a good thing during the design phase when
you are experimenting with the interface and whipping up the
initial test cases. When the design is stabilizing, I tend to
first replace smart pointers (and raw pointers) by pointer
wrappers that support hunting double deletion and memory
leaks, and finally by pointer wrappers that wrap new and
delete and provide hooks for an allocator to be specified by
the client code.
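A minimal sketch of the double-deletion-hunting kind (the real
wrappers have more instrumentation, plus the allocator hooks):

    #include <cassert>
    #include <set>

    // Sketch: a checking wrapper that tracks every pointer it has
    // handed out, so deleting twice (or deleting something it never
    // allocated) trips an assertion instead of corrupting the heap.
    template< typename T >
    class CheckedPtr
    {
    public:
        explicit CheckedPtr( T* p = 0 )
            : p( p )
        {
            if ( p != 0 ) {
                live().insert( p );
            }
        }

        void destroy()
        {
            if ( p != 0 ) {
                assert( live().erase( p ) == 1 );   // fires on
                                                    // double deletion
            }
            delete p;
            p = 0;
        }

        T& operator*() const { return *p; }
        T* operator->() const { return p; }

    private:
        T* p;

        // all currently live allocations handled by the wrapper;
        // anything still here at program exit is a leak
        static std::set< void* >& live()
        {
            static std::set< void* > instance;
            return instance;
        }
    };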
Interesting. I use external tools for much of this. For memory
management within the process, I usually use the Boehm
collector; why should I have to worry about which smart pointer
to use, or how to break a cycle, *IF* the object doesn't have an
explicit lifetime (i.e. if there is no explicit behavior
associated with its ceasing to exist). For the rest, I'll use
Purify or valgrind, or even my own debugging new/delete.
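With the Boehm collector's C++ interface, the usual pattern is
roughly this (a sketch; the header's location can vary with the
installation):

    #include <gc_cpp.h>     // Boehm collector; sometimes <gc/gc_cpp.h>

    // Classes derived from gc are allocated on the collected heap;
    // objects without an explicit lifetime simply never get deleted.
    class Node : public gc
    {
    public:
        Node* next;
        Node() : next( 0 ) {}
    };

    int main()
    {
        Node* head = new Node;      // collected once unreachable
        head->next = new Node;
        head = 0;                   // both nodes eventually
                                    // reclaimed; no delete anywhere
    }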
In the applications I work on, of course, such low level library
code represents something like 1% or 2% of the total code base.
And for the most part, we don't write it; the standard
containers are sufficient (with wrappers, in general, to provide
a more convenient interface).
They obviously don't apply to entity objects, whose
lifetime must be explicitly managed. And how many other
things would you allocate dynamically?
I forgot to mention one other reason to use T* instead of T:
in template programming, the first is available for incomplete
T. For instance, there are two obvious implementations of the
box container (a box can be empty or contain a single item; I
think such a container is sometimes called fallible or
optional). One implementation has a T* data field and the
other has a T data field. The first will work with incomplete
T; the second won't. When doing template programming, one has
to be aware of the conceptual requirements created by an
implementation approach. Sometimes, that forces or suggests
dynamic allocation.
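For illustration, the two implementations might look like this
(Box is just my name for it):

    // Version 1: T* data field. Box< T > can be instantiated with
    // an incomplete T, at the cost of one dynamic allocation per
    // non-empty box.
    template< typename T >
    class Box
    {
    public:
        Box() : p( 0 ) {}
        explicit Box( T const& value ) : p( new T( value ) ) {}
        Box( Box const& other )
            : p( other.p != 0 ? new T( *other.p ) : 0 ) {}
        ~Box() { delete p; }

        bool empty() const { return p == 0; }
        T&   get()         { return *p; }

    private:
        T* p;
        Box& operator=( Box const& );   // omitted for brevity
    };

    // Version 2: T data field. No allocation, but T must be
    // complete (and, in this naive version, default constructible)
    // wherever Box2< T > is instantiated.
    template< typename T >
    class Box2
    {
    public:
        Box2() : data(), full( false ) {}
        explicit Box2( T const& value ) : data( value ), full( true ) {}

        bool empty() const { return ! full; }
        T&   get()         { return data; }

    private:
        T    data;
        bool full;
    };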
Very good point. Writing good, generic templates is difficult,
and typically, you do end up violating a number of rules that
would apply elsewhere. The results are often much more
difficult to understand, as well. Yet another reason why a lot
of companies don't like templates at the application level. (In
most of my applications, the majority of the application
programmers are domain specialists, and not C++ or software
engineering specialists. My role in such projects is usually to
handle such low level or generic stuff in such a way that the
application programmers don't have to worry about it.)
[...]
I did not want to argue for or against "delete this". I can
see how this idiom is useful. I was just flabbergasted by your
claim that most deletes are of this form. But now, I can see
where you were coming from.
Yes. There was also some intentional hyperbole in my statement.
There are definitely cases where "delete this" isn't the rule.
I've worked on business systems, for example, where all of the
actual deletes of entity objects were in the Transaction
object---an on-stack object which acted as a "temporary" owner
for objects involved in the transaction. The object was
"logically" destructed during the transaction, but the actual
delete only occurred at commit. It's very difficult to roll
back an object that has really been deleted. (Of course, this
might be handled by a "delete this" in the commit function of
the object.)
However, it is somewhat funny that "delete this" looks scary
enough that people invent roundabout ways to avoid it.
In a very real sense, there's something scary about any delete;
you have to be very sure that no one else is using the object,
and that all concerned parties are notified. A "delete this"
is really no different in this respect. And most of the time I've
seen people try to avoid it, per se, they end up just hiding
it---obfuscating the potential problems; there is no fundamental
difference between "delete this" and
"ObjectManager::instance().removeObject( this )", except that it
is far more explicit in the first case that the object won't
exist after the statement.
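To make that concrete (ObjectManager and Session are
hypothetical, purely for illustration):

    #include <set>

    class Session;

    // Hypothetical manager: remove() just hides the delete.
    class ObjectManager
    {
    public:
        static ObjectManager& instance();
        void add( Session* s );
        void remove( Session* s );

    private:
        std::set< Session* > objects;
    };

    class Session
    {
    public:
        void endExplicitly()
        {
            delete this;    // obviously fatal: *this is gone after
                            // this line
        }
        void endViaManager()
        {
            // equally fatal, but the reader has to look inside
            // remove() to realize it
            ObjectManager::instance().remove( this );
        }
    };

    inline ObjectManager& ObjectManager::instance()
    {
        static ObjectManager theInstance;
        return theInstance;
    }

    inline void ObjectManager::add( Session* s )
    {
        objects.insert( s );
    }

    inline void ObjectManager::remove( Session* s )
    {
        objects.erase( s );
        delete s;           // the hidden "delete this"
    }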
--
James Kanze (GABI Software)             email: james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34