Re: Garbage collection in C++
On Nov 19, 5:10 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
On Nov 18, 5:28 am, James Kanze <james.ka...@gmail.com> wrote:
On Nov 18, 5:23 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
On Nov 17, 4:18 am, James Kanze <james.ka...@gmail.com> wrote:
On Nov 16, 1:24 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
I really haven't ever felt the need for a GC engine in my
work. Could a GC engine have made my job easier in a few
cases? Maybe. I can't say for sure. At most it could have
perhaps saved a bit of writing work, but not increased the
correctness of my code in any way. C++ makes it quite easy
to write safe code when you follow some simple rules.
Yes and no. C++ certainly provides a number of tools which
can be used to improve safety. It doesn't require their
use, however, and I've seen a lot of programmers who don't
use them systematically. And of course, human beings being
what they are, regardless of the tools or the process,
mistakes will occasionally creep in.
Yes. However, garbage collection is /only/ going to reclaim
memory, eventually. It's not going to correct the logical and
potentially far more serious design bugs that leaked memory in
the first place.
Woah. If the error is leaked memory, garbage collection may
correct it. Or it may not, depending on whether there is
still a pointer floating around to the memory. (The Java
bug database has more than a few cases of memory leaks in
it.)
That's not what garbage collection is for. Garbage
collection isn't designed to make an incorrect program
correct---I don't think any tool can guarantee that.
And I never claimed GC does or was designed to correct errors.
One can see from the attached context it was you who posited
"mistakes will occasionally creep in", not I. If you did not
mean memory leaks what did you mean?
Nothing in particular. Just that regardless of the technique
used, code written by human beings will contain errors; your
development process should be designed to detect and remove them
as far upstream as possible. (This in response to your claim
that garbage collection masks errors, whereas in fact, it makes
the detection of some errors, like dangling pointers, possible.)
Garbage collection (like all of the other tools I know) is
designed to make it easier to write a correct program. It
also makes the effects of some errors (dangling pointers)
less critical.
And it's exactly by lessening the effects of some errors that
GC can actually /hide/ those errors.
No. It is exactly by lessening the effects of those errors that
it makes their detection possible. (And also prevents them from
being used as a security hole---very important if you're
connecting to the web.)
Perhaps I'm missing some crucial point. Let me put a toy
example together:
C++
Foo * x = new Foo() ;
//in a code far far away a reference is squirreled away
Foo * y = getX() ;
//time passes, we want x to never be used again
delete x ;
//in a code far far away the squirrel digs up his nut
y->activate() ;
Java
Foo x = new Foo() ;
//in a code far far away a reference is squirreled away
Foo y = getX() ;
//time passes, we want x to never be used again so what do
//you put here to indicate this? Roll your own "zombify"?
//in a code far far away the squirrel digs up his nut
y.activate() ;
In the C++ version, Purify (or similar) will catch the
dangling pointer or if it sneaks by (as you say "mistakes will
creep in") you have at least some chance that the code cores
and reveals the error. In Java (and in GC in general?) you will
never know. What am I missing?
Purify will catch the error, but delivered code doesn't run
under Purify, so if the error doesn't show up in your test
cases, you're hosed without garbage collection; you have
undefined behavior, and while it might core dump, it might also
do anything else, including (as has actually happened in one
case) allowing someone connected to your server to break into
your machine (and if the server is running as root, to do pretty
much anything it wants with root privileges). With garbage
collection, of course, there is no undefined behavior; you set
whatever bits you need to identify the error in the
deconstructed object, and you test them with each use of the
object, handling the detected error however you think best. (I
like assert for this, so I know I get the core dump.)
The problem is that in C++, when you deconstruct an object, you
also free the memory, and that memory can be reused for another
object, so you can't guarantee any state which would identify it
as having been deconstructed. When you deconstruct an object
and are using garbage collection, you can scribble all over the
object, overwriting it with values that can't possibly be legal
vptr's, and you can be moderately sure that those values won't
be overwritten as long as the ex-object is still accessible.
This is a case where garbage collection is necessary for maximum
robustness. But it obviously doesn't solve everything. You
can still dangle pointers to local objects, and a rogue pointer
can still overwrite anything. In the end, the real question is
how much undefined behavior can you accept; in my experience,
undefined behavior is a sure recipe for reduced robustness. And
garbage collection removes one (and regretfully only one)
potential source of undefined behavior.
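To make the zombie check concrete, here is a minimal sketch, not code from this thread. Foo, dispose(), and checkAlive() are hypothetical names, and std::shared_ptr stands in for a collector only in the sense that the memory stays valid while a reference to it exists. It throws rather than asserts so the detection can be observed without a core dump.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical class for illustration only.
class Foo {
public:
    // Logical end of life: the object becomes a detectable zombie, but
    // its memory stays valid for as long as a shared_ptr still refers to it.
    void dispose() { alive_ = false; }

    void activate() {
        checkAlive();
        ++activations_;
    }

    int activations() const { return activations_; }
    bool alive() const { return alive_; }

private:
    void checkAlive() const {
        // An assert would also fit here; an exception keeps the sketch testable.
        if (!alive_) throw std::logic_error("use of disposed Foo");
    }

    bool alive_ = true;
    int activations_ = 0;
};

// Replays the toy scenario; returns true if the stale use was detected.
bool staleUseIsDetected() {
    std::shared_ptr<Foo> x = std::make_shared<Foo>();
    std::shared_ptr<Foo> y = x;   // a reference squirreled away elsewhere
    assert(x->alive());
    x->dispose();                 // "we want x to never be used again"
    x.reset();                    // the owner drops its own reference
    try {
        y->activate();            // defined behavior: the check fires
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

With a raw pointer and delete, the same y->activate() would be undefined behavior, and the flag could not be trusted because the memory might already have been reused.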
In fact, garbage collection can and does hide bugs exactly
by allowing access to objects that should not be accessed
thus actually reducing correctness. How do you respond to
this?
How should I respond to some wild and erroneous claim? In
fact, garbage collection helps to detect bugs; it is
necessary in order to effectively detect precisely the bug
you describe.
Really? Please explain how GC helps rather than hinders in the
toy scenario I gave above.
Just did. It replaces undefined behavior with defined behavior.
Which you can define to do whatever is appropriate.
Without garbage collection, it's undefined behavior; with
garbage collection, it's a testable condition.
What does GC help you test exactly? Zombie access?
For zombies resulting from deconstruction. (Most of the time, I
think, zombie state is used to describe objects which you
weren't able to correctly construct to begin with. For those,
of course, exceptions provide the solution.)
The various C++ techniques that of course are familiar to
you for managing memory deterministically not only help
one prevent garbage memory,
They also require additional work.
And they can have additional benefits.
In certain cases, certainly. When they have enough additional
benefits to offset the additional cost, fine; the presence of
garbage collection doesn't prevent their use. Most of the time,
this isn't the case, however.
they also help one properly manage other scarce resources
which garbage collection does nothing for. How do you
respond to this?
Different "resources" have different constraints. Garbage
collection is fine for memory. RAII often (usually?) works
well for locks and such. You need explicit, programmer
controlled management for resources such as open files,
where "release" can fail. One size doesn't fit all.
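A rough sketch of that distinction, under stated assumptions: Counter and OutputFile are hypothetical classes invented for illustration. The lock is released by RAII because unlocking cannot fail, while the file has an explicit close() so the caller can see a failed flush.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <stdexcept>
#include <string>

// RAII suits the lock: releasing a mutex cannot fail.
class Counter {
public:
    void increment() {
        std::lock_guard<std::mutex> guard(mutex_);  // released at scope exit
        ++value_;
    }
    int value() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }
private:
    mutable std::mutex mutex_;
    int value_ = 0;
};

// Hypothetical wrapper: the caller closes explicitly so that a failed
// flush is reported; the destructor is only a last-resort fallback.
class OutputFile {
public:
    explicit OutputFile(const std::string& path)
        : file_(std::fopen(path.c_str(), "w")) {
        if (!file_) throw std::runtime_error("cannot open " + path);
    }
    OutputFile(const OutputFile&) = delete;
    OutputFile& operator=(const OutputFile&) = delete;
    ~OutputFile() { if (file_) std::fclose(file_); }  // any error is lost here

    bool write(const std::string& text) {
        assert(file_ && "write after close");
        return std::fputs(text.c_str(), file_) >= 0;
    }

    // Returns false if the file was already closed or fclose reported an error.
    bool close() {
        if (!file_) return false;
        int rc = std::fclose(file_);
        file_ = nullptr;
        return rc == 0;
    }
private:
    std::FILE* file_;
};

// Usage: the result of close() is part of the function's result.
bool saveGreeting(const std::string& path) {
    OutputFile out(path);
    return out.write("hello\n") && out.close();
}
```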
That's why there is a large toolbox of deterministic tools.
And that's why adding an additional tool fits into the model so
well.
As an aside, it seems that the simple constructor/destructor
paradigm has proven to be extremely flexible in implementing a
variety of resource management solutions. Do you agree?
More or less. The destructor paradigm certainly rates as one of
C++'s successes, and IMHO, beats finally hands down. Which
doesn't mean that finally wouldn't be nice as well. Nothing
wrong with having a choice. (I'd actually like to see a way of
creating "destructors" ad hoc. Something along the lines of:
cleanup { code } ;
, which would basically create an anonymous variable whose
destructor executes the code. Finally is nice when defining a
full blown class would be overly verbose, but I like the idea of
being able to write the finally code near the code which
provoked the need for it.)
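Something close to this can be approximated in the language as it stands with a small guard object holding a lambda. This is a minimal sketch only; Cleanup and makeCleanup are hypothetical names, not a standard facility.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: runs the stored callable when it goes out of scope.
template <typename F>
class Cleanup {
public:
    explicit Cleanup(F f) : f_(std::move(f)), active_(true) {}
    // Moving transfers the duty to run the callable, so it runs exactly once.
    Cleanup(Cleanup&& other)
        : f_(std::move(other.f_)), active_(other.active_) {
        other.active_ = false;
    }
    Cleanup& operator=(Cleanup&&) = delete;
    ~Cleanup() { if (active_) f_(); }
private:
    F f_;
    bool active_;
};

template <typename F>
Cleanup<F> makeCleanup(F f) {
    return Cleanup<F>(std::move(f));
}

// Records the order in which the body and the cleanups execute.
std::vector<std::string> traceCleanupOrder() {
    std::vector<std::string> log;
    {
        auto first = makeCleanup([&log] { log.push_back("first cleanup"); });
        log.push_back("body");
        assert(log.size() == 1);  // no cleanup has run yet
        auto second = makeCleanup([&log] { log.push_back("second cleanup"); });
    }   // guards run here, in reverse order of declaration
    return log;
}
```

The cleanup code sits next to the code that created the need for it, and it runs on every exit path from the scope, including an exception.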
Is it not better to learn the more general more
comprehensive deterministic resource management paradigms
that C++ supports? And to apply them uniformly and
widely?
You need to understand many different types of resource
management (and often transaction management in
general---the problem isn't just resources) if you want to
write correct code. Garbage collection doesn't dumb down
the language, so that idiots can use it. It just means that
an intelligent programmer has fewer lines of code to write.
No more, no less.
Yes it does not "dumb down the language" but it is also not
free. Of course you must know that GC comes with various costs
so it's not worth going into them (yet again).
Well, I'm not sure what costs you're referring to. Most of the
argument against garbage collection seems to be that it will
turn programmers into idiots, which I don't buy. It obviously
means more work for the implementors, but in that respect, it is
nothing compared to two phase lookup for templates. And for the
user, it fits perfectly into the C++ requirement of "you don't
pay for it if you don't use it". While not really required, in
the strictest sense, I'd say that for the implementations I use,
you should count on doubling your heap usage; if your program is
at the limits, then it's something you can't afford to pay for,
regardless of any other advantages, but otherwise... (The issue
is actually more complex than that. The implementation I use
has provisions for allocating non-garbage collected memory as
well, so if you have a program which allocates a couple of very
large arrays of simple types, you can allocate them separately.
And the heap use that doubles is that of objects which are being
allocated and freed; if you allocate a couple of megabytes in objects
that are never freed, you don't have to double that.)
C++ is a multi-paradigm language, usable in many
contexts. If you're writing kernel code, a garbage
collector certainly has no place; nor do exceptions, for
that matter. And if you're implementing a garbage
collector, obviously, you can't use it. But for most
application programs, it's stupid not to.
RAII, RRID, STL containers, automatic variables, value
types, and other software design patterns have served
exceptionally well in eliminating both the need and the
want for GC for me.
Exactly. You've mentioned a number of very useful tools.
Garbage collection is just one more to add to the list.
Sometimes, it will mean that you need to write less code.
Well, I agree with you! It is "just one more" tool and
sometimes it means you write less code. That said it does not
come without cost and those who require it less and value it
less than you are not stupid.
There's a difference. Those who decide in a particular
application that it isn't appropriate aren't stupid. Those who
refuse to consider it, on the other hand, are certainly showing
unreasonable prejudice. As a professional, I have a
responsibility to my clients to provide the best service
possible at the lowest possible cost. Not using a tool which
would result in a more robust program at a lower price would be
a serious violation of professional deontology. And I can't know
whether the tool would result in a more robust program at a
lower price in any particular case unless I consider it with an
open mind.
[...]
Thanks for the continued discussion and your expertise!
I was about to say the same thing. (And don't take any harsh
statements I may have made at the beginning of the discussion
too literally. I like to exercise rhetorical hyperbole, exaggerating
a statement to bring a point home. I never mean it personally,
and I certainly don't think that everyone who doesn't see an
immediate need for garbage collection is stupid.)
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34