Re: garbage collection and latency

From: "kanze" <kanze@gabi-soft.fr>
Newsgroups: comp.lang.c++.moderated
Date: 25 Jul 2006 07:47:36 -0400
Message-ID: <1153818789.922144.95820@m73g2000cwd.googlegroups.com>

Rupert Kittinger wrote:

kanze wrote:

Rupert Kittinger wrote:

I have been following the ongoing thread about garbage
collection, but there is one point missing from the
discussion: how will garbage collection affect the
usability of C++ for realtime applications?

In the case of hard real time, it depends. You have all of
the same problems you have with dynamic memory allocation in
general. Depending on the collector used, they may be
worse---for the most widely used collectors, compared with
the most widely used allocators, and considering more or
less "typical" uses, garbage collectors will probably show
more latency, but there exist real-time collectors as well,
with a guaranteed maximum latency.

Can you provide some links for bounded-latency collectors?

Not directly; I've never needed it myself. But googling for
"real time garbage collection" turns up a number of hits, and
Baker's collector is cited in the garbage collection FAQ (but
without a direct link).

In my experience, modern memory allocators have very low
latency if the allocation patterns are stable, and can be
used in "soft" realtime applications without problems. How
about existing GC implementations?

For soft realtime applications, they typically pose no
problem. In fact, if you arrange to explicitly trigger
garbage collection at non-critical moments, they will often
react better than the usual memory allocators.
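
As a rough sketch of what "explicitly trigger garbage collection at
non-critical moments" can look like, the event loop below runs all
pending tasks first and only asks for a collection once the queue is
empty. Everything here is hypothetical and for illustration only:
StubCollector stands in for a real collector (its collect() merely
counts calls), and EventLoop, note_allocation and the byte threshold
are invented names. With the Boehm collector, the call at that point
would be GC_gcollect().

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a real collector: it only counts calls.
struct StubCollector {
    std::size_t bytes_since_collect = 0;
    int collections = 0;

    void note_allocation(std::size_t n) { bytes_since_collect += n; }
    void collect() { ++collections; bytes_since_collect = 0; }
};

class EventLoop {
public:
    EventLoop(StubCollector& gc, std::size_t threshold)
        : gc_(gc), threshold_(threshold) {}

    void post(std::function<void()> task) { queue_.push_back(std::move(task)); }

    // Drain all pending work, then collect only if enough has been
    // allocated since the last collection.
    void run_until_idle() {
        while (!queue_.empty()) {
            std::function<void()> task = std::move(queue_.front());
            queue_.pop_front();
            task();  // latency-sensitive work: no collection in here
        }
        if (gc_.bytes_since_collect >= threshold_) {
            gc_.collect();  // the queue is empty: a non-critical moment
        }
    }

private:
    StubCollector& gc_;
    std::size_t threshold_;
    std::deque<std::function<void()>> queue_;
};
```

The threshold keeps the loop from collecting on every idle period; a
real application would tune it, or replace it with whatever the
collector in use reports about heap growth.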

I am mainly interested in real-world experience, can anybody
provide hard numbers?

It's difficult to provide hard numbers given the wide range
of equipment, implementations and algorithms in use---both
for garbage collection and explicit memory management. In
practice, a number of real interactive applications use
garbage collection, without noticeable pauses. (Emacs comes
to mind.)

In practice, if you have an application which allocates
large amounts of memory, then frees it, you will have to
consider latency issues. Both with garbage collection and
with manual management.

There are several reasons that would make me uncomfortable
using GC, even in a soft realtime setting. Maybe some of them
are not justified :-)

- with explicit memory management, sources of latency can be
controlled more easily (e.g. with profilers), because they
are, well, explicit. Generally speaking, I suppose measuring
GC latency will be more difficult. That's why I am interested
to hear from people who did this.

The issue is that neither typically gives a guaranteed latency.
Obviously, if you have concrete experience with one, including
solving latency problems, and not with the other, you'll know
better how to manage and avoid problems in the one you are
familiar with. If things normally "just work", without your
having to do anything particular, there's a good chance that the
same thing will apply with garbage collection---your constraints
are soft enough so that there won't be a problem. If you
occasionally have to tune something with explicit management,
you might also have to do so with garbage collection, and there
will be a learning curve in doing so.

- with explicit memory management, you can use different
threads for tasks with different latency requirements. All
garbage collectors I know of will stop all threads during
collection.

That's true of most collectors, but not all. But it's probably
true that it is easier to control which thread does most of the
work with manual management.

The usual trick with garbage collection isn't to hand it off
to a different thread; the usual trick is to arrange for it to
run when you have time available, or when there is a minimum of
memory in actual use.

- with GC, I would expect latency to be a function of the
whole memory footprint of the program. Basically, I think GC
is introducing some kind of coupling between all the memory,
starting with thread-specific storage and ending with mapped
files or shared memory, and this gives me a "bad feeling" with
respect to latency.

Sort of. Again, it depends on the algorithm, and incremental
collectors are very good at only looking at the most active part
of the memory. Normally, too, you can declare parts of memory
as not having pointers, and the garbage collector won't bother
looking at them. (This is very useful if you have large
bitmaps, and such. Especially if you are using a conservative
collector.) Typical collectors seem to be much faster than the
normal malloc when allocating, and the time they need for
collecting seems to depend mostly on the amount of memory in use
at that time. Triggering the collection at moments when very
little memory is in use, or when you would otherwise be waiting,
are two strategies which can usually reduce latency to
practically nothing.
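
To make the two points above concrete (pointer-free memory is never
scanned, and marking work follows the memory in use rather than the
memory ever allocated), here is a toy mark-and-sweep heap. It is a
sketch for illustration only, not how any production collector is
implemented: Block, ToyHeap and the "bytes scanned" measure are
invented for the example, and the count is only a crude proxy for
pause time. In the Boehm collector, the real counterpart of the
pointer_free flag is allocating with GC_malloc_atomic.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// One heap block in the toy model.
struct Block {
    Block(std::size_t bytes, bool no_pointers)
        : payload_bytes(bytes), pointer_free(no_pointers) {}

    std::vector<Block*> refs;   // outgoing pointers (unused if pointer_free)
    std::size_t payload_bytes;  // size of the data the block carries
    bool pointer_free;          // true: the marker never looks inside
    bool marked = false;
};

class ToyHeap {
public:
    Block* allocate(std::size_t bytes, bool pointer_free) {
        blocks_.push_back(std::make_unique<Block>(bytes, pointer_free));
        return blocks_.back().get();
    }

    // Mark everything reachable from the roots, then sweep the rest.
    // Returns the number of payload bytes the marker had to scan.
    std::size_t collect(const std::vector<Block*>& roots) {
        std::size_t scanned = 0;
        std::vector<Block*> pending(roots.begin(), roots.end());
        while (!pending.empty()) {
            Block* b = pending.back();
            pending.pop_back();
            if (b == nullptr || b->marked) continue;
            b->marked = true;
            if (b->pointer_free) continue;  // e.g. a bitmap: nothing to trace
            scanned += b->payload_bytes;
            for (Block* child : b->refs) pending.push_back(child);
        }
        // Sweep: keep marked blocks (clearing the mark), free the others.
        std::vector<std::unique_ptr<Block>> survivors;
        for (auto& b : blocks_) {
            if (b->marked) {
                b->marked = false;
                survivors.push_back(std::move(b));
            }
        }
        blocks_ = std::move(survivors);
        return scanned;
    }

    std::size_t live_blocks() const { return blocks_.size(); }

private:
    std::vector<std::unique_ptr<Block>> blocks_;
};
```

In this model a large reachable bitmap flagged pointer_free adds
nothing to the scanned count, and a collection run when almost
nothing is reachable scans almost nothing. The sweep here still walks
every block; real collectors handle that step in their own ways.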

Still, most of the applications I've done using garbage
collection (including applications in other languages) have
simply not required anything special. They just worked, with no
perceived pauses. Whether this is because the garbage collector
we used didn't have a large latency, or because the latency got
lost in a lot of other variable time aspects, I don't know.
(All of the applications communicated over a network, which is
another source of very variable latencies.)

The first application I used which used garbage collection was
emacs, back in 1992. And every so often, it would visibly
pause, displaying a message "garbage collecting". It's been a
long time, so I don't remember how long the pause was, but it
was definitely irritating, at least to me. Today, most of the
applications I use use garbage collection somewhere: it's
ubiquitous on the network (where a lot of server code is written
in PHP, Perl or Java), for example. Even emacs still uses
garbage collection. And I can't think of a case where the
latency was noticeable. Whether this is because garbage
collection has improved, or because hardware has improved
(although I still use emacs on some pretty old machines), or
simply because we've become more tolerant (because let's face
it, the Internet is horrible in this respect, and we all use it
anyway), I don't know. Probably a combination of all three.

It is, at any rate, an issue that has to be considered, and
if low latency is important in your application, you should
make some tests before trying to convert the entire
application.
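
One possible shape for such a test is a small harness that times a
representative operation many times and reports the worst case
alongside the median, since for latency the outliers matter more than
the average. This is only a sketch under assumptions: LatencyStats
and measure_latency are names invented here, and the operation to
time (an allocation-heavy step of your own application, run once with
the current allocator and once with the collector) is left to the
caller.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

struct LatencyStats {
    std::size_t samples = 0;
    std::chrono::nanoseconds worst{0};
    std::chrono::nanoseconds median{0};
};

// Calls op() `iterations` times, timing each call with a monotonic
// clock, and returns the worst and median observed durations.
template <typename Op>
LatencyStats measure_latency(Op op, std::size_t iterations) {
    using clock = std::chrono::steady_clock;
    std::vector<std::chrono::nanoseconds> times;
    times.reserve(iterations);

    for (std::size_t i = 0; i < iterations; ++i) {
        const auto start = clock::now();
        op();
        const auto stop = clock::now();
        times.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start));
    }

    LatencyStats stats;
    stats.samples = times.size();
    if (!times.empty()) {
        std::sort(times.begin(), times.end());
        stats.worst = times.back();
        stats.median = times[times.size() / 2];
    }
    return stats;
}
```

A call might look like measure_latency([] { std::vector<int> v(1000, 1); }, 10000).
The numbers are only meaningful on the target hardware and under a
realistic load, and the clock's own resolution puts a floor under
what can be observed.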

I do not intend to use GC any time soon (and I would have a
hard time selling it to my company). Still it is always good
to know the options that are available :-)

Like any other new technology (new to you or your company), its
introduction should be planned. A critical application with a
very hard, immovable delivery date is not the best place to
start experimenting or learning new technologies. Expect
problems at first, just as you would with any other new
technology.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
