Re: Minimizing Dynamic Memory Allocation

From:
James Kanze <james.kanze@gmail.com>
Newsgroups:
comp.lang.c++
Date:
Tue, 27 Jan 2009 02:00:01 -0800 (PST)
Message-ID:
<93f254e4-4b38-498e-8f09-7c854af60b89@z28g2000prd.googlegroups.com>
On Jan 27, 8:50 am, "Alf P. Steinbach" <al...@start.no> wrote:

* James Kanze:

On Jan 24, 10:11 pm, "Alf P. Steinbach" <al...@start.no> wrote:

* ma740988:


    [...]

and I'm having a hard time making it pass section 2.1.
Section 2.1 reads

-------
Section 2.1 Minimizing Dynamic Memory Allocation
Avoid the use of C's malloc and free.


Good advice for general C++, but of course there are no
absolutes, e.g., sometimes some library routine will require
you to 'malloc' some argument or 'free' some result.


Then apply rule 0 (or what should be rule 0 in any reasonable
coding guidelines): any other rule in the guideline can be
violated if there is a justifiable reason for doing so. This
reason must be documented in the code (comment, etc.), and
approved by...

They're called guidelines, and not laws, for a reason. In the
end, if it passes code review, it's OK. Violating a guideline
is a justifiable reason for the reviewers to reject it, but
they aren't required to do so, if they're convinced that the
violation was justified.


Code reviews are less often performed in the real world than
you seem to think.


Probably. But without code reviews, there's no point in having
coding guidelines to begin with.

On the other hand, your idea of a rule 0 in any coding
guideline is a very good one.


It's interesting, because it is one of the first things I read
about coding guidelines.

For if programming could be reduced to a simple set of
mechanical rules, it would have been so reduced long ago.


Yep. And we'd be out of work (or at least paid a lot, lot
less):-).

Things that can be automatically checked (like use of
malloc/free) may require some sort of special comment to
turn the checking off in a particular case.


It seems you're envisioning some automated
code-guideline-conformity checker.


Yes and no. If some of the conformity checking is automated,
then you need some special handling for cases where rule 0 was
applied. (I've actually seen it done.)

Given the level of the OP's firm's new (proposed) guidelines,
that's an entirely unreasonable supposition.


I have no idea what is going on in the OP's organization. But
I've seen some very badly organized places try to use automated
tools to impose bad rules. My point is just that if automated
checking is envisaged, then some provision for rule 0 must be
made.

There's also the point that using such a checker would amount
to giving up on quality and instead focusing on enforcing
unthinking rule conformity, and that kind of conformity is not
a good idea when what you need is intelligent, creative staff.


Some degree of rule conformity is a good idea, although it
certainly doesn't replace an intelligent, creative staff. And
it's always a good idea to automate what you can. What little
experience I've had with automated rule checking has been very
positive, but it was with a very good organization, which
understood the software development process very well. I've no
doubt that it can be seriously misused, however.

Instead use (sparingly) C++'s new and delete operators.


Also good advice, including that word 'sparingly'.


For a definition of 'sparingly' which depends on the
application. I can easily imagine numeric applications
where no explicit use of new or delete is appropriate. And
I've worked on server applications where almost everything
was allocated dynamically, by necessity.


Which does not mean it's a good idea to pepper the code with
'new' calls.

Centralize that functionality.


Agreed. There should normally be few instances of new
expressions, even in applications where most objects are
dynamically allocated.

I'm sure that there are exceptions to even this, of course. My
real point is not that you are wrong, per se, but that the
definition of 'sparingly' will depend on the application.

The real rule is more complex; it implies understanding what
you are doing. Some simple rules can be applied, however:

 -- Never use dynamic allocation if the lifetime corresponds to
    a standard lifetime and the type and size are known at
    compile time.


std::scoped_ptr, bye bye. :-)


If the type and size are not known at compile time, it's
actually a very good solution. If the type and size are known,
and the lifetime corresponds to the scope, why would you
dynamically allocate?

But I'm sure you're thinking of some given application domain
familiar to you.


In this case, I think it's a general rule, applicable to all
applications. If the type and size are known at compile time,
and the lifetime corresponds to standard lifetime (i.e. a
lifetime defined by the language standard), then you should not
allocate dynamically.
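
Roughly (the names here are just made up for the example):

    struct Config { int level; };

    void
    f()
    {
        Config config;      // type, size and lifetime all known: just
                            // declare it; no new, no smart pointer

        // Dynamic allocation here would buy exactly nothing:
        //     boost::scoped_ptr<Config> p(new Config);
    }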

 -- If the object type doesn't require identity, and the
    lifetime does not correspond to a standard lifetime, copy it
    rather than using dynamic allocation, unless the profiler
    says otherwise.
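
Roughly, for this second rule (again, invented names, just to show
the idea):

    #include <string>

    // No identity needed: hand back a copy, not a pointer to a
    // dynamically allocated object that someone then has to delete.
    std::string
    buildGreeting(std::string const& name)
    {
        return "Hello, " + name;
    }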

To avoid memory leaks, a clear understanding of resource
acquisition (allocation and release) is important.


This is true.

To avoid leaks, all classes should include a destructor
that releases any memory allocated by the class' constructor.


This is just dumb advice.


And it contradicts the advice immediately above: "have a clear
understanding of resource acquisition".

Use types that manage their own memory.


Which is exactly what the "advice" you just called dumb said.


Nope.

The "advice" in the proposed guidelines was to define a
destructor in every class.


Where did you see that? I don't see anything about "defining a
destructor". Just ensuring that the class has one (which all
classes do), and that it deletes any memory allocated in the
constructor. The obvious way of achieving that would be for all
of the pointers to memory allocated by the class to be scoped_ptr.

That's dumb on its own, as opposed to using resource-specific
management wrappers such as smart pointers and standard
library container classes. And it's even dumber, downright
catastrophic advice, when the author of the guidelines isn't
familiar with the rule of three and forgets to mention it here.


Hopefully, there would be other rules, concerning when and how
to support copy and assignment. Given the way this rule is
formulated, however, I can understand your doubts about this.
But they would be "other rules", and not necessarily part of
this one.

The opposite of that approach is to use types that manage
their own memory, i.e., the opposite of mindless duplication
and continuous re-invention of resource management is to
centralize it.


Requiring that memory management always be pushed off
into a separate class doesn't seem correct to me. What's
important is to use standard and well understood idioms
(patterns). Thus, if you're using the compilation firewall, the
standard idiom is to provide a non-inline destructor which does
the delete---any attempt to use smart pointers is likely to get
you into problems with instantiating templates over an
incomplete type. (Such problems can be avoided by careful
choice of the pointer type, or by implementing one of your own.
But why bother?)
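
For the record, the classic form looks something like this (a rough
sketch, names invented):

    // widget.hh
    class Widget
    {
    public:
        Widget();
        ~Widget();          // non-inline: Impl is complete in widget.cc,
                            // so the delete there is well defined
    private:
        class Impl;
        Impl*   myImpl;
        Widget(Widget const&);              // rule of three: copy and
        Widget& operator=(Widget const&);   //  assignment suppressed
    };

    // widget.cc
    class Widget::Impl
    {
    public:
        int someState;
        Impl() : someState(0) {}
    };

    Widget::Widget() : myImpl(new Impl) {}
    Widget::~Widget() { delete myImpl; }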

Again: there are three basic reasons to use dynamic allocation:
the size isn't known at compile time, the type isn't known at
compile time, and the lifetime of the object doesn't correspond
to a standard lifetime. The first is (usually, at least) best
handled by using a standard container, e.g. std::vector. The
second is (usually) best handled by a smart pointer (e.g.
boost::scoped_ptr), and the third needs explicit management---it
may be possible to design a smart pointer to handle it in
explicit cases, but not generally.
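
In code, the first two cases come out something like this (the
factory function is invented, purely for the example):

    #include <istream>
    #include <vector>
    #include <boost/scoped_ptr.hpp>

    class Shape { public: virtual ~Shape() {} };
    class Circle : public Shape {};

    // Hypothetical factory: the caller doesn't know the dynamic type.
    Shape* readShape(std::istream&) { return new Circle; }

    void
    process(std::istream& source, int count)
    {
        std::vector<double> samples(count);     // size not known at compile time
        boost::scoped_ptr<Shape> shape(readShape(source));  // type not known
        // the third case (non-standard lifetime) needs explicit management
    }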

Beyond that, there are a few special idioms which require
dynamic allocation for other reasons---things like the
compilation firewall idiom (which is really a variant of not
knowing the exact type, or the details of the type, at least
when compiling client code) or the singleton pattern (but
usually, you're also trying to obtain a non-standard lifetime,
e.g. the destructor is never called---otherwise, you use a local
static). But they're just that, established and well known
patterns.
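
The singleton variant I have in mind is just (a sketch):

    class Registry
    {
    public:
        static Registry& instance()
        {
            // Intentionally never deleted: the object stays usable even
            // during destruction of other objects with static lifetime.
            static Registry* theInstance = new Registry;
            return *theInstance;
        }
    private:
        Registry() {}
        Registry(Registry const&);
        Registry& operator=(Registry const&);
    };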

    [...]

A good general rule is to require any single class to be
responsible for at most one resource. If a class needs several,
it should spawn the management of each off into a separate base
class or member. (There are times when this simplifies life
even when the derived class only needs a single resource. See
the g++ implementation of std::vector, for example.)
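
Very roughly, the structure there is (heavily simplified sketch,
names mine):

    #include <new>
    #include <cstddef>

    // The base class owns the raw memory and nothing else; if
    // constructing the elements in the derived class throws, this
    // destructor still runs and frees the memory.
    template <typename T>
    struct VectorBase
    {
        T*          start;
        std::size_t capacity;

        explicit VectorBase(std::size_t n)
            : start(static_cast<T*>(operator new(n * sizeof(T))))
            , capacity(n)
        {
        }
        ~VectorBase() { operator delete(start); }
    };

    template <typename T>
    class SimpleVector : private VectorBase<T>
    {
        // constructing and destroying the T objects happens here; the
        // memory itself is already the base class' responsibility
    };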


As a general rule I agree with that. Good advice.


Finally:-).

I think our only difference (here, at least) is that you seem to
be suggesting that this class should usually look like a
pointer. Where as I find that the cases where it should look
like a pointer aren't all that frequent.

Finally, always set a 'new_handler' using the built-in
function set_new_handler.


This is good advice.

The default new_handler terminates the program when it
cannot satisfy a memory request.


This is incorrect.

Program termination at a critical time may be disastrous.


And this is just dumb.


More to the point, it doesn't mean anything. And it's not
implementable, at least not reasonably. If the OS crashes,
for example, the program will terminate, or it may terminate
because of a hardware failure. The total system must be so
designed that this isn't "disastrous" (i.e. it doesn't
result in a loss of lives or destruction of property).

The reason you should set a new_handler is in order to
override the default exception throwing behavior and
instead let it terminate the program.

Because that's about the only sane way to deal with memory
exhaustion.
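
I.e. something like (a sketch):

    #include <new>
    #include <cstdlib>
    #include <iostream>

    void
    outOfMemory()
    {
        // Say what little we can and die immediately, instead of letting
        // a bad_alloc unwind through code that was never tested for it.
        std::cerr << "out of memory, aborting" << std::endl;
        std::abort();
    }

    int
    main()
    {
        std::set_new_handler(&outOfMemory);
        // ...
    }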


*That* depends largely on the application. In many cases,
you're right, but certainly not all, and there are
applications which can (and must) recover correctly to
insufficient memory. (It's tricky to get right, of course,
since you also have to protect against things like stack
overflow and buggy systems which tell you you've got memory
when you haven't.)


Hm, I'd like to see any application that, from a modern point
of view, can deal safely with a std::bad_alloc other than by
terminating.


A well written, robust LDAP server. There's absolutely no
reason why it should crash or otherwise terminate just because
some idiot client sends an impossibly complicated request. And
there's really no reason why it should artificially limit the
complexity of the request.

However, if you look up old clc++m threads you'll perhaps find
the long one where I argued with Dave Abrahams that one
reasonable approach, for some applications, is to use "hard"
exceptions in order to clean up upwards in the call stack
before terminating (the problem being that C++ lacks support
for "hard" exceptions, so that the scheme would have to rely
on convention, which is always brittle).

E.g. it might be possible to save an open document to disk
using only finite pre-allocated resources.

But still I think the only sane way to deal with memory
exhaustion is to terminate, whether it happens more or less
directly at the detection point or via some "hard" exception.

Then it may perhaps be possible to do a "bird of Phoenix"
thing, as e.g. the Windows Explorer shell often does, or
perhaps just a reboot. Or one might have three machines
running and voting, as in the space shuttle. Or whatever, but
continuing on when memory has been exhausted is, IMHO, simply
a Very Bad Idea.


What about a server whose requests can contain arbitrary
expressions (e.g. in ASN.1 or XML---both of which support
nesting)? The server parses the requests into a tree; since the
total size and the types of the individual nodes aren't known at
compile time, it must use dynamic allocation. So what happens
when you receive a request with literally billions of nodes? Do
you terminate the server? Do you artificially limit the number
of nodes so that you can't run out of memory? (But the limit
would have to be unreasonably small, since you don't want to
crash if you receive such requests from several different clients,
in different threads.) Or do you catch bad_alloc (and stack
overflow, which requires implementation-specific code), free up
the already allocated nodes, and return an "insufficient
resources" error?

It's not a typical case, and most of the time, I agree with you.
But such cases do exist, and if the OP's organization is dealing
with one, then handling bad_alloc can make sense.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
