Re: array size known/not known in compile time

From:

"James Kanze" <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++.moderated

Date:

23 Nov 2006 13:44:05 -0500

Message-ID:

<1164304947.290236.321070@h54g2000cwb.googlegroups.com>

Alf P. Steinbach wrote:

* James Kanze:

Alf P. Steinbach wrote:

[...]

Stack allocation is generally many orders of magnitude faster than
dynamic allocation, because dynamic allocation must search for suitable
free areas of memory.

That's true on modern architectures, but on most of the machines
I've worked on, "stack" frames were fixed size blocks, allocated
using the same mechanisms that malloc used.

Ouch! :-) On /most/ machines you've worked on stack allocation was as
slow as dynamic allocation? Please elaborate with concrete examples --

IBM mainframes, Siemens mainframes, Texas Instruments 9000, some
small French manufacturers. IBM mainframes are probably the
only important ones today.

Historically speaking, most machines didn't have hardware
support for a stack. (This is why traditional Fortran didn't
allow recursion.) I think that the PDP-11 was one of the
first that did. It caught on:-).

What that meant on performance varied. A function call was
incredibly expensive on the Siemens mainframe implementation of
C I knew. On the other hand, one of the small French machines
had direct hardware support for linked lists, and could set up a
dynamically allocated "stack" frame very quickly---provided its
size was a constant.

[...]

arguments in favor of VLA in the C committee was that it was
easy to implement using malloc/free.

Let's ignore the politics of committees.

Fine with me, but that's not the way to get something
adopted:-).

[...]

Note too that modern dynamic allocation mechanisms have become
quite fast as well.

We're talking orders of magnitude, not just a factor of 2 or something,
for the general case.

We're talking of the allocation repesenting, say, less than 10%
of the time necessary to iterate over the contents. After all,
you're not going to allocate an array and then not access the
elements.

And of course, that most VLA's aren't
allocated on the stack anyway

I think that's very incorrect. Anyway, it must be referring to C
programmers' evil practices.

I'm talking about the reason why VLA was first introduced into C
to begin with. And yes, I think that "C programmers' evil
practices" sums it up pretty well.

[...]

There has been at least one hugely inefficient implementation of C++,
therefore any efficiency argument is invalid.

The efficiency argument applies for one very small subset of the
*proposed* VLA's usage. If you want to make a more limited
proposal, in which VLA's are only legal in the few cases where
they might lead to better performance, by all means do so, and
we can discuss it.

[...]

(2) VLA support at the language level, whatever the form (hopefully
not C99),

Now *that's* an interesting comment. I was definitly under the
impression that we were talking about adopting the C extension
into C++. If you are talking about something else, what exactly
is the proposal?

So far just discussing fixed size array where that size is specified at
run-time, with the understanding that the point is to support efficient
stack allocation under programmer's control.

I agree that a full concrete proposal would be nice, eventually.

Agreed. It's hard to discuss something without knowing what is
actually being discussed. I can see two concrete advantages
linked with VLA's. One is the performance argument: although I
don't think it is as absolute as you originally presented it,
it's probably true that in at least some cases of auto
variables, the compiler will be able to use something like the
old alloca technique, at least on a lot of the more frequently
used machines. (And to give you a point that you haven't
mentionned yet: even when you need to use malloc to implement
it, it's not worse in performance than std::vector.) The other
is the syntax for declaring multi-dimensional arrays---all
performance issues aside, who wants to have to write:
    std::vector< std::vector< std::vector< double > > >
                    v( l, std::vector< std::vector< double > >(
                       m, std::vector< double >( n ) ) ) ;
instead of:
    double v[ l ][ m ][ n ] ;
? (Not to mention the safety---with the std::vector approach,
you can accidentally change the length of one of the rows after
the array has been created.)

Now comes, of course, the fun part. If you allow this sort of
thing for local variables, you really should allow it for class
members and static variables as well. Of course, in such cases
(and in well written C++, I would imagine that class members
would represent the majority of use), you're down to one
argument, because I don't think that you can avoid the malloc in
such cases. (Still, it could be done with one malloc. The
std::vector above is going to use l*m + l + 1 allocations, at
least.)

The second issue which has to be addressed is the effects on the
type system. What happens to such types in overload resolution,
or when instantiating templates? (In the case of std::vector,
these issues resolve themselves indirectly---because in the end,
std::vector is just another user written class.)

For now, in light of the many strong emotions surfacing in this thread
(talking of Camels and their noses and whatnot), and the corresponding
lack of concrete arguments, I think it's a Good Idea to focus on what
can be /easily/ introduced and what advantages and disadvantages that
would have, very concretely.

So, to meet you and others in the quest for something concrete to
discuss, consider this minimum proposal:

   C++ VLA support.

   An automatic storage duration variable can be declared as

      T someName[n];

   or

      T someName[n] = { initializerList };

   where n is a non-constant integral expression, and where the
   declaration can be prefixed with 'auto', if that old general usage of
   'auto' isn't deprecated.

So you are proposing a type which cannot be used except as an
auto variable?

IMHO, the syntax argument is far stronger than the performance
one. If you are really ready to do the work, I'm in favor of
it. (I definitly like one part of the proposal---that for all
external intents and purposes, the new types act pretty much
like a template class would, e.g. that sizeof works like it does
with vector, and returns a constant---which generally has no
meaning, but since you've got a member size()...)

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orient?e objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place S?mard, 78210 St.-Cyr-l'?cole, France, +33 (0)1 30 23 00 34

--
      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated. First time posters: Do this! ]