Re: array size known/not known in compile time

From:

"Alf P. Steinbach" <alfps@start.no>

Newsgroups:

comp.lang.c++.moderated

Date:

22 Nov 2006 22:33:05 -0500

Message-ID:

<4skdnoF10e32tU1@mid.individual.net>

[First, let me just state that I don't have the time to follow up on all
or even most replies in this thread, and that this single article should
ideally be split in a number of sub-threads or new threads: I haven't
time for that either]

* James Kanze:

Alf P. Steinbach wrote:

[...]

Stack allocation is generally many orders of magnitude faster than
dynamic allocation, because dynamic allocation must search for suitable
free areas of memory.

That's true on modern architectures, but on most of the machines
I've worked on, "stack" frames were fixed size blocks, allocated
using the same mechanisms that malloc used.

Ouch! :-) On /most/ machines you've worked on stack allocation was as
slow as dynamic allocation? Please elaborate with concrete examples --
not that it probably matters wrt. VLAs today, or is even strictly
on-topic in clc++m, but that's what I call an INTERESTING statement!

One of the
arguments in favor of VLA in the C committee was that it was
easy to implement using malloc/free.

Let's ignore the politics of committees. When they've run out of real
arguments, which sometimes happens quickly, the discussion degenerates
to academic vicarious points far removed from reality, instead founded
on who supports whom (alliances) and so on. Yes, it's not ungood that
VLAs can be implemented via dynamic allocation, but it's not relevant to
the introduction of VLAs in C++ other than as an academic argument.

On the other hand, associated, it's Very Good that VLAs can be
/specified/ in terms of dynamic allocation.

Of course that doesn't mean that VLAs (ideally) should be /implemented/
that way -- the whole point being that they allow a generally much
more efficient implementation.

Note too that modern dynamic allocation mechanisms have become
quite fast as well.

We're talking orders of magnitude, not just a factor of 2 or something,
for the general case.

And of course, that most VLA's aren't
allocated on the stack anyway

I think that's very incorrect. Anyway, it must be referring to C
programmers' evil practices. We're talking about future C++.

---the main motivation for VLA's in
C is to provide a legal alternative to the dirty struct hack.

Again, let's forget about the C committee and their possible motivations
founded on C programming requirements and (probably) internal politics
-- which is not to say anything ill of the C committe(s) members, who
probably do a very good job, for some of them perhaps unpaid, praise be
upon them, but politics is just part of the way committees work, and not
much is accomplished without it, so probably it's there.

The C++ standard doesn't say anything about a stack in the firts place.

Oh, it does. I gave you one reference where a stack was implied. And
that implied stack is referred to elsewhere in the standard as a "stack"
in the phrase "stack unwinding" -- the terminology here is well
established and entrenched, and I don't think any attempt to eradicate
this reality-centered terminology on the grounds that the logical stack
behavior might in theory be implemented in other ways, will succeed.

Function frames implicitly form a stack in C and in C++. The
standard says nothing about how this stack is actually
implemented, and while most implementations take the simples
(and least robust) approach, I've used at least two
implementations where the stack was a linked list of blocks,
allocated via malloc (or a malloc-like mechanism) at function
start-up. One of the two was horribly slow, but the other
actually held up pretty well. (Of course, the architecture had
direct hardware support for linked lists, and no machine stack,
nor any extra registers to facilitate simulating one.)

You're saying there has been at least one hugely inefficient
implementation of the C++ automatic storage stack.

And that therefore, presumably, if this statement is to be meaningful,
an efficiency argument in favor of VLAs is invalid.

Well, I don't buy that.

There has been at least one hugely inefficient implementation of C++,
therefore any efficiency argument is invalid.

Bah.

[...]

(2) VLA support at the language level, whatever the form (hopefully
not C99),

Now *that's* an interesting comment. I was definitly under the
impression that we were talking about adopting the C extension
into C++. If you are talking about something else, what exactly
is the proposal?

So far just discussing fixed size array where that size is specified at
run-time, with the understanding that the point is to support efficient
stack allocation under programmer's control.

I agree that a full concrete proposal would be nice, eventually.

For now, in light of the many strong emotions surfacing in this thread
(talking of Camels and their noses and whatnot), and the corresponding
lack of concrete arguments, I think it's a Good Idea to focus on what
can be /easily/ introduced and what advantages and disadvantages that
would have, very concretely.

So, to meet you and others in the quest for something concrete to
discuss, consider this minimum proposal:

   C++ VLA support.

   An automatic storage duration variable can be declared as

      T someName[n];

   or

      T someName[n] = { initializerList };

   where n is a non-constant integral expression, and where the
   declaration can be prefixed with 'auto', if that old general usage of
   'auto' isn't deprecated.

   The first form would be /equivalent/, except for efficiency, to

      dynamic_size_array<T> const someName( n );

   if dynamic_size_array constructors were accessible (which they
   aren't), and the second form would correspondingly be /equivalent/,
   except for efficiency and user code access, to

      static T const __values[] = { initializerList };
      static size_t const __nValues = ...;
      dynamic_size_array<T> const someName(
          n,
          __values,
          __values + __nValues
          );

   where __values and __nValues denote unique generated names, and

      namespace std
      {
          class bad_array_size: public runtime_error{ ... };

          template< typename T >
          class dynamic_size_array
          {
          private:
              T* p_;
              size_t n_;

              static void* alloc( size_t nBytes ) throw();
              static void dealloc( void* p ) throw();

              static size_t validN( size_t n, size_t min )
              {
                  if( n < min ) { throw bad_array_size; }
                  // And if size_t range exceeded for nBytes then throw.
                  return n;
              }

              dynamic_size_array( dynamic_size_array const& ); // None.
              dynamic_size_array& operator=( dynamic_size_array const& );

              dynamic_size_array( size_t n )
                  : n_( validN( n, n ) )
                  ...
              { ... }

              dynamic_size_array( size_t n, T* first, T* end )
                  : n_( validN( n, end - first ) )
                  ...
              { ... }

          public:
              size_t size() const { return n_; }
              operator T*() const { return p_; }

              T* begin() const { return p_; }
              T* end() const { return p_ + n_; }

              // Plus whatever, but calling alloc in constructors and
              // dealloc in the destructor, and making no constructors
              // available, and where alloc guarantees to either succeed,
              // or invoke UB for the case where an automatic POD
              // variable of the same byte size (possibly plus a
              // specified extra) would have invoked UB.
          };
      }

    where 'alloc' is called on construction to allocate at least
    n_*sizeof(T) bytes, the resulting pointer placed in p_, and where
    the constructor(s) perform in-place construction of n_ T objects
    in the allocated memory, and the destructor performs ditto in-place
    destruction, in opposite order of the construction.

One potential problem I see with this kind of spec is the interaction
with local classes: with the current language this spec forbids DSAs
(hey, a new acronym!) of local class element type, because local classes
can't be used as template parameters.

Another problem relating to the firm of the spec is that it seems it
doesn't allow an empty initializer list. That can probably be fixed. ;-)

Of course it breaks away from C99 VLAs, which is intentional, especially
for the result of sizeof, but perhaps as a bonus it may make it possible
to compile a lot of "reasonable" C99 code, which is the reason for the
conventional-array-like declaration syntax instead of a class name.

would remove the practical need for non-standard and unsafe
stack allocation functions such as alloca,

The reason why alloca is non-standard and unsafe is that it
isn't present everywhere,

You mean, the reason that alloca isn't present everywhere is that it
isn't standard. And a good thing is that. It's a dangerous and very
low-level construct: much better to have VLA (or DSAs!) standard!

and that in fact, it cannot be easily
implemented on many architectures.

Let's not debate esoteric, archaic architectures and the subjective
hardship of implementing anything on those architectures, like modern
C++ on the ZX80 or the ENIAC; I mean, that's doubly irrelevant.

thus improving C++ code
quality, reducing development time, and increasing portability. (3) Not
my argument, but some argue that for multidimensional arrays it's
somehow better than std::vector in at least some cases -- I don't know
what.

If nothing else, the syntax for declaring them:-). But then, of
course, you wouldn't want to limit them to local variables.

I would, yes, and more, I'd limit them to non-static local variables,
i.e. automatic variables, because stack allocation (the point of this,
in C++) is incompatible with deferred initialization of statics.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]