Re: Non-virtual destructors & valarray memory allocation

From: "SuperKoko" <tabkannaz@yahoo.fr>
Newsgroups: comp.std.c++
Date: Sat, 10 Jun 2006 09:45:20 CST
Message-ID: <1149933185.767098.156960@j55g2000cwa.googlegroups.com>

Michael Hopkins wrote:

All I want to do is add one further member function to std::vector<T>. We
use loops from 1 to n here for their natural fit to mathematical and
statistical thinking, and will continue to do so. Changing this would have
a catastrophic effect in terms of bug creation and headaches when thinking
about and expressing algorithms and will not happen for that reason.

As far as possible, an existing class should be extended by adding
non-member functions.
Unfortunately, operator() can't be written as a non-member function.
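
For illustration only, here is a minimal sketch of the non-member route,
assuming a named function is acceptable in place of operator(). The names
at1 and sum_1_to_n are hypothetical, not something from the original code:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: unit-offset access as a free function.
// at1(v, 1) refers to v[0]; the caller must ensure 1 <= i <= v.size().
template <class T, class Allocator>
T& at1(std::vector<T, Allocator>& v, std::size_t i)
{
    return v[i - 1];
}

template <class T, class Allocator>
const T& at1(const std::vector<T, Allocator>& v, std::size_t i)
{
    return v[i - 1];
}

// Example of a loop running from 1 to n, as in the mathematical notation.
inline double sum_1_to_n(const std::vector<double>& v)
{
    double total = 0.0;
    for (std::size_t i = 1; i <= v.size(); ++i)
        total += at1(v, i);
    return total;
}
```

This keeps plain std::vector as the only type involved, so the destructor
question never arises, at the price of writing at1(v, i) instead of v(i).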

Luckily, C++ has ample scope for expressing this in a type that we can use
exclusively to solve our problem, so we chose the natural solution - a
member function that apes Fortran behaviour with this.

template <typename T>
class uo_vec : public std::vector<T> // for Unit-Offset vector
{
  public:
    T& operator()( const iter i )
    {
      return std::vector<T>::operator[](i - 1);
    }
  etc..
};

[snip]

Correct me if I'm wrong (I often am with C++), but wouldn't aggregation or
private inheritance (or any other approach) require the writing of endless
forwarding functions?

With private inheritance + using-declarations, you can get it right:

#include <memory>   // std::allocator
#include <vector>

template <class T, class Allocator = std::allocator<T> >
class uo_vec : private std::vector<T, Allocator>
{
  typedef std::vector<T, Allocator> Base;
  public:
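  // Base is a dependent type, so its member types are imported with
  // "using typename".
  using typename Base::reference; using typename Base::const_reference;
  using typename Base::iterator; using typename Base::const_iterator;
  using typename Base::size_type; using typename Base::difference_type;
  using typename Base::value_type; using typename Base::allocator_type;
  using typename Base::pointer; using typename Base::const_pointer;
  using typename Base::reverse_iterator;
  using typename Base::const_reverse_iterator;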

  explicit uo_vec(const Allocator& allocator = Allocator())
    : Base(allocator) {}
  explicit uo_vec(typename uo_vec::size_type n, const T& value = T(),
                  const Allocator& allocator = Allocator())
    : Base(n, value, allocator) {}
  template <class InputIterator>
  uo_vec(InputIterator first, InputIterator last,
         const Allocator& allocator = Allocator())
    : Base(first, last, allocator) {}
  // The compiler-generated copy constructor, copy-assignment operator and
  // destructor are correct.

  using Base::assign;
  using Base::get_allocator;
  using Base::begin; using Base::end; using Base::rbegin; using Base::rend;
  using Base::size; using Base::max_size; using Base::resize;
  using Base::capacity; using Base::empty; using Base::reserve;
  using Base::operator[]; using Base::at;
  using Base::front; using Base::back;
  using Base::push_back; using Base::pop_back;
  using Base::insert; using Base::erase;
  using Base::clear;
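  // Forward to the base class; an unqualified swap(other) here would
  // call this same function and recurse forever.
  void swap(uo_vec& other) { Base::swap(other); }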

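  // Unit-offset access: v(1) is the first element.
  // Base is used rather than std::vector<T>, which is not the base class
  // when a non-default Allocator is supplied.
  T& operator()( const typename Base::size_type i )
  {
    return Base::operator[](i - 1);
  }
  const T& operator()( const typename Base::size_type i ) const
  {
    return Base::operator[](i - 1);
  }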
};

I agree, it isn't beautiful... Furthermore, you won't be able to use
non-member functions that work only on std::vector, but as far as I
know you don't want that anyway, because you fear that someone could
delete your class via a std::vector pointer.
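
As a rough check of that last point, the sketch below compares the two
designs discussed above using reduced stand-in classes (public_uo_vec and
private_uo_vec are illustrative names only). It assumes a C++11 compiler,
since it relies on <type_traits> and static_assert, neither of which is
available in C++98:

```cpp
#include <type_traits>
#include <vector>

// Reduced stand-ins for the two designs (names are illustrative).
template <class T>
class public_uo_vec : public std::vector<T> {};

template <class T>
class private_uo_vec : private std::vector<T> {};

// Public inheritance: a derived pointer converts to std::vector<T>*,
// so outside code can end up deleting through the base pointer.
static_assert(
    std::is_convertible<public_uo_vec<int>*, std::vector<int>*>::value,
    "public base is reachable from outside the class");

// Private inheritance: the same conversion is rejected outside the class,
// so the delete-through-base-pointer accident cannot be written.
static_assert(
    !std::is_convertible<private_uo_vec<int>*, std::vector<int>*>::value,
    "private base is not reachable from outside the class");
```

If both assertions compile, the private version cannot be handed to code
expecting a std::vector<T>*, which is exactly the restriction described.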

Anyway, that is the motivation for what we currently use quite successfully,
but there is always the nagging doubt that something one day will refer to
it by a base class pointer and then.. Bang. I would prefer not to have this
worry.

I think that the threat is smaller than you think.
You have to delete, by yourself, all the vectors you create... You only
need to be careful when an existing function "takes ownership" of a
std::vector that you have to allocate (for instance, through a
std::auto_ptr<std::vector<int> > sink). In that case, you must be
careful to pass only a *true* std::vector. But I think that this kind
of function/method is rare.
Calling "delete" on a type which contains a std::vector is not rare,
but deleting a std::vector through a direct pointer to std::vector is
very unusual.

Note also that, fortunately (or unfortunately), deleting a uo_vec
through a std::vector<T>* causes UB, but this UB is likely, on popular
C++ implementations, to do absolutely no harm (i.e. no memory leak, and
no crash).

It seems to me that some of the (undoubtedly excellent) thinking behind
giving C++ such increased expressive power over C at so little cost
efficiency-wise has introduced quite a few subtle bugs and gotchas - so that
you can become paranoid in case you break some arcane rule. Think of the
number of style guides and 'gotchas' books that have come out - surely more
than all other languages put together!

Yes, C++ is a complex language, full of traps for beginners &
intermediate programmers.

What would the practical downside be in giving std::vector & std::valarray
virtual destructors and deriving from them - two bytes per object?

It depends on the architecture... On IA-32, it is more likely to use 4
bytes per object.
It is not negligible. An empty std::vector is likely to use only 12
bytes... 16 bytes is ... 33% more.
And there can be simpler containers (perhaps not STL containers)...
There can be containers using only 4 bytes... In that case, adding 4
bytes doubles the size of the empty container.
What would be the rationale behind choosing whether a container has a
virtual destructor or not?
Would it be correct to say : std::deque has a virtual destructor
because it is a large complex container, but not boost::array ?
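
To see the per-object cost on a given platform, something like the
following sketch can be used. plain_box and virtual_box are hypothetical
stand-ins with a three-pointer layout, which is only an assumption about
how a typical std::vector is implemented; the actual numbers depend on
the compiler and ABI:

```cpp
#include <cstddef>

// Hypothetical minimal "container": three pointers, no virtual functions.
struct plain_box {
    int* first;
    int* last;
    int* end_of_storage;
    ~plain_box() {}
};

// Same data members, but with a virtual destructor.
struct virtual_box {
    int* first;
    int* last;
    int* end_of_storage;
    virtual ~virtual_box() {}
};

// Extra bytes per object attributable to the virtual destructor
// on the platform this is compiled for.
inline std::size_t vptr_overhead()
{
    return sizeof(virtual_box) - sizeof(plain_box);
}
```

On common implementations the difference is one pointer (the vtable
pointer), but the standard does not guarantee any particular value.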

Doesn't seem like a big disadvantage in these days of Gb of memory and wouldn't the
tradeoff be nice; guaranteed type-safe behaviour in containers.

Not all machines have Gb of memory... And these Gb of memory might be
full of 50000000 tiny or empty vectors... In that case, 4 bytes per
vector will increase the size by approximately 200MB.
C++ must also run on memory-bound machines... With 640KiB of memory,
or less.

Note that, in your example, you only want a virtual destructor in
std::vector for bug prevention (not because you really need it). Do you
think that preventing a bug for a very, very small minority of C++
programmers (those who derive from std::vector) is worth the cost of
adding 4 bytes to std::vector (even if this cost is not huge, it is a
real cost)?

Question number two was about our need to wrap some C libraries that use
vectors with a few extra elements to hold coded information about length,
orientation etc and extract this info with a defined interface - an early
attempt many years ago to 'objectify' linear algebra objects with C code
that has actually worked very effectively.

We are now interfacing to these 'objects' and their algorithms with a type
that uses std::valarray<T>. I would like to be able to use the member
functions such as min(), max() and others but with some information numbers
'tacked on the end' of a single valarray we can't.

This is not a major headache as e.g. looped min() is not always slower than
member function min() in our tests - it's just about seeing how elegant we
can make the solution. If we could guarantee that the _data_content_ of v1
& v2 below were contiguous (which I suspect we cannot) then we could use v1
for the extra information and treat v2 exactly as a data vector. And when
we need to send it to the C-based functions we could use &v1[0].

    class V {
        std::valarray<double> v1, v2;
    };

Even if you could make the v1 data contiguous with v2 (perhaps using
custom allocators), that would be such an ugly design that I would
recommend any other alternative.
Seriously, I would never do that!

I can see two main alternatives:
The first is to keep the extra information as members of V, well
separated from the v2 data.
class V {
/* meaningful members*/
std::valarray<double> v2;
};

And perhaps, one or two member functions generating C-compatible
representations (probably at the cost of heavy memory copies).
Ideally, a class should hide the underlying memory structure... And its
interface should be independent of this underlying data structure.
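
A minimal sketch of this first alternative follows. The C layout assumed
here (n data values followed by two trailing slots for the length and an
orientation code) is purely hypothetical, as are the member names; the
real library's encoding would have to be substituted:

```cpp
#include <cstddef>
#include <valarray>
#include <vector>

// Assumed C layout (hypothetical): n data values followed by two
// trailing slots holding the length and an orientation code.
class V {
public:
    V(std::size_t n, int orientation)
        : orientation_(orientation), data_(0.0, n) {}

    // The pure data, usable with min(), max(), sum() and so on.
    std::valarray<double>& data() { return data_; }
    const std::valarray<double>& data() const { return data_; }
    int orientation() const { return orientation_; }

    // Build the C-compatible buffer on demand; this copies every element.
    std::vector<double> to_c_layout() const
    {
        std::vector<double> out(data_.size() + 2);
        for (std::size_t i = 0; i < data_.size(); ++i)
            out[i] = data_[i];
        out[data_.size()] = static_cast<double>(data_.size());
        out[data_.size() + 1] = static_cast<double>(orientation_);
        return out;
    }

private:
    int orientation_;             // meaningful member, kept out of the data
    std::valarray<double> data_;  // data only, no tacked-on elements
};
```

The C functions would then receive &buffer[0] of the vector returned by
to_c_layout(), while the rest of the code works on data() directly.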

Another alternative is to avoid using std::valarray: simply use a means
that gives you more control over the underlying memory representation...
Perhaps a single big std::vector, or memory allocated with
new[]/delete[].
The downside of this approach is that you won't benefit from
std::valarray member functions (yeah, object-based programming means
that you can't reuse functions without using the data structure).
Fortunately, you can still use STL algorithms.
And if a std::valarray member ever turned out to be so efficient & cool
that you wanted to use it at any cost, you could still convert your data
structure to std::valarray, just for the duration of the operation,
and then copy the data back from the resulting std::valarray to your
well-defined data structure.
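
Both points can be sketched as follows, assuming the data lives in a
std::vector<double>. The function names are hypothetical, and cshift()
merely stands in for whichever std::valarray member is wanted:

```cpp
#include <algorithm>
#include <cstddef>
#include <valarray>
#include <vector>

// Minimum via an STL algorithm, directly on the vector's storage.
// Precondition: v is not empty.
inline double min_with_algorithm(const std::vector<double>& v)
{
    return *std::min_element(v.begin(), v.end());
}

// Borrow a valarray member for one operation, then copy the result back.
inline void rotate_left_via_valarray(std::vector<double>& v, int k)
{
    if (v.empty())
        return;
    std::valarray<double> tmp(&v[0], v.size());   // copy in
    tmp = tmp.cshift(k);                          // the member we wanted
    std::copy(&tmp[0], &tmp[0] + tmp.size(), v.begin());  // copy back
}
```

The round trip costs two full copies, so it only pays off when the
valarray operation is substantially faster than the hand-written loop.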
