Re: C++ equivalent of C VLAs and subsequent issues with offsetof
On 6/1/07 6:54 AM, in article
slrnf5ukln.v23.usenetspam01@fermat.hashpling.org, "Charles Bailey"
<usenetspam01@hashpling.org> wrote:
I have a C++ class which is currently not strictly conforming as it
uses offsetof on a non-POD struct, but I was hoping that someone could
suggest what the best approach to making it conforming would be, without
going completely down the C route or introducing any extra performance
overheads.
The MyBuffer class is not in the least conforming to the C++ Standard - and
its problems have nothing to do with any use of the offsetof macro. The real
problem is that the MyBuffer class declares a 2GB character array as a
member - yet the MyBuffer allocation function allocates only a fraction of
the memory needed.
The class is a shareable buffer implementation which is designed for
use as a private inner class for a string implementation in which the
primary goals are low memory overhead and fast copies, but
modification of strings after construction is not required (i.e. no
append, no resize and no functions returning modifiable iterators or
pointers to the controlled string). The implementation strategy is to
have a shareable unmodifiable buffer with a reference count.
The first is the rather ugly definition of m_str, as a char[MAX_LEN].
The traditional "struct hack" uses char[1], but strictly this causes
undefined behaviour when you access m_str[n] for n > 0 even if the
class has been constructed in place with sufficient memory. It would
be nice if C++ supported C99-style VLAs.
A variable length array (VLA) would not be useful here - even if C++ supported
them. The fact that a MyBuffer immutable string object has to be
shareable means that its lifetime is indeterminate and must therefore be
dynamically-allocated (on the heap). A VLA on the other hand may be
allocated only on the stack (that is, a VLA may be locally declared only
within block scope).
In C++ there is no need to index beyond the declared size of an array,
because a C++ class can easily add dynamic memory management to a
fixed-sized struct in a manner that C cannot. Granted, a C99 variable-length
array (VLA) tries to add dynamic memory allocation facilities to a C data
structure (an array), but the result - a VLA - ends up a "worst of both
worlds" combination. A VLA has practically all of the drawbacks of a
fixed-sized array and yet with few of the benefits of dynamic memory
management.
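To make that concrete, here is a minimal sketch of a fixed-size class that manages a variable-length, reference-counted character buffer without indexing past any declared array bound. The names SharedBuffer, Rep and check_sharing are hypothetical and are not the original poster's MyBuffer. The sketch is not thread-safe, and it pays for its conformance with a second allocation and an extra indirection that the struct hack avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class; illustrative only, not the MyBuffer from the question.
class SharedBuffer
{
public:
    explicit SharedBuffer(const char* s)
        : m_rep(new Rep(s, std::strlen(s))) {}

    // Copying shares the representation and bumps the count.
    SharedBuffer(const SharedBuffer& other) : m_rep(other.m_rep) { ++m_rep->refs; }

    SharedBuffer& operator=(const SharedBuffer& other)
    {
        ++other.m_rep->refs;      // increment first so self-assignment is safe
        release();
        m_rep = other.m_rep;
        return *this;
    }

    ~SharedBuffer() { release(); }

    // Read-only access: no modifiable pointers or iterators are handed out.
    const char* c_str() const { return m_rep->chars; }
    std::size_t length() const { return m_rep->len; }

private:
    struct Rep
    {
        Rep(const char* s, std::size_t n)
            : refs(1), len(n), chars(new char[n + 1])
        {
            std::memcpy(chars, s, n + 1);   // copies the terminating '\0' too
        }
        ~Rep() { delete[] chars; }

        std::size_t refs;   // number of SharedBuffer objects using this Rep
        std::size_t len;    // length excluding the terminator
        char*       chars;  // exactly len + 1 bytes, allocated separately
    };

    void release() { if (--m_rep->refs == 0) delete m_rep; }

    Rep* m_rep;
};

// Small check that copies share storage rather than duplicating it.
void check_sharing()
{
    SharedBuffer a("abc");
    SharedBuffer b = a;
    assert(a.c_str() == b.c_str());
}
```

Every object here has exactly the size that sizeof reports, so pointer arithmetic and arrays of SharedBuffer behave as the language expects.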
The second is the use of offsetof which is technically illegal. To
make the class a POD-struct to allow the use of offsetof I would have
to lose the constructor and make the data members public. Needless to
say, this is very undesirable as it opens the internals to clients and
also runs the risk of clients attempting to allocate the full 2GB
struct.
The fact that MyBuffer's allocation function allocates only a tiny fraction
of the two gigabytes (or so) requested - yet returns a pointer as if the
allocation had actually succeeded - is a far more egregious (and alarming)
example of undefined behavior than practically anything else that the
program could choose to do.
The requirements for a pointer returned by a C++ allocation function are
clear:
"The allocation function attempts to allocate the requested amount of
storage. If it is successful, it shall return the address of the start of a
block of storage whose length in bytes shall be at least as large as the
requested size." [§3.7.3.1/2]
The fact that the allocated size of a MyBuffer class object is not equal to
sizeof(MyBuffer) (in fact the allocation is likely well over 2,000,000,000
bytes short) renders all calculations with MyBuffer pointers (or MyBuffer
indexes) completely wrong. Furthermore allocating a MyBuffer array would
clearly be disastrous to a program that attempted it. In short, any C++
program whose memory allocations are at all at odds with C++'s memory
allocation requirements, operates outside of the defined bounds of the C++
language.
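The size of that shortfall can be worked out without allocating anything. The following is a rough sketch only: BufferLayout is a hypothetical POD stand-in for MyBuffer (so that offsetof is legal), its field names and the MAX_LEN value are assumptions because the original class is not reproduced in this message, and a 64-bit target is assumed so that the struct type is valid at all.

```cpp
#include <cassert>
#include <cstddef>

// Assumed value for the "2GB" array bound mentioned in the discussion.
const std::size_t MAX_LEN = 0x7fffffff;

// Hypothetical POD layout; field names are illustrative.
struct BufferLayout
{
    std::size_t m_refs;
    std::size_t m_len;
    char        m_str[MAX_LEN];
};

// Bytes a struct-hack style allocator would request for n characters.
std::size_t bytes_requested(std::size_t n)
{
    return offsetof(BufferLayout, m_str) + n + 1;   // header + characters + '\0'
}

// How far such an allocation falls short of a complete object.
std::size_t shortfall(std::size_t n)
{
    return sizeof(BufferLayout) - bytes_requested(n);
}

// For a five-character string the allocation is over 2,000,000,000 bytes short.
void check_shortfall()
{
    assert(shortfall(5) > 2000000000u);
}
```

Only the sizes are computed here; no under-sized object is ever created, so the sketch itself stays within defined behavior.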
Any suggestions would be greatly appreciated. The only other solution
that I could think of was to go the C route completely and define a
bunch of functions working on struct pointers for which the struct
is not publicly defined but this is far from the desirable C++ style
that I was attempting to get with my initial C++ class.
C++ is a "higher" level language than C. In other words, many tasks that in
C would require a custom implementation can often in C++ be assembled
instead from library components. In the case of MyBuffer's requirements, I
really see little reason why a C++ program would not use a:
std::tr1::shared_ptr< const std::string>
(or the comparable boost::shared_ptr< const std::string> ) for a shareable
reference-counted immutable string class object and would do so over a
custom MyBuffer class (even one that had well-defined behavior).
After all, the savings in implementation, testing and maintenance costs
realized from using an "off the shelf" library implementation, combined
with a standardized and well-known programming interface, leave hardly
any decision to make at all.
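As a brief sketch of the library-based approach, assuming a C++11 compiler where the smart pointer is available as std::shared_ptr in <memory> (with TR1 or Boost only the header and namespace should differ). The alias SharedString and the helpers make_shared_string and check_shared_string are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical alias for a shareable, reference-counted, immutable string.
typedef std::shared_ptr<const std::string> SharedString;

// Builds the string once; afterwards it can only be read through the pointer.
SharedString make_shared_string(const char* text)
{
    // shared_ptr<std::string> converts implicitly to shared_ptr<const std::string>.
    return std::make_shared<std::string>(text);
}

// Copies share one string object and only adjust a reference count.
void check_shared_string()
{
    SharedString a = make_shared_string("hello");
    SharedString b = a;              // cheap copy: no character data is duplicated
    assert(a.get() == b.get());      // both refer to the same std::string
    assert(a.use_count() == 2);
    // b->append("x");               // would not compile: the string is const
}
```

The string is released automatically when the last SharedString referring to it goes away.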
Greg