Re: C++ equivalent of C VLAs and subsequent issues with offsetof
On 2007-06-01, Mathias Gaunard <loufoque@gmail.com> wrote:
Is
struct MyBuffer
{
    int refcount;
    char str[];
};
struct string
{
    MyBuffer* foo;
};
really better than
struct string
{
    int refcount;
    char* str;
};
?
The second form allows you to construct strings from literals for
free.
But does the first one allow better usage of lock-free techniques?
I'm not sure I follow your example. The reference count on the buffer
is supposed to tell you when you can safely free the string; the
string(s) are supposed to (collectively) own the buffer, and the string
class is not designed or required to ever point to string literals.
The reference count has to be attached to the buffer. It can't be
attached to the string, otherwise you just have copies of the
reference count. I've omitted any locking code as the code is for a
single-threaded application. A mutex of some sort could easily be
added to the buffer class to protect the reference count access. As
the buffer is shared but non-modifiable, access to the length and
buffer contents doesn't need to be protected.
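To make the ownership concrete, here is a minimal single-threaded sketch
of that arrangement. The names SharedBuf, Str and use_count are invented
for illustration and are not from my real code, and the characters get
their own allocation here only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical names throughout; single-threaded, no locking.
struct SharedBuf
{
    int refcount;          // lives with the buffer, not with any one string
    std::size_t length;
    char* chars;           // separate allocation, only to keep the sketch short
};

class Str
{
public:
    explicit Str(const char* s)
        : m_buf(new SharedBuf)
    {
        assert(s != 0);
        m_buf->refcount = 1;
        m_buf->length = std::strlen(s);
        m_buf->chars = new char[m_buf->length + 1];
        std::memcpy(m_buf->chars, s, m_buf->length + 1);
    }

    // Copying shares the buffer and bumps the single shared count.
    Str(const Str& other) : m_buf(other.m_buf) { ++m_buf->refcount; }

    Str& operator=(const Str& other)
    {
        ++other.m_buf->refcount;   // increment first so self-assignment is safe
        release();
        m_buf = other.m_buf;
        return *this;
    }

    ~Str() { release(); }

    const char* c_str() const { return m_buf->chars; }
    int use_count() const { return m_buf->refcount; }

private:
    // Drop one reference; the last owner frees the buffer.
    void release()
    {
        if (--m_buf->refcount == 0)
        {
            delete[] m_buf->chars;
            delete m_buf;
        }
    }

    SharedBuf* m_buf;
};
```

Every copy of a Str sees the same count because the count sits in the
buffer; putting it in Str itself would give each copy its own private
number.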
Two 'natural' alternatives to my original implementation are:
class MyBuffer
{
    // constructors / destructors left out
    int refcount;
    // Allocated dynamically on buffer construction, deallocated in
    // the buffer destructor
    char* realbuffer;
};
and
class string
{
    // refcount and str are allocated and copied in parallel, all
    // methods required to keep them in sync
    int* prefcount;
    char* str;
};
Both of these double the number of allocation calls required when
creating a buffer. You either need one for the struct and one for the
char* in the first case, or one for the reference count and one for
the char* in the second. This did actually have a measurable and
significant performance cost in testing, so I was reluctant to ditch
the original design.
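For comparison, a single allocation can hold both the bookkeeping and the
characters. The following is only a sketch under my own assumptions:
BufHeader, create_buffer, buffer_chars and destroy_buffer are hypothetical
names, and it places the characters directly after the header instead of
using an array member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical header type; the characters are stored directly after it.
struct BufHeader
{
    int refcount;
    std::size_t length;
};

// One call to ::operator new covers the header and the characters.
inline BufHeader* create_buffer(const char* init, std::size_t len)
{
    assert(init != 0 || len == 0);
    void* raw = ::operator new(sizeof(BufHeader) + len + 1);
    BufHeader* h = new (raw) BufHeader;   // construct the header in place
    h->refcount = 1;
    h->length = len;
    char* chars = static_cast<char*>(raw) + sizeof(BufHeader);
    if (len != 0)
    {
        std::memcpy(chars, init, len);
    }
    chars[len] = '\0';
    return h;
}

// The characters start immediately after the header.
inline const char* buffer_chars(const BufHeader* h)
{
    return reinterpret_cast<const char*>(h) + sizeof(BufHeader);
}

// One deallocation to match the one allocation.
inline void destroy_buffer(BufHeader* h)
{
    ::operator delete(h);   // BufHeader has a trivial destructor
}
```

This costs one allocation per buffer where either alternative above costs
two, at the price of doing the pointer arithmetic by hand.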
The first is the rather ugly definition of m_str, as a char[MAX_LEN].
The traditional "struct hack" uses char[1], but strictly this causes
undefined behaviour when you access m_str[n] for n > 0 even if the
class has been constructed in place with sufficient memory. It would
be nice if C++ supported C90 style VLAs.
C99, you mean.
char[1] seems fine to me.
Yes, I spotted my C90 typo (or thoughto) after I posted it, but with
the moderation delay I didn't think it worth correcting myself until I
was replying later.
If you use char[1] you definitely get UB according to my reading of
the C++ standard (2003), whether or not you allocate the memory
correctly. I'm not a proficient standard interpreter though, so could
well be wrong. char[MAX_LEN] is designed to avoid this but causes gdb
to crash rather spectacularly when trying to examine the buffer.
your struct only exists as an implementation detail.
As such, you should put it in a "detail" namespace for example,
telling users not to touch it.
Boost libraries do that a lot, for example.
I agree, in "real life" it is a private inner (or nested) class of the
"full" string implementation.
Thanks for your help, I have now thought of a new design which solves
the offsetof issue, while maintaining most of the class features of
the original design. It elicits no warning from gcc, but I'm not yet
sure that it is strictly conforming.
I'm relying on the fact that deriving an 'almost' POD from a
POD-struct will not attempt to add any secret data structures at the
end of the class. I have no extra data members and no virtual
functions so this should be the case, but I'm not sure whether this
might be implementation defined behaviour. Because a derived class
pointer can be converted to a base class pointer I am, though,
guaranteed that the layout of the common members in the derived class
will be identical to that of the base class.
I've snipped all the implementations that are identical to the
previous implementation in a futile gesture of brevity.
// mybuffer2.h
#ifndef MYBUFFER2_H
#define MYBUFFER2_H
#include <cstring>
namespace MyBuffer2PrivateImpl
{
    struct MyPODBuffer
    {
        static const std::size_t MAX_LEN = 0x7ffffff0u;
        int m_refcount;
        std::size_t m_length;
        char m_str[MAX_LEN];
    };
}
class MyBuffer2 : private MyBuffer2PrivateImpl::MyPODBuffer
{
public:
    static MyBuffer2* Create(const char* init, std::size_t len);
    void operator delete(void* p);
    const char* c_str() const { return m_str; }
    std::size_t length() const { return m_length; }
    void addref() { ++m_refcount; }
    int remref() { return --m_refcount; }
private:
    // Private constructor, must only be constructed
    // via the Create method.
    MyBuffer2(const char* str, std::size_t len)
    {
        m_refcount = 1;
        m_length = len;
        if (str != 0 && len != 0)
        {
            std::memcpy(m_str, str, len);
        }
    }
    void* operator new(std::size_t, std::size_t slen);
    void operator delete(void* p, std::size_t);
};
#endif//MYBUFFER2_H
// end of file
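The point of the POD base is that offsetof is only specified for POD
types in C++03, so it can be applied to the base even though the derived
class has a constructor. As a rough illustration (not the snipped
implementation; sketch::PodBuffer and alloc_size are invented names), the
size to request for a given string length might be computed like this,
assuming one extra byte for a terminating '\0':

```cpp
#include <cassert>
#include <cstddef>

namespace sketch   // hypothetical names, not the code from the header
{
    // Same shape as MyPODBuffer: a POD-struct, so offsetof is well defined.
    struct PodBuffer
    {
        static const std::size_t MAX_LEN = 0x7ffffff0u;
        int m_refcount;
        std::size_t m_length;
        char m_str[MAX_LEN];
    };

    // Bytes to request for slen characters plus a terminating '\0':
    // everything up to m_str, then only the characters actually used.
    inline std::size_t alloc_size(std::size_t slen)
    {
        assert(slen < PodBuffer::MAX_LEN);
        return offsetof(PodBuffer, m_str) + slen + 1;
    }
}
```

The result is far smaller than sizeof(PodBuffer), which is exactly why I
am unsure whether constructing the object in such a block is strictly
conforming.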