Re: compilers, endianness and padding
On Tue, 14 May 2013 15:08:22 CST
Bart van Ingen Schenau <bart@ingen.ddns.info.invalid> wrote:
Trees are not that difficult to serialize. How about a slightly more
complex structure:
class X {
struct t {
size_t a;
char* b;
};
As I mentioned elsewhere, it's necessary in the general case for the
compiler to provide the extent as well as the value of a pointer. IOW
sizeof(X::t::b) == sizeof(char*)
char s[10];
X::t v;
v.b = s;
extentof(v.b) == 10;  // extentof is the proposed operator, not existing C++
Every pointer -- static, free store, or automatic -- always has some
number of bytes allocated to it. (That number might be zero.) The
language deficiency is that it does not make that information
available to the programmer. Instead, it requires the programmer to
track it independently and duplicatively. And often, it might be
noted, incorrectly.
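To make the duplication concrete, here is a minimal sketch of what the programmer has to write by hand today. The names extent_ptr and extentof are hypothetical stand-ins of mine, not anything the language or the standard library provides.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical: a pointer bundled with the number of bytes allocated to it.
// The programmer, not the compiler, is responsible for keeping them in sync.
struct extent_ptr {
    char*       p;       // the value of the pointer
    std::size_t extent;  // bytes allocated to it (may be zero)

    // Bounds-checked access, possible only because the extent travels along.
    char& at(std::size_t i) const {
        if (i >= extent) throw std::out_of_range("extent_ptr::at");
        return p[i];
    }
};

// What extentof(p) would report if the language supplied it.
inline std::size_t extentof(const extent_ptr& e) { return e.extent; }
```

Given char s[10], writing extent_ptr b{s, sizeof s} makes extentof(b) report 10, but only because the 10 was supplied a second time by hand. That is exactly the bookkeeping I am complaining about.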
Someone will object that keeping track of the size of memory allocated
to a pointer will add 8 bytes to every pointer. Not true! Remember,
every time you say
const char *s = "hello";
the compiler sets aside those 6 bytes and places the next variable
*after* them. Change it just a little
char s[] = "hello";
and suddenly sizeof(s) works. Yet the pointer is the same size. Move
to the heap
char *s = static_cast<char*>(malloc(6));
and the heap must do as the compiler does, setting aside 6 bytes. I'm
simply pointing out that the language could expose that fact with
extentof(s);
at *no* cost. Not just a little: none. The information is already
there, in the executable image, or on the stack, or in the free store.
What's missing is a bit of syntax.
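A short sketch of the array case, showing that the compiler already knows the extent until the array decays to a pointer. The helper name extent_of is mine, purely for illustration.

```cpp
#include <cstddef>

// The extent of an array is part of its type, so the compiler can report it.
template <typename T, std::size_t N>
constexpr std::size_t extent_of(T (&)[N]) {
    return N;
}

char hello[] = "hello";  // 5 characters plus the terminating '\0'
static_assert(sizeof(hello) == 6, "the array type carries its extent");
static_assert(extent_of(hello) == 6, "deduced from the array reference");

// After decay to a pointer the extent is gone from the type:
// sizeof(p) is the size of the pointer itself, not of what it points to.
const char* p = hello;
static_assert(sizeof(p) == sizeof(const char*), "pointer size only");
```

Nothing here needs run-time storage for the extent; the static_asserts are all settled at compile time.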
size_t c;
union {
char d[sizeof(t)];
t e;
} f;
};
At first glance, this seems no problem at all, insofar as sizeof(f) is
known at compile time. The problem I think you're alluding to is that
two different compilers might arrange f differently, and nothing about
the bit pattern of the union tells us what to do.
My answer is once again simple, though it comes at a trivial cost. It
must be possible to know which member of f was last written. Why?
Because if f.e was written, serialization demands that its endianness
be honored. One might hope, though, that this sort of malarkey might
fade into history if endianness were dealt with in the language proper.
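As a rough sketch of what "knowing which member was last written" looks like when done by hand: the wrapper tagged_f, its tag, and the choice of big-endian 8-byte output are all assumptions of mine, not anything the language provides.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct t {
    std::size_t a;
    char*       b;
};

// Hypothetical wrapper: the union from the example plus a tag recording
// which member was written last. Keeping the tag current is manual.
struct tagged_f {
    enum class member { d, e } last;
    union {
        char d[sizeof(t)];
        t    e;
    } f;
};

// Append v in big-endian order, always as 8 bytes, so the output does not
// depend on the host's byte order or on sizeof(std::size_t).
inline void put_u64_be(std::vector<unsigned char>& out, std::uint64_t v) {
    for (int shift = 56; shift >= 0; shift -= 8)
        out.push_back(static_cast<unsigned char>(v >> shift));
}

// Serialize according to the tag. Raw bytes are copied as-is; the integer
// member gets a fixed byte order. The pointer e.b is deliberately skipped:
// writing out what it points to needs the extent discussed earlier.
inline std::vector<unsigned char> serialize(const tagged_f& x) {
    std::vector<unsigned char> out;
    out.push_back(x.last == tagged_f::member::d ? 0 : 1);
    if (x.last == tagged_f::member::d) {
        for (char c : x.f.d) out.push_back(static_cast<unsigned char>(c));
    } else {
        put_u64_be(out, x.f.e.a);
    }
    return out;
}
```

The cost is one tag per union, plus the discipline of updating it on every write.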
--jkl
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]