Re: The C++ Object Model: Good? Bad? Ugly?

From:

tonytech08 <tonytech08@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Sat, 8 Nov 2008 04:08:26 -0800 (PST)

Message-ID:

<33ae85c9-7d50-428e-a45b-434669762cca@v22g2000pro.googlegroups.com>

On Nov 8, 3:12 am, James Kanze <james.ka...@gmail.com> wrote:

On Nov 7, 10:20 pm, tonytech08 <tonytec...@gmail.com> wrote:

On Nov 7, 4:40 am, James Kanze <james.ka...@gmail.com> wrote:

On Nov 6, 11:29 pm, tonytech08 <tonytec...@gmail.com> wrote:

[...]

Thanks for reiterating my thought: C++ has more support
for OO with "full OO type objects".

More support than what.

More support for OO with "heavyweight" classes than for POD
classes.

You're not making sense. How does C++ have more support for OO
than for other idioms?

Why are you asking that when I said nothing of the sort? I said that
once you put a vptr into the data portion of an object (for example),
it's a different animal than a class without a vptr (for example!). I
distinguished these fundamentally different animals by calling them
"heavyweight" and "lightweight" classes/objects (and, apparently
wrongly, POD classes).

Moreover, I was concerned that other things one can do with a class,
such as defining overloaded constructors, may make code fragile
against some future or other current implementation of the language.
Who's to say (not me) that someone won't make a compiler that tacks on
or into a class object some other "hidden" ptr or something to
implement "overloaded constructors"? I don't care if code is generated
but I do care if the compiler starts aberrating the data portion.

C++ has support for "full OO type objects", if that's what
you need. Most of my objects aren't "full OO type objects",
in the sense that they don't support polymorphism. C++
supports them just as well.

I think I may be OK without polymorphism in "lightweight"
classes, but overloaded constructors sure would be nice. And
conversion operators. Can a POD class derive from a pure
abstract base class? That would be nice also if not.

And C++ supports all of that.

But am I guaranteed that my class will stay lightweight if I do that
or is it implementation defined?

I fail to see what you're
complaining about. In C++, a class is as heavyweight or as
lightweight as its designer wishes. It is the class designer
who makes the choice, not the language.

That's not the case with polymorphism for example: the vptr in the
data portion is not free. It changes the size of the object. I want to
know where that line of crossover is or how to keep clear of it
anyway.

More than anything
else, it is this which sets C++ off from other languages.

and only because of constraints of C compatibility. The
data portion of the class isn't the object in general.

I tend to think of the data portion (noun, vs.
behavior=verb) as "the thing" because that's what gets
operated on and maybe even directly manipulated.

In C++ (and even in C, for that matter), an object has a
type and an address; the type determines its size, and the
set of legal operations on it. Since an object is a thing,
in some way, I guess it is a noun, but even a POD struct has
behavior: you can assign it, for example, or access members.
Compared to C, C++ adds the ability for the user to define
additional operations (member functions), and to define
non-trivial initialization and destruction (which forces
significant changes in the object model). Beyond that, C++
adds support for dynamic typing (which is what one usually
understands with OO).

Not sure what your point is. I said that I consider the data
portion of an object, "the object".

But that's simply wrong,

No it's not.

Yes it is. Without behavior, all you have is raw memory. C++
is a typed language, which means that objects do have behavior.
(I'm not talking necessarily of behavior in the OO sense here.
In C as well, objects have behavior, and the set of operations on
an int is not the same as the set of operations on a float.)

Context matters. You may like to design thinking of objects FIRST as
the set of methods that work on the data. I, OTOH, prefer to think in
terms of the data FIRST (and more importantly because that's what's
going to end up somewhere external to the program).

It's just an abstract way of looking at it. It's hardly a
stretch either, since the C++ object model or at least most
implementations use that as the foundation upon which to
implement polymorphism: tacking a vptr onto "the thing part"
(noun) of "the object".

C++ supports dynamic typing, if that's what you mean. In other
words, the type of an object can vary at runtime. But I don't see
your point. It is the designer of the class who decides whether
to use dynamic typing or not. The language doesn't impose it.

It imposes "a penalty" the second you introduce the vptr. The class
becomes fundamentally and categorically different in a major way.
(Read: turns a lightweight class into a heavyweight one).

at least in the C++ object model. An object has a type.
Otherwise, it's just raw memory. That's a fundamental
principle of any typed language.

I could easily go further and say something like "the memory
in which the data portion of an object resides is the object". While
that may bother purists, it is a valid abstract way of
thinking about it.

Not in a typed language. If you want raw memory, C++ even
supports that. Any object can be read as an array of unsigned
char. Of course, the representation isn't always defined; are
ints 16, 32, 36, 48 or 64 bits? Are they 2's complement, 1's
complement or signed magnitude?
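The point about reading any object as an array of unsigned char can be sketched as follows. This is only an illustration: the helper names object_bytes and round_trip are made up for this example, and the byte values you see depend on the implementation (size of int, byte order), which is exactly the caveat raised above.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Copy the object representation (the raw bytes) of a value into a vector.
// Restricted to trivially copyable types, where memcpy is well defined.
template <typename T>
std::vector<unsigned char> object_bytes(const T& value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied bytewise");
    std::vector<unsigned char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Copy the bytes back into a fresh int; the value survives the round trip,
// even though the individual byte values are implementation defined.
inline int round_trip(int value)
{
    std::vector<unsigned char> bytes = object_bytes(value);
    assert(bytes.size() == sizeof value);
    int copy = 0;
    std::memcpy(&copy, bytes.data(), sizeof copy);
    return copy;
}
```

Calling object_bytes on an int yields sizeof(int) bytes; what those bytes contain is not portable, which is why the type, and not the memory alone, gives the object its meaning.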

I agree that there are other hindrances to having an elegant
programming model. Sigh. That's not to say that one can't get around
them to a large degree. (Not the least of which is: define your
platform as narrowly as possible).

I wasn't trying to be implementation literal about it.
Yes, data+behavior= class, but when the implementation
starts adding things to the data portion, that defines a
different animal than a POD class.

But the implementation *always* adds things to the data
portion, or controls how the data portion is interpreted.
It defines a sign bit in an int, for example (but not in an
unsigned int). If you want to support signed arithmetic,
then you need some way of representing the sign. If you
want to support polymorphism, then you need some way of
representing the type. I don't see your point. (The point
of POD, in the standard, is C compatibility; anything in a
POD will be interpretable by a C compiler, and will be
interpreted in the same way as in C++.)

Well maybe I'm breaking new ground then in suggesting that
there should be a duality in the definition of what a class
object is. There are "heavyweight" classes and "lightweight"
ones.

There's no strict binary division.

A class with a vptr is fundamentally different than one without, for
example.

There are a number of
different classifications possible

The only ones I'm considering in this thread's topic though are the
lightweight/heavyweight ones.

---at the application level,
the distinction between value objects and entity objects is
important, for example (but there are often objects which don't
fit into either category). In many cases, it certainly makes
sense to divide types into categories (two or more); in this
regard, about the only thing particular with "lightweight" and
"heavyweight" is that the names don't really mean anything.

I've described it at least a dozen times now, so if you don't get it,
then I'm out of ways to describe it.

I use C++ with that paradigm today, but it could be more
effective if there was more support for "object-ness" with
"lightweight" classes.

Again: what support do you want? You've yet to point out
anything that isn't supported in C++.

(Deriving from interface classes and maintaining the size of the
implementation (derived) class would be nice (but maybe impossible?)).

I am just trying to understand where the line of demarcation
between lightweight and heavyweight classes is and how that can
potentially change in the future and hence break code.

The limitation appears to be backward compatibility with C. If
so, maybe there should be structs, lightweight classes,
heavyweight classes.

And maybe there should be value types and entity types. Or
maybe some other classification is relevant to your application.
The particularity of C++ is that it lets you choose. The
designer is free to develop the categories he wants. (If I'm
not mistaken, in some circles, these types of categories are
called stereotypes.)

I'm only talking about the two categories based upon the C++
mechanisms that change the data portion of the object. Deriving a
simple struct from a pure abstract base class will get you a beast
that is the size of the struct plus the size of a vptr. IOW: an
aberrated struct or heavyweight object. Call it what you want, it's
still fundamentally different.
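The size claim above can be checked with a small sketch. The types Point, Shape and Box are hypothetical, invented for this illustration, and the exact overhead is implementation specific: the standard does not mandate a vptr, although mainstream compilers do store one in each polymorphic object.

```cpp
#include <cassert>
#include <cstddef>

struct Point            // plain data: two ints, nothing else
{
    int x;
    int y;
};

struct Shape            // hypothetical "interface": pure virtual functions only
{
    virtual int area() const = 0;
    virtual ~Shape() {}
};

struct Box : Shape      // the same two data members, but polymorphic
{
    int x;
    int y;
    int area() const override { return x * y; }
};

// Extra bytes the polymorphic version occupies on this implementation
// (typically one pointer, possibly plus alignment padding).
inline std::size_t polymorphism_overhead()
{
    assert(sizeof(Box) >= sizeof(Point));   // holds on mainstream compilers
    return sizeof(Box) - sizeof(Point);
}
```

On a typical implementation polymorphism_overhead() returns at least sizeof(void*); without the base class, Box would have the same size as Point.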

[...]

It restricts the use of OO concepts to classes designed to
be used with OO concepts.

Not really, since one can have POD classes with methods,
just not CERTAIN methods (you are suggesting that "classes
designed to be used with OO concepts" are those
heavyweight classes that break PODness, right?).

No. I'm really not suggesting much of anything. However you
define the concept of OO, the concept only applies to classes
which were designed with it in mind. C++ doesn't force any
particular OO model, but allows you to choose. And to have
classes which don't conform to this model.

"Allows you to choose"? "FORCES you to choose" between
lightweight (POD) class design with more limited OO and
heavyweight (non-POD) class design with all OO mechanisms
allowed but at the expense of losing POD-ness. It's a
compromise. I'm not saying it's a bad compromise, but I am
wondering if so and what the alternative implementation
possibilities are.

Obviously, you have to choose the appropriate semantics for
the class. That's part of design, and is inevitable. So I
don't see your point; C++ gives you the choice, without
forcing you into any one particular model. And there aren't
just two choices.

The change occurs when you do something to a POD
("lightweight") class that turns the data portion of the class
into something else than just a data struct, as when a vptr is
added. Hence then, you have 2 distinct types of class objects
that are dictated by the implementation of the C++ object
model.

The concept of a POD was introduced mainly for reasons of
interfacing with C. Forget it for the moment. You have as many
types of class objects as the designer wishes. If you want just
a data struct, fine; I use them from time to time (and they
aren't necessarily PODs---it's not rare for my data structs to
contain an std::string). If you want polymorphism, that's fine
too. If you want something in between, say a value type with
deep copy semantics, no problem.

There is NO restriction in C++ with regards to what you can do.

Yes there is if you don't want the size of your struct to be its size
plus the size of a vptr. If maintaining that size is what you want,
then you can't have polymorphism. Hence, restriction.

You seem to be saying that POD classes are not supported
or at least not encouraged.

Where do I say that? POD classes are definitely supported,
and are very useful in certain contexts. They aren't
appropriate for what most people would understand by OO, but
so what. Not everything has to be rigorously OO.

You seemed to imply that the "supported" ("encouraged" would
probably be a better word to use) paradigms were: A. data
structs with non-trivial member functions and built-in
"behavior" and B. "full OO type objects".

Not at all. You define what you need.

There are the limitations though: you can't have overloaded
constructors, for example, without losing POD-ness.

Obviously, given the particular role of PODs. So? What's your
point?

My point is that I'm worried about defining some overloaded
constructors and then finding (now or in the future) that my class
object is not "struct-like" anymore (read, has some bizarre
representation in memory).

There are only two reasons I know for insisting on
something being a POD: you need to be able to use it from C as
well, or you need static initialization. Both mean that
construction must be trivial.

Or conversion operators (?).

A conversion operator doesn't affect POD-ness. In fact, POD
structures with a conversion operator are a common idiom for
certain types of initialization.
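A minimal sketch of that idiom, under some assumptions: the type Celsius is hypothetical, and the checks use the C++11 type traits (std::is_trivial, std::is_standard_layout), which did not exist when this thread was written. Together those two traits correspond to what the standard calls a POD.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example type: an aggregate with a conversion operator.
struct Celsius
{
    double degrees;
    operator double() const { return degrees; }   // a member function; adds no data
};

// Compile-time checks: the conversion operator does not disturb POD-ness.
static_assert(std::is_trivial<Celsius>::value, "no user-provided construction");
static_assert(std::is_standard_layout<Celsius>::value, "struct-like layout");

// Static initialization with brace syntax, exactly as for a C struct.
static const Celsius boiling = { 100.0 };

inline double boiling_point()
{
    double d = boiling;        // implicit conversion through operator double
    assert(d == 100.0);
    return d;
}
```

Because Celsius has no constructor, the brace initializer needs no code to run at startup; the conversion operator only affects how the value is used afterwards.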

Well that's good to know.

Or derivation from "interfaces" (?).

How is a C program going to deal with derivation? For that
matter, an interface supposes virtual functions and dynamic
typing; it's conceptually impossible to create a dynamically
typed object without executing some code.

Code generation/execution is not what I'm worried about.

You do have to know what you want.

I do.

(And this has nothing to do
with the "design" of the C++ object model;

It does.

it's more related to
simple possibilities. C++ does try to not impose anything
impossible to implement.)

Based upon your comments, maybe polymorphism IS the only thing that
changes a class from lightweight to heavyweight. I'm not sure that
that can be relied upon with future implementations or even different
implementations of the language, for I think that the mechanisms are
mostly implementation defined.

From a design point of view, I find that it rarely makes
sense to mix models in a single class: either all of the
data will be public, or all of it will be private. But the
language doesn't require it.

That would be a real downer if true. I'd like to see more
support in the language for POD classes.

Such as?

What I call "initializing constructors" for one thing.
(Constructors that take arguments to initialize a POD class in
various ways).

Well, if there is a non-trivial constructor, the class can't
be POD, since you need to call the constructor in order to
initialize it.

Well maybe then "POD" is the hangup and I should have used
"lightweight" from the beginning. I just want the data portion
to remain intact while having the constructor overloads and
such.

I'm not sure what you mean by "the data portion to remain
intact".

Derive a class and you have compiler baggage attached to the data
portion. If I ever instantiate a class object that has overloaded
constructors and find that the size of the object is different from
the expected size of all the data members (please don't bring up
padding and alignment etc), I'm going to be unhappy.

Taken literally, the data portion had better remain
intact for all types of objects. If you mean contiguous, that's
a different issue: not even PODs are guaranteed to have
contiguous data (since C doesn't guarantee it)---on many
machines (e.g. Sparcs, IBM mainframes...) that would introduce
totally unacceptable performance costs.

If a platform is so brain-damaged that I can't do things to have a
high degree of confidence that the size of a struct is what I expect
it to be, then I won't be targeting that platform.
Other people can program "the exotics".

If anything, C++ specifies the structure of the data too much.
A compiler is not allowed to reorder data if there is no
intervening change of access, for example. If a programmer
writes:

    struct S
    {
        char c1 ;
        int i1 ;
        char c2 ;
        int i2 ;
    } ;

for example, the compiler is not allowed to place the i1 and i2
elements in front of c1 and c2, despite the fact that this would
improve memory use and optimization.
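The ordering constraint can be observed with offsetof. In this sketch the struct Reordered is a hypothetical hand-reordered variant added for comparison; the actual sizes depend on the implementation's alignment rules (with a 4-byte, 4-byte-aligned int, S typically occupies 16 bytes and Reordered 12).

```cpp
#include <cassert>
#include <cstddef>

struct S            // the struct from the text
{
    char c1;
    int  i1;
    char c2;
    int  i2;
};

struct Reordered    // hypothetical variant with the ints moved to the front by hand
{
    int  i1;
    int  i2;
    char c1;
    char c2;
};

// Members sit at increasing offsets, in declaration order, starting at 0.
inline bool declaration_order_kept()
{
    return offsetof(S, c1) == 0
        && offsetof(S, c1) < offsetof(S, i1)
        && offsetof(S, i1) < offsetof(S, c2)
        && offsetof(S, c2) < offsetof(S, i2);
}

// Bytes saved by reordering by hand; 0 if the implementation pads neither struct.
inline std::size_t bytes_saved()
{
    assert(sizeof(Reordered) <= sizeof(S));   // holds on mainstream compilers
    return sizeof(S) - sizeof(Reordered);
}
```

The compiler may insert padding between the members of S, but it is the programmer, not the compiler, who has to reorder them to recover the space.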

And I think I have control over most of those things on a given
platform. Which is all fine with me, as long as I HAVE that control
(via compiler pragmas or switches or careful coding or whatever).

Polymorphism I can probably do without, but deriving from
interfaces would be nice if possible.

If you have dynamic typing, some code must be executed when the
object is created; otherwise, there is no way later to know what
the dynamic type is.

Again, I'm not worried about code generation/execution.

Anything else would be a contradiction: are you saying you
want to provide a constructor for a class, but that it won't
be called?

Of course I want it to be called. By "POD-ness" I just meant I
want a struct-like consistency of the object data (with no
addition such as a vptr, for example).

I don't understand all this business of vptr. Do you want
polymorphism, or not.

Yes, but without the vptr please (coffee without cream please).

If you want polymorphism, the compiler
must memorize the type of the object (each object) somewhere,
when the object is created; C++ doesn't require it to be in the
object itself, but in practice, this is by far the most
effective solution.

But what if just pure ABC derived classes were handled differently?
Then maybe the situation would be less bad.

If you don't want polymorphism, and don't
declare any virtual functions, then the compiler doesn't have to
memorize the type of the object, and none that I know of do.

[...]

Because you can't have a constructor in C, basically.
Because the compiler must generate code when the object is
created, if there is a constructor. That is what POD-ness
is all about; a POD object doesn't require any code for it
to be constructed.

Then apparently I was using "POD" inappropriately. My concern
is the in-memory representation of the object data.

Which is implementation defined in C++, just as it was in C.
With some constraints; the compiler can insert padding (and all
do in some cases), but it cannot reorder non-static data members
unless there is an intervening change in access control.

Well there's another example then of heavyweightness: sprinkle in
"public" and "private" in the wrong places and the compiler may
reorder data members. (I had a feeling there was more than the vptr
example).

That,
and the fact that a class cannot have a size of 0, are about the
only restraints. C (and C++ for PODs) also have a constraint
that the first data element must be at the start of the object;
the compiler may not introduce padding before the first element.

So you are saying that a non-POD does not have to have the first data
element at the start of the object. Example number 3 of
heavyweightness. (NOW we're getting somewhere!).
So "losing POD-ness" IS still "bad" and my assumed implication of that
and use of "POD-ness" seems to have been correct.

If I'm not mistaken, the next version of the standard extends
this constraint to "standard-layout classes"; i.e. to classes
that have no virtual functions and no virtual bases, no changes
in access control, and a few other minor restrictions (but which
may have non-trivial constructors). This new rule, however,
does nothing but describe current practice.
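What the standard-layout rule is expected to allow can be sketched like this. The types RawPoint and Point2 are hypothetical, and the checks rely on std::is_standard_layout and std::is_trivial as they were eventually published in C++11, so treat this as an illustration of the draft rule rather than of anything available in 2008.

```cpp
#include <cassert>
#include <type_traits>

struct RawPoint         // plain C-style struct
{
    int x;
    int y;
};

struct Point2           // hypothetical class: same data plus overloaded constructors
{
    int x;
    int y;
    Point2() : x(0), y(0) {}
    Point2(int x_, int y_) : x(x_), y(y_) {}
    explicit Point2(const RawPoint& r) : x(r.x), y(r.y) {}
};

// The constructors make the class non-trivial, but its layout stays struct-like.
static_assert(std::is_standard_layout<Point2>::value, "struct layout rules apply");
static_assert(!std::is_trivial<Point2>::value, "user-provided constructors");

// For a standard-layout class, the first data member sits at the object's address.
inline bool first_member_at_start(const Point2& p)
{
    return static_cast<const void*>(&p) == static_cast<const void*>(&p.x);
}

inline void check_point2()
{
    Point2 p(3, 4);
    assert(first_member_at_start(p));
    assert(p.x == 3 && p.y == 4);
}
```

In practice sizeof(Point2) equals sizeof(RawPoint): constructors are functions, and adding them does not add anything to the data portion of the object.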

So in the future I will be able to have overloaded constructors (I'm
not sure what exactly a "trivial" constructor is, but I assumed that
an overloaded one is not trivial) and still have lightweight classes,
good. That threat of a compiler not putting data at the front of
non-PODs is a real killer.

[...]

Maybe defining POD-ness as "C compatibility of structs" is
a hindrance, if it is defined something like that.

The next version of the standard does have an additional
category "layout compatible". I'm not sure what it buys us,
however.

Where can I read up on that?

In the current draft. I think
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2798.pdf

should get it (or something close---the draft is still
evolving).

Downloaded it. Thx for the link.

Tony