Re: The C++ Object Model: Good? Bad? Ugly?

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Sat, 8 Nov 2008 01:12:08 -0800 (PST)

Message-ID:

<8809535b-1527-4b9a-9fce-01cb8c1caaab@p35g2000prm.googlegroups.com>

On Nov 7, 10:20 pm, tonytech08 <tonytec...@gmail.com> wrote:

On Nov 7, 4:40 am, James Kanze <james.ka...@gmail.com> wrote:

On Nov 6, 11:29 pm, tonytech08 <tonytec...@gmail.com> wrote:

[...]

Thanks for reiterating my thought: C++ has more support
for OO with "full OO type objects".

More support than what.

More support for OO with "heavyweight" classes than for POD
classes.

You're not making sense. How does C++ have more support for OO
than for other idioms?

C++ has support for "full OO type objects", if that's what
you need. Most of my objects aren't "full OO type objects",
in the sense that they don't support polymorphism. C++
supports them just as well.

I think I may be OK without polymorphism in "lightweight"
classes, but overloaded constructors sure would be nice. And
conversion operators. Can a POD class derive from a pure
abstract base class? That would be nice also if not.

And C++ supports all of that. I fail to see what you're
complaining about. In C++, a class is as heavyweight or as
lightweight as its designer wishes. It is the class designer
who makes the choice, not the language. More than anything
else, it is this which sets C++ off from other languages.

and only because of constraints of C compatibility. The
data portion of the class isn't the object in general.

I tend to think of the data portion (noun, vs.
behavior=verb) as "the thing" because that's what gets
operated on and maybe even directly manipulated.

In C++ (and even in C, for that matter), an object has a
type and an address; the type determines its size, and the
set of legal operations on it. Since an object is a thing,
in some way, I guess it is a noun, but even a POD struct has
behavior: you can assign it, for example, or access members.
Compared to C, C++ adds the ability for the user to define
additional operations (member functions), and to define
non-trivial initialization and destruction (which forces
significant changes in the object model). Beyond that, C++
adds support for dynamic typing (which is what one usually
understands with OO).

Not sure what your point is. I said that I consider the data
portion of an object, "the object".

But that's simply wrong,

No it's not.

Yes it is. Without behavior, all you have is raw memory. C++
is a typed language, which means that objects do have behavior.
(I'm not talking necessarily of behavior in the OO sense here.
In C as well, objects have behavior, and the set of operations on
an int is not the same as the set of operations on a float.)

It's just an abstract way of looking at it. It's hardly a
stretch either, since the C++ object model or at least most
implementations use that as the foundation upon which to
implement polymorphism: tacking a vptr onto "the thing part"
(noun) of "the object".

C++ supports dynamic typing, if that's what you mean. In other
words, the type of an object can vary at runtime. But I don't see
your point. It is the designer of the class who decides whether
to use dynamic typing or not. The language doesn't impose it.

at least in the C++ object model. An object has a type.
Otherwise, it's just raw memory. That's a fundamental
principle of any typed language.

I could easily go further and say something like "the memory
in which the data portion of an object resides is the object". While
that may bother purists, it is a valid abstract way of
thinking about it.

Not in a typed language. If you want raw memory, C++ even
supports that. Any object can be read as an array of unsigned
char. Of course, the representation isn't always defined; are
ints 16, 32, 36, 48 or 64 bits? Are they 2's complement, 1's
complement or signed magnitude?
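
To make the "array of unsigned char" point concrete, here is a minimal sketch. The helper names object_bytes and check_int_bytes are mine, purely for illustration. It copies the bytes of any object into a vector. What those bytes contain depends on the implementation, so the sketch only checks their number and that they match the object's storage.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Copy the object representation of obj into a byte vector.
// Inspecting any object through unsigned char is permitted by the language.
template <typename T>
std::vector<unsigned char> object_bytes(const T& obj)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    return std::vector<unsigned char>(p, p + sizeof(T));
}

// Self-check: the copy has sizeof(int) bytes and matches the storage.
// The byte values themselves are implementation-dependent.
inline void check_int_bytes()
{
    int value = 42;
    std::vector<unsigned char> bytes = object_bytes(value);
    assert(bytes.size() == sizeof value);
    assert(std::memcmp(&bytes[0], &value, sizeof value) == 0);
}
```

Note that the sketch never states which byte holds the 42; that depends on byte order and on the size of int.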

I wasn't trying to be implementation literal about it.
Yes, data+behavior= class, but when the implementation
starts adding things to the data portion, that defines a
different animal than a POD class.

But the implementation *always* adds things to the data
portion, or controls how the data portion is interpreted.
It defines a sign bit in an int, for example (but not in an
unsigned int). If you want to support signed arithmetic,
then you need some way of representing the sign. If you
want to support polymorphism, then you need some way of
representing the type. I don't see your point. (The point
of POD, in the standard, is C compatibility; anything in a
POD will be interpretable by a C compiler, and will be
interpreted in the same way as in C++.)

Well maybe I'm breaking new ground then in suggesting that
there should be a duality in the definition of what a class
object is. There are "heavyweight" classes and "lightweight"
ones.

There's no strict binary division. There are a number of
different classifications possible---at the application level,
the distinction between value objects and entity objects is
important, for example (but there are often objects which don't
fit into either category). In many cases, it certainly makes
sense to divide types into categories (two or more); in this
regard, about the only thing particular with "lightweight" and
"heavyweight" is that the names don't really mean anything.

I use C++ with that paradigm today, but it could be more
effective if there was more support for "object-ness" with
"lightweight" classes.

Again: what support do you want? You've yet to point out
anything that isn't supported in C++.

The limitation appears to be backward compatibility with C. If
so, maybe there should be structs, lightweight classes,
heavyweight classes.

And maybe there should be value types and entity types. Or
maybe some other classification is relevant to your application.
The particularity of C++ is that it lets you choose. The
designer is free to develop the categories he wants. (If I'm
not mistaken, in some circles, these types of categories are
called stereotypes.)

[...]

It restricts the use of OO concepts to classes designed to
be used with OO concepts.

Not really, since one can have POD classes with methods,
just not CERTAIN methods (you are suggesting that "classes
designed to be used with OO concepts" are those
heavyweight classes that break PODness, right?).

No. I'm really not suggesting much of anything. However you
define the concept of OO, the concept only applies to classes
which were designed with it in mind. C++ doesn't force any
particular OO model, but allows you to choose. And to have
classes which don't conform to this model.

"Allows you to choose"? "FORCES you to choose" between
lightweight (POD) class design with more limited OO and
heavyweight (non-POD) class design with all OO mechanisms
allowed but at the expense of losing POD-ness. It's a
compromise. I'm not saying it's a bad compromise, but I am
wondering if so and what the alternative implementation
possibilities are.

Obviously, you have to choose the appropriate semantics for
the class. That's part of design, and is inevitable. So I
don't see your point; C++ gives you the choice, without
forcing you into any one particular model. And there aren't
just two choices.

The change occurs when you do something to a POD
("lightweight") class that turns the data portion of the class
into something else than just a data struct, as when a vptr is
added. Hence then, you have 2 distinct types of class objects
that are dictated by the implementation of the C++ object
model.

The concept of a POD was introduced mainly for reasons of
interfacing with C. Forget it for the moment. You have as many
types of class objects as the designer wishes. If you want just
a data struct, fine; I use them from time to time (and they
aren't necessarily POD's---it's not rare for my data struct's to
contain an std::string). If you want polymorphism, that's fine
too. If you want something in between, say a value type with
deep copy semantics, no problem.

There is NO restriction in C++ with regards to what you can do.

You seem to be saying that POD classes are not supported
or at least not encouraged.

Where do I say that? POD classes are definitely supported,
and are very useful in certain contexts. They aren't
appropriate for what most people would understand by OO, but
so what. Not everything has to be rigorously OO.

You seemed to imply that the "supported" ("encouraged" would
probably be a better word to use) paradigms were: A. data
structs with non-trivial member functions and built-in
"behavior" and B. "full OO type objects".

Not at all. You define what you need.

There are the limitations though: you can't have overloaded
constructors, for example, without losing POD-ness.

Obviously, given the particular role of PODs. So? What's your
point? There are only two reasons I know for insisting on
something being a POD: you need to be able to use it from C as
well, or you need static initialization. Both mean that
construction must be trivial.

Or conversion operators (?).

A conversion operator doesn't affect POD-ness. In fact, POD
structures with a conversion operator are a common idiom for
certain types of initialization.
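
A sketch of what such an idiom can look like. The struct Label, the table labels and the function label_text are hypothetical names, and the checks use the C++11 type traits rather than anything available in the current standard. The struct has no constructors, so a table of them can be brace-initialized statically, and the conversion operator produces the "real" type on demand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical aggregate: no constructors, so static initialization works.
struct Label
{
    const char* text;
    int         id;

    // An ordinary member function; it changes neither triviality nor layout.
    operator std::string() const { return std::string(text); }
};

static_assert(std::is_trivial<Label>::value, "Label should be trivial");
static_assert(std::is_standard_layout<Label>::value,
              "Label should be standard-layout");

// Brace-initialized table; no constructor code is involved.
static const Label labels[] = { { "first", 1 }, { "second", 2 } };

inline std::string label_text(std::size_t i)
{
    assert(i < sizeof labels / sizeof labels[0]);
    std::string s = labels[i];   // uses the conversion operator
    return s;
}
```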

Or derivation from "interfaces" (?).

How is a C program going to deal with derivation? For that
matter, an interface supposes virtual functions and dynamic
typing; it's conceptually impossible to create a dynamically
typed object without executing some code.

You do have to know what you want. (And this has nothing to do
with the "design" of the C++ object model; it's more related to
simple possibilities. C++ does try to not impose anything
impossible to implement.)

From a design point of view, I find that it rarely makes
sense to mix models in a single class: either all of the
data will be public, or all of it will be private. But the
language doesn't require it.

That would be a real downer if true. I'd like to see more
support in the language for POD classes.

Such as?

What I call "initializing constructors" for one thing.
(Constructors that take arguments to initialize a POD class in
various ways).

Well, if there is a non-trivial constructor, the class can't
be POD, since you need to call the constructor in order to
initialize it.
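
For illustration, a small sketch assuming a C++11 compiler (for the type traits). PlainPoint and CtorPoint are hypothetical names. Adding overloaded constructors makes the class non-trivial, and so no longer a POD, but the data members themselves are not affected. The size comparison holds on the implementations I am aware of; I don't believe the standard states it in so many words.

```cpp
#include <cassert>
#include <type_traits>

// No constructors: trivial, and usable from C.
struct PlainPoint
{
    int x;
    int y;
};

// Same data, plus overloaded "initializing constructors".
struct CtorPoint
{
    int x;
    int y;

    CtorPoint() : x(0), y(0) {}
    CtorPoint(int x_, int y_) : x(x_), y(y_) {}
    explicit CtorPoint(int both) : x(both), y(both) {}
};

static_assert(std::is_trivial<PlainPoint>::value, "no constructor code needed");
static_assert(!std::is_trivial<CtorPoint>::value, "constructor must be called");
static_assert(std::is_standard_layout<CtorPoint>::value, "layout still struct-like");
// Assumption: true on common implementations, not spelled out by the standard.
static_assert(sizeof(CtorPoint) == sizeof(PlainPoint), "no hidden data added");

inline void check_ctor_point()
{
    CtorPoint p(3, 4);
    assert(p.x == 3 && p.y == 4);
}
```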

Well maybe then "POD" is the hangup and I should have used
"lightweight" from the beginning. I just want the data portion
to remain intact while having the constructor overloads and
such.

I'm not sure what you mean by "the data portion to remain
intact". Taken literally, the data portion had better remain
intact for all types of objects. If you mean contiguous, that's
a different issue: not even POD's are guaranteed to have
contiguous data (since C doesn't guarantee it)---on many
machines (e.g. Sparcs, IBM mainframes...) that would introduce
totally unacceptable performance costs.

If anything, C++ specifies the structure of the data too much.
A compiler is not allowed to reorder data if there is no
intervening change of access, for example. If a programmer
writes:

    struct S
    {
        char c1 ;
        int i1 ;
        char c2 ;
        int i2 ;
    } ;

for example, the compiler is not allowed to place the i1 and i2
elements in front of c1 and c2, despite the fact that this would
improve memory use and optimization.
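
The ordering constraint can be checked with offsetof. In the sketch below, the struct Reordered is my own hand-reordered variant, added only for comparison. The actual offsets and sizes depend on the implementation, so only the relative order is asserted at compile time.

```cpp
#include <cassert>
#include <cstddef>

struct S
{
    char c1 ;
    int i1 ;
    char c2 ;
    int i2 ;
} ;

// What the programmer would have to write to get the denser layout.
struct Reordered
{
    int i1 ;
    int i2 ;
    char c1 ;
    char c2 ;
} ;

// Members appear in declaration order, with c1 at the very start.
static_assert(offsetof(S, c1) == 0, "first member at start");
static_assert(offsetof(S, c1) < offsetof(S, i1), "declaration order kept");
static_assert(offsetof(S, i1) < offsetof(S, c2), "declaration order kept");
static_assert(offsetof(S, c2) < offsetof(S, i2), "declaration order kept");

// On typical implementations the reordered struct is no larger, and
// usually smaller, because less padding is needed.
inline void check_layout()
{
    assert(sizeof(Reordered) <= sizeof(S));
}
```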

Polymorphism I can probably do without, but deriving from
interfaces would be nice if possible.

If you have dynamic typing, some code must be executed when the
object is created; otherwise, there is no way later to know what
the dynamic type is.

Anything else would be a contradiction: are you saying you
want to provide a constructor for a class, but that it won't
be called?

Of course I want it to be called. By "POD-ness" I just meant I
want a struct-like consistency of the object data (with no
addition such as a vptr, for example).

I don't understand all this business of vptr. Do you want
polymorphism, or not. If you want polymorphism, the compiler
must memorize the type of the object (each object) somewhere,
when the object is created; C++ doesn't require it to be in the
object itself, but in practice, this is by far the most
effective solution. If you don't want polymorphism, and don't
declare any virtual functions, then the compiler doesn't have to
memorize the type of the object, and none that I know of do.
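
A minimal sketch of the difference, again assuming C++11 type traits. Plain, Shape and Triangle are hypothetical names. Only the classes that declare virtual functions are polymorphic, and only for them must the implementation record the dynamic type when the object is constructed. On the implementations I know, sizeof(Shape) is the size of one pointer, but nothing guarantees that, so the sketch does not assert it.

```cpp
#include <cassert>
#include <type_traits>

// No virtual functions: nothing about the type is stored in the object.
struct Plain
{
    int value;
    int get() const { return value; }
};

// Virtual functions: the constructor must record the dynamic type.
struct Shape
{
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};

struct Triangle : Shape
{
    virtual int sides() const { return 3; }
};

static_assert(!std::is_polymorphic<Plain>::value, "Plain has no dynamic type");
static_assert(std::is_polymorphic<Shape>::value, "Shape has a dynamic type");

// Dispatches on the dynamic type of s, not on the static type Shape.
inline int sides_of(const Shape& s) { return s.sides(); }

inline void check_dynamic_type()
{
    Triangle t;
    assert(sides_of(t) == 3);
}
```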

[...]

Because you can't have a constructor in C, basically.
Because the compiler must generate code when the object is
created, if there is a constructor. That is what POD-ness
is all about; a POD object doesn't require any code for it
to be constructed.

Then apparently I was using "POD" inappropriately. My concern
is the in-memory representation of the object data.

Which is implementation defined in C++, just as it was in C.
With some constraints; the compiler can insert padding (and all
do in some cases), but it cannot reorder non-static data members
unless there is an intervening change in access control. That,
and the fact that a class cannot have a size of 0, are about the
only restraints. C (and C++ for PODs) also have a constraint
that the first data element must be at the start of the object;
the compiler may not introduce padding before the first element.
If I'm not mistaken, the next version of the standard extends
this constraint to "standard-layout classes"; i.e. to classes
that have no virtual functions and no virtual bases, no changes
in access control, and a few other minor restrictions (but which
may have non-trivial constructors). This new rule, however,
does nothing but describe current practice.
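
As a sketch of what the rule means in practice, here is a hypothetical class Account with a non-trivial constructor. It assumes a compiler that implements the standard-layout wording and provides the std::is_standard_layout trait from the draft. The class is not trivial, yet its first member sits at the address of the object.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class: non-trivial constructor, no virtual functions,
// no virtual bases, a single access level.
struct Account
{
    int    id;
    double balance;

    Account(int id_, double balance_) : id(id_), balance(balance_) {}
};

static_assert(std::is_standard_layout<Account>::value,
              "Account should be standard-layout");
static_assert(!std::is_trivial<Account>::value,
              "Account has a user-provided constructor");

// For a standard-layout class, the object and its first member
// share the same address.
inline bool first_member_at_start(const Account& a)
{
    return static_cast<const void*>(&a) == static_cast<const void*>(&a.id);
}

inline void check_account()
{
    Account a(1, 2.5);
    assert(first_member_at_start(a));
}
```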

[...]

Maybe defining POD-ness as "C compatibility of structs" is
a hindrance, if it is defined something like that.

The next version of the standard does have an additional
category "layout compatible". I'm not sure what it buys us,
however.

Where can I read up on that?

In the current draft. I think
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2798.pdf
should get it (or something close---the draft is still
evolving).

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34