Re: Multiple Inheritance vs. Interface

From:

Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>

Newsgroups:

comp.lang.c++

Date:

Wed, 10 Oct 2012 00:28:52 -0400

Message-ID:

<5074f98d$0$27153$c3e8da3$66d3cc2f@news.astraweb.com>

Stuart wrote:

[the OP, "lieve again", observed the problem that multiple inheritance of either
interface or non-interface classes leads to object bloat due to the necessity to
stuff objects with multiple vtables]

lieve again wrote:

OK, so it's a problem that all current programming languages suffer
from. I think ending up with such big objects could be a kind of
limitation, but so it is.
Maybe the way to impose some kind of properties or functions on a
class without the vpointer replication penalty is the concepts
extension of C++11.

On 10/7/12 Pavel wrote:

I think the above is not accurate. C++ code does suffer performance
penalties from using multiple inheritance. Moreover, and what's
especially frustrating, even the code that does not use multiple
inheritance (in fact, any code using virtual functions) suffers from at
least one performance penalty imposed by the way C++ supports multiple
inheritance: the necessity to read the offset of the call target

What's the call target? Never heard this term.

The pointer to the base class on which a virtual function is called.

within
the object of the most-derived class overriding the virtual method and
subtract this offset from the passed pointer to give the virtual
function implementation access to the object it expects.

I don't get what you mean. Can you give an example?

Because the compiler does not generally know (Richard gave a good algorithm
that can solve the issue, but at a cost) whether the base is the first base of
the particular most-derived class, it has to read, at run time, the offset of
the sub-object within the object for which the virtual table is defined and
subtract it from the given pointer to the base class.
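
To make the read-and-subtract step concrete, here is a minimal hand-rolled sketch of what such a call amounts to. All names (VEntry, BasePart, Full, fullGet, callVirtual) are made up for illustration, and keeping the offset next to the function pointer is just one possible scheme that I assume here, not a description of any particular compiler:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical virtual table entry: the function plus the offset of the
// base sub-object inside the object that the function expects.
struct VEntry {
    int (*fn)(void* self);
    std::ptrdiff_t delta;
};

// Stands in for a base sub-object that carries a virtual table pointer.
struct BasePart {
    const VEntry* vtbl;
    int b;
};

// Stands in for the most-derived object; the base part is NOT at offset 0.
struct Full {
    int other;
    BasePart base;
    int d;
};

// The "overriding function": it expects a pointer to the whole Full object.
int fullGet(void* self) { return static_cast<Full*>(self)->d; }

const VEntry fullVtbl[] = {
    { &fullGet, static_cast<std::ptrdiff_t>(offsetof(Full, base)) }
};

// The call site only sees a BasePart*, so it cannot know the offset
// statically: it reads delta from memory and subtracts it from the pointer.
int callVirtual(BasePart* p) {
    const VEntry& e = p->vtbl[0];                        // extra memory read
    void* self = reinterpret_cast<char*>(p) - e.delta;   // cheap subtraction
    return e.fn(self);
}
```

Given an object built as Full f{1, {fullVtbl, 2}, 42}, the call callVirtual(&f.base) reaches f.d even though the caller only ever held a pointer to the base part.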

Languages with single inheritance can assign a single offset from the
start of the virtual table of the most-derived class of an object to the
start of the slice of that class' virtual table corresponding to the
virtual table of any of its bases.

Yeah, for C++ this offset will always be zero.

For calling a virtual function defined in a class with multiple bases through
a base pointer, it will be zero only if the base is the first base (assuming
without loss of generality that the compiler allocates the first base at the
lowest address); otherwise it will be something else. But even if it is zero,
the compiler does not know that in advance, so the code will still have to read
that zero from memory and subtract it. That extra memory read can be relatively
expensive (the subtraction usually is not).

This effectively means that any class
in such a language can have a single virtual table and the objects of
the most derived class and the corresponding objects of all its base
classes can have a single address.

Right. So casting a Derived* pointer to a Base* pointer for single-inheritance
chains will always be a noop under C++.

Right.

I don't see any kind of performance penalty.

The performance penalty is incurred to cast Base* to Derived*, which is what
happens when you call Derived's overridden virtual function through a pointer
to Base.
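
A small sketch that makes both directions observable. The class and function names (Base, Derived, First, Second, Both and the three helpers) are hypothetical, and the address comparisons reflect what common implementations do rather than anything the standard promises about layout:

```cpp
#include <cassert>

// Single-inheritance chain.
struct Base {
    int x = 0;
    virtual ~Base() = default;
    virtual const void* self() const { return this; }
};
struct Derived : Base {
    int y = 0;
    const void* self() const override { return this; }
};

// Multiple inheritance with two polymorphic bases.
struct First {
    int a = 0;
    virtual ~First() = default;
};
struct Second {
    int b = 0;
    virtual ~Second() = default;
    virtual const void* self() const { return this; }
};
struct Both : First, Second {
    int c = 0;
    const void* self() const override { return this; }
};

// True when converting Derived* to Base* leaves the address unchanged.
bool upcastIsNoop(const Derived& d) {
    const Base* asBase = &d;
    return static_cast<const void*>(asBase) == static_cast<const void*>(&d);
}

// True when the Second sub-object sits at a different address than Both.
bool secondBaseIsShifted(const Both& b) {
    const Second* asSecond = &b;
    return static_cast<const void*>(asSecond) != static_cast<const void*>(&b);
}

// True when the override, called through Second*, still receives the address
// of the whole Both object, i.e. the pointer was adjusted back during the call.
bool overrideSeesWholeObject(const Both& b) {
    const Second* asSecond = &b;
    return asSecond->self() == static_cast<const void*>(&b);
}
```

On the implementations I would expect most readers to use, all three helpers return true: the upcast in the single-inheritance chain costs nothing, while the call through Second* has to undo the shift before Both::self() runs.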

[snip]

C++ does not have a chance of assigning any virtual function once
defined in a class a single offset in a virtual table; therefore, it has
to have multiple virtual tables. As it is, that is without the
complication mentioned above, C++ can not let its compiler know at a
virtual call site that the call is on the object that is the first base
of the most-derived class; hence the necessity to always read and apply
the offset at run-time.

Which offset are you talking about? Can you give an example (preferably for the
Intel architecture)?

It's more compiler-specific than hardware-platform-specific. Imagine a compiler
that lays out objects with virtual functions by putting the virtual table
pointer before the object's data and, for multiple inheritance, places the base
sub-objects at the beginning of the derived object, in the order of its base
specifier list. Then for these classes:

// file b.h
#ifndef B_H
#define B_H
struct B {
    int b;
    virtual int getBOffset() const { return 0; }
};
#endif
// file b2.h
#ifndef B2_H
#define B2_H
struct B2 { int b2; };
#endif
// file d.h
#include "b.h"
#include "b2.h"
// B is the second base here
struct D: public B2, public B { int d; virtual int getBOffset() const; };
// file d.cpp
#include "d.h"
int D::getBOffset() const {
  // offset of the B sub-object within D:
  // not zero under the layout assumed above, most likely sizeof(int)
  return (int)((const char*)(const B*)this - (const char*)this);
}
// file d2.h
#include "b.h"
#include "b2.h"
// B is the first base here
struct D2: public B, public B2 { int d2; virtual int getBOffset() const; };
// file d2.cpp
#include "d2.h"
int D2::getBOffset() const {
  // offset of the B sub-object within D2: most likely, zero
  return (int)((const char*)(const B*)this - (const char*)this);
}
// file x.cpp
#include "d.h"
#include "d2.h"
B *createB1() { return new D(); }
B *createB2() { return new D2(); }
// file client.cpp
#include "b.h"
B *createB1();
B *createB2();
B *bPtr = createB1();
int o1 = bPtr->getBOffset();
B *bPtr2 = createB2();
int o2 = bPtr2->getBOffset();

Above, we know that to call bPtr2->getBOffset() (which is actually
D2::getBOffset()) the compiler does not have to subtract anything from bPtr2,
but the compiler does not know that: D and D2 are not visible in client.cpp,
and neither are the definitions of createB1 or createB2. Therefore, the
compiler has to generate code that reaches into the virtual table of *bPtr2 and
retrieves that zero (sometimes stored at a negative offset in the virtual table
and sometimes in other ways). As for the clever trick with a double virtual
table and thunks described by Richard Damon to avoid that, it will work, but it
comes at some cost for calling virtual functions on non-second base and thus is
not always employed (I will come back to it later in my answer to his post).
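
For anyone who wants to try this on an actual compiler, here is a single-file sketch of the same idea. It is not the code from above: the names OtherBase, WithBSecond, WithBFirst and offsetSeenThrough are hypothetical, and I made the other base polymorphic as well (an assumption on my part), because some ABIs place a polymorphic base ahead of a non-polymorphic one regardless of the base specifier order, which would hide the effect.

```cpp
#include <cassert>
#include <cstddef>

struct B {
    int b = 0;
    virtual ~B() = default;
    virtual std::ptrdiff_t getBOffset() const { return 0; }
};

// Polymorphic on purpose, so that it can really be laid out ahead of B.
struct OtherBase {
    int other = 0;
    virtual ~OtherBase() = default;
};

// Distance in bytes from the start of the whole object to its B sub-object.
template <class T>
std::ptrdiff_t offsetOfB(const T* whole) {
    return reinterpret_cast<const char*>(static_cast<const B*>(whole))
         - reinterpret_cast<const char*>(whole);
}

// B is the second base: its sub-object is expected at a non-zero offset.
struct WithBSecond : OtherBase, B {
    std::ptrdiff_t getBOffset() const override { return offsetOfB(this); }
};

// B is the first base: its sub-object is expected at offset zero.
struct WithBFirst : B, OtherBase {
    std::ptrdiff_t getBOffset() const override { return offsetOfB(this); }
};

// One call site serving both kinds of object. It only knows about B, so it
// cannot tell at compile time whether a pointer adjustment is needed.
std::ptrdiff_t offsetSeenThrough(const B& obj) {
    return obj.getBOffset();
}
```

Passing a WithBSecond and then a WithBFirst object to offsetSeenThrough should report a non-zero offset for the first and zero for the second on the usual implementations, even though the very same call instruction served both.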

Regards,
Stuart

HTH
-Pavel