Re: Polymorph in Place

From: James Kanze <james.kanze@gmail.com>
Newsgroups: comp.lang.c++
Date: 11 May 2007 01:19:35 -0700
Message-ID: <1178871575.542920.224970@h2g2000hsg.googlegroups.com>

On May 10, 9:49 pm, mar...@tuells.org (marcus hall) wrote:

I am considering a strategy for implementation of a finite state machine.
What I would like to do is to use derived classes to represent the state
of the machine, so the vtable pointer is the state and the virtual methods
are the inputs to the machine.

The heart of the issue is the following construct when changing state:

        switch (newstate) {
        case MT_IDLE: new(this) MT::IDLE(*this); break;
        case MT_WAIT_ACK: new(this) MT::WAIT_ACK(*this); break;
        case MT_WAIT_DATA: new(this) MT::WAIT_DATA(*this); break;
        }

This is part of a SetState() method of the base class. The base class has
the following:

        void *operator new(size_t, MT *mt) { return mt; }

so the placement new "allocates" the same memory that the object currently
occupies. The derived classes include a null copy constructor like:

        IDLE(const MT &) {}

So, each line in the switch above just changes the vtable pointer (at least
with g++ with optimization, that is all that is generated).

Does this idea of "polymorphing in place" violate the C++ standard anywhere?

If so, is there any adjustment that could be done to make it compliant?

Yes and no.

First, I presume that the classes in question have virtual
functions, and so, presumably, a virtual destructor as well. In
that case the destructor is non-trivial, and (formally) must be
called. And that means that you cannot use *this as an argument
to the constructor. So you'd end up having to do something like:

    // Save all essential information in local variables...
    this->~Base() ;
    switch ( newstate ) {
    case MT_IDLE : new ( this ) MT::IDLE( /* saved information */ ) ;
                   break ;
    // ...
    }

Having done that, a lot depends on the context. If this is the
entire function, you're covered by the standard (supposing that
the actual memory has sufficient size and alignment for all of
the derived types). Trying to do anything further within the
function, however, is undefined behavior; the compiler has the
right to suppose that the type of *this doesn't change under
its feet. (Note that this also means that you cannot call this
function from a member function of a derived class, and do
anything in the calling function afterwards. Something like:

    MT::Base*
    MT_SomeState::event( EventDescription const& event )
    {
        // ...
        return changeState( newState ) ;
    }

where changeState is the function above, is OK, however.)
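A minimal sketch of the destroy-then-reconstruct sequence described above, assuming C++11 or later. Every name in it (State, Idle, WaitAck, StateStorage, the eventsSeen counter, runInPlaceDemo) is hypothetical and chosen only for illustration. The object lives in raw storage that is sized and aligned for all state types, changeState is the last thing each event function does, and the caller only ever continues with the returned pointer.

```cpp
#include <cstddef>
#include <new>
#include <string>

// All names below are hypothetical and exist only for this sketch.
enum class StateId { Idle, WaitAck };

class State {
public:
    virtual ~State() {}
    virtual std::string name() const = 0;
    // Handles one event and returns the pointer the caller must use afterwards.
    virtual State* event() = 0;
    int eventsSeen() const { return eventsSeen_; }

protected:
    explicit State(int seen) : eventsSeen_(seen) {}
    // Replaces *this with the requested state. It must be the last thing
    // the calling member function does.
    State* changeState(StateId next);

private:
    int eventsSeen_;
};

class Idle : public State {
public:
    explicit Idle(int seen) : State(seen) {}
    std::string name() const override { return "Idle"; }
    State* event() override { return changeState(StateId::WaitAck); }
};

class WaitAck : public State {
public:
    explicit WaitAck(int seen) : State(seen) {}
    std::string name() const override { return "WaitAck"; }
    State* event() override { return changeState(StateId::Idle); }
};

// Raw storage that is large enough and suitably aligned for every state.
struct StateStorage {
    alignas(std::max_align_t) unsigned char bytes[64];
};
static_assert(sizeof(Idle) <= sizeof(StateStorage), "Idle does not fit");
static_assert(sizeof(WaitAck) <= sizeof(StateStorage), "WaitAck does not fit");

State* State::changeState(StateId next) {
    const int saved = eventsSeen_ + 1;        // save essential information first
    void* where = dynamic_cast<void*>(this);  // address of the complete object
    this->~State();                           // virtual call: ends the old object's lifetime
    switch (next) {                           // nothing below touches *this
    case StateId::Idle:    return new (where) Idle(saved);
    case StateId::WaitAck: return new (where) WaitAck(saved);
    }
    return nullptr;  // not reached for valid StateId values
}

std::string runInPlaceDemo() {
    StateStorage storage;
    State* p = new (storage.bytes) Idle(0);
    for (int i = 0; i < 3; ++i) {
        p = p->event();  // always continue with the returned pointer
    }
    std::string result = p->name() + ":" + std::to_string(p->eventsSeen());
    p->~State();  // objects created with placement new are destroyed explicitly
    return result;
}
```

Three events starting from Idle leave the machine in WaitAck with a counter of 3, so runInPlaceDemo() returns "WaitAck:3". The sketch places the object in a raw buffer rather than in a variable declared with one of the state types, since the latter would still be destroyed as its declared type.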

Whether this is a good idea is another question. Compilers can
often understand things that a human reader can't. I've often
found it useful to separate the functions and the data in such
cases, maintaining permanent instances of the polymorphic state
handling objects, and a separate instance of the data shared by
them. This avoids the switch, and allows keeping the instances
of the state in a table.
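One way this separation might look, as a rough sketch assuming C++11 or later. The state names follow the original question; the events (send, ack, data), the MachineData fields and all class names are invented for illustration. The handlers are permanent, shared instances kept in a table, and the only thing that changes on a transition is an index.

```cpp
// All names below are hypothetical; the events are invented for illustration.
enum StateId { kIdle, kWaitAck, kWaitData, kStateCount };

// The data shared by all states lives in one separate object.
struct MachineData {
    int acksReceived = 0;
    int blocksReceived = 0;
};

// A handler holds no per-machine data. By default an event is ignored and
// the machine stays in the handler's own state.
class StateHandler {
public:
    virtual ~StateHandler() {}
    virtual StateId onSend(MachineData&) const { return id_; }
    virtual StateId onAck(MachineData&) const { return id_; }
    virtual StateId onData(MachineData&) const { return id_; }

protected:
    explicit StateHandler(StateId id) : id_(id) {}

private:
    StateId id_;
};

class IdleHandler : public StateHandler {
public:
    IdleHandler() : StateHandler(kIdle) {}
    StateId onSend(MachineData&) const override { return kWaitAck; }
};

class WaitAckHandler : public StateHandler {
public:
    WaitAckHandler() : StateHandler(kWaitAck) {}
    StateId onAck(MachineData& d) const override {
        ++d.acksReceived;
        return kWaitData;
    }
};

class WaitDataHandler : public StateHandler {
public:
    WaitDataHandler() : StateHandler(kWaitData) {}
    StateId onData(MachineData& d) const override {
        ++d.blocksReceived;
        return kIdle;
    }
};

// The permanent handler instances, kept in a table indexed by StateId.
const StateHandler& handlerFor(StateId id) {
    static const IdleHandler idle;
    static const WaitAckHandler waitAck;
    static const WaitDataHandler waitData;
    static const StateHandler* const table[kStateCount] = {
        &idle, &waitAck, &waitData
    };
    return *table[id];
}

// The machine owns the shared data and the index of the current state.
class Machine {
public:
    void send() { state_ = handlerFor(state_).onSend(data_); }
    void ack()  { state_ = handlerFor(state_).onAck(data_); }
    void data() { state_ = handlerFor(state_).onData(data_); }
    StateId state() const { return state_; }
    const MachineData& shared() const { return data_; }

private:
    StateId state_ = kIdle;
    MachineData data_;
};
```

With this layout no object ever changes type, so none of the lifetime questions above arise, and a transition costs one table lookup.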

Why do you want to do this, rather than simply use dynamic
allocation? If it is purely for performance reasons, I suspect
that the allocation won't have a measurable impact. If you are
worried about fragmentation, a separate pool allocator could
take care of that---correctly designed, it could also handle any
possible performance issues as well.

Is there anything that looks to be problematic with this? Certainly the
derived classes cannot be allowed to increase the memory footprint of the
class, and I should be able to check that statically at compile time in the
new() operator. Anything else that would be recommended?

Alignment and size are the major considerations. Other than
that, you aren't allowed to execute any code where the this
pointer might point to memory of the wrong type, nor where the
type of the object pointed to by this changes.
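The size and alignment conditions can be checked at compile time. The following is only a sketch, assuming C++11 (static_assert, alignof) and hypothetical class names; fitsInPlaceOf is an invented helper, not anything from the original code.

```cpp
#include <type_traits>

// Hypothetical classes standing in for the real state hierarchy.
struct Base {
    virtual ~Base() {}
    int shared = 0;
};
struct Idle : Base {};
struct WaitAck : Base {};
struct TooBig : Base {
    double extra[4];  // grows the footprint, so it must be rejected
};

// True when a Derived can be constructed in storage that was sized and
// aligned for a Base object.
template <typename Derived>
constexpr bool fitsInPlaceOf() {
    return std::is_base_of<Base, Derived>::value
        && sizeof(Derived) <= sizeof(Base)
        && alignof(Derived) <= alignof(Base);
}

static_assert(fitsInPlaceOf<Idle>(), "Idle must not grow the object");
static_assert(fitsInPlaceOf<WaitAck>(), "WaitAck must not grow the object");
static_assert(!fitsInPlaceOf<TooBig>(), "TooBig is expected to be rejected");
```

A check like this only covers size and alignment; it says nothing about the lifetime rules, which still have to be respected by how the code is structured.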

Although the standard seems to say that it is legal (albeit not
too clearly), I'd also be sceptical of using a pointer to the
object after calling the polymorphing function, without an
intervening assignment. Given code like the above, I'd very
definitely write:

    MT::Base* p = new MT::InitialState ;

    while ( waitForEvent ) {
        p = p->event( ... ) ;
    }

even if I knew that the value returned by p->event was the same
as the original value of p. (In fact, I would definitely write
the code this way, with event returning a new dynamically
allocated object, and only change to using a memory pool or some
other strategy if the profiler said it was definitely necessary.)
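A sketch of the dynamically allocated variant, assuming C++11 or later. The Event values, the transition rules and the runEvents helper are invented for illustration, and std::unique_ptr is used here in place of the raw pointer above only so that the previous state is deleted automatically on each assignment.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical events and states, for illustration only.
enum class Event { Send, Ack, Data };

class State {
public:
    virtual ~State() {}
    virtual std::string name() const = 0;
    // Always returns a newly allocated state, even when the state is unchanged.
    virtual std::unique_ptr<State> event(Event e) = 0;
};

class Idle : public State {
public:
    std::string name() const override { return "Idle"; }
    std::unique_ptr<State> event(Event e) override;
};

class WaitAck : public State {
public:
    std::string name() const override { return "WaitAck"; }
    std::unique_ptr<State> event(Event e) override;
};

class WaitData : public State {
public:
    std::string name() const override { return "WaitData"; }
    std::unique_ptr<State> event(Event e) override;
};

std::unique_ptr<State> Idle::event(Event e) {
    if (e == Event::Send) return std::unique_ptr<State>(new WaitAck);
    return std::unique_ptr<State>(new Idle);
}

std::unique_ptr<State> WaitAck::event(Event e) {
    if (e == Event::Ack) return std::unique_ptr<State>(new WaitData);
    return std::unique_ptr<State>(new WaitAck);
}

std::unique_ptr<State> WaitData::event(Event e) {
    if (e == Event::Data) return std::unique_ptr<State>(new Idle);
    return std::unique_ptr<State>(new WaitData);
}

// Feeds the events to the machine and reports the name of the final state.
std::string runEvents(const std::vector<Event>& events) {
    std::unique_ptr<State> p(new Idle);
    for (Event e : events) {
        // event() has returned before the assignment deletes the old state.
        p = p->event(e);
    }
    return p->name();
}
```

For example, runEvents({Event::Send, Event::Ack}) ends in WaitData. Because the loop always continues with the value returned by event, it would keep working unchanged if the allocation strategy were later replaced.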

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
                   Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34