Re: Call virtual function in constructor

From: Pavel <dot_com_yahoo@paultolk_reverse.yourself>
Newsgroups: comp.lang.c++
Date: Sun, 17 Feb 2008 01:53:08 GMT
Message-ID: <8sMtj.613578$kj1.480617@bgtnsc04-news.ops.worldnet.att.net>

Alf P. Steinbach wrote:

* Pavel:

It is sometimes useful to initialize the different classes of a hierarchy
with derived-class-dependent code and then return to the base class
constructor to execute some of its code, to avoid duplicating that
common code.

For example (this code will not work in C++, but the analogous code
will work in other programming languages, e.g. Java, and I do not see
any fundamental design flaws in it):

#include <map>
#include <string>
typedef std::map<std::string, std::string> ConnectionParameters;
class FooConnection {
protected:
    virtual void init(const ConnectionParameters &pars) = 0;
public:
    FooConnection(const ConnectionParameters &pars) {
        init(pars);
        validateConnection();
    }
private:
    void validateConnection()
        throw(FooConnectionException /*defined elsewhere*/)
    {
        /* perform some uniform validation here, for example
            some select from "FOO_MAIN_TABLE" */
    }
};
class OracleFooConnection : public FooConnection {
protected:
    void init(const ConnectionParameters &pars) {
        // .. do Oracle-specific initialization
    }
};
class MySqlFooConnection : public FooConnection {
protected:
    void init(const ConnectionParameters & pars) {
        // .. do MySql-specific initialization
    }
};

"not ... any fundamental design flaws": heh, it is reportedly the most
common source of Java bugs.

The problem is that at the time the derived class' function
implementation is called, the derived class object has not yet been
initialized. Thus, member functions called from that function, or even
that function's own implementation, may very easily execute code that
relies on assumptions that have not yet been established. Apart from
run-time checking of array downcasts, which is also a strong contender,
I think that this is the most ugly type system breach in Java.

Thanks Alf!

I totally agree the objects should not be operated on until completely
initialized.

However, in this particular case we are talking about the initialization
itself, and about how conveniently the initialization can be performed
and factored into the most relevant pieces of code, rather than about
regular operations. It can hardly be disputed that initialization, by
its nature, has to operate on uninitialized objects.

I also agree that every powerful feature of any language can be abused.
Well, both C++ and Java provide a rich variety of ways to abuse them
(arguably, C++ provides more) and one more is not going to change the
weather.

This way, you can perform all validation in the constructor, that is,
according to the best practices, and without duplicating the common
validation code in the derived class.

The above code is (unfortunately) common practice in Java, but it's
certainly not best practice.

Well, I just said that keeping all validation in the constructor agrees
with best practice because the client code then "never" has access to
an invalid object.

It's an example of the exact opposite.

It is an anti-pattern.

I did not call the code above a pattern, but "anti-pattern" seems a
little "out of whack" to me :-). Why don't we try to refrain from tagging
or rubber-stamping each other's examples?

It is a C++-specific feature that the implementation is not required
to construct the memory layout for the whole object of the most
derived class before calling the first constructor of a base class
(Java does it differently).
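
To make the C++ behaviour concrete, here is a minimal sketch. The names
Base, Derived, seenInCtor and demoVirtualCallInConstructor are made up
for illustration and are not part of the FooConnection example. While
the Base constructor runs, the object's dynamic type is still Base, so
the virtual call resolves to Base::name() and never reaches the override:

```cpp
#include <cassert>
#include <string>

// Hypothetical types, for illustration only.
struct Base {
    std::string seenInCtor;  // records which name() the constructor reached

    Base() { seenInCtor = name(); }  // virtual call during construction
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Self-check: construction saw Base::name(); later calls see the override.
inline void demoVirtualCallInConstructor() {
    Derived d;
    assert(d.seenInCtor == "Base");
    assert(d.name() == "Derived");
}
```

Had name() been pure virtual in Base, as init() is in FooConnection, the
call from the constructor would have undefined behaviour instead of
quietly picking the base version.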

Of course, there are ways in C++ to cure this limitation, for example
by composition, where you can create a parallel Impl hierarchy which
does not validate, use member access control and "friends" to make its
object accessible only from within the primary hierarchy classes, move
database-specific init() into the parallel hierarchy and leave the
validateConnection() in the main hierarchy's base class. Sometimes
this added design complexity will not be much of a burden, sometimes
it will be. Personally I would prefer to have a choice not to use it.
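
A rough sketch of that composition idea, under several assumptions: the
names FooConnectionImpl, OracleImpl, isOpen() and isValid() are invented
here, std::runtime_error stands in for FooConnectionException, and the
"Oracle" setup is a placeholder that merely checks for a "user" key. The
friend-based access control mentioned above is left out to keep the
sketch short. The point is only that the Impl object is fully
constructed before the base constructor calls its virtual init():

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

using ConnectionParameters = std::map<std::string, std::string>;

// Parallel hierarchy: database-specific initialization only, no validation.
class FooConnectionImpl {
public:
    virtual ~FooConnectionImpl() = default;
    virtual void init(const ConnectionParameters& pars) = 0;
    virtual bool isOpen() const = 0;
};

class OracleImpl : public FooConnectionImpl {
    bool open_ = false;
public:
    void init(const ConnectionParameters& pars) override {
        // Placeholder for real Oracle setup: "open" only if a user is given.
        open_ = pars.count("user") != 0;
    }
    bool isOpen() const override { return open_; }
};

// Primary hierarchy: owns the Impl and keeps the common validation.
class FooConnection {
public:
    virtual ~FooConnection() = default;
    bool isValid() const { return impl_->isOpen(); }
protected:
    FooConnection(std::unique_ptr<FooConnectionImpl> impl,
                  const ConnectionParameters& pars)
        : impl_(std::move(impl)) {
        assert(impl_ && "derived class must supply an Impl");
        impl_->init(pars);     // safe: *impl_ is a complete object
        validateConnection();  // common code, written once
    }
private:
    void validateConnection() const {
        if (!impl_->isOpen())
            throw std::runtime_error("FooConnection: validation failed");
    }
    std::unique_ptr<FooConnectionImpl> impl_;
};

class OracleFooConnection : public FooConnection {
public:
    explicit OracleFooConnection(const ConnectionParameters& pars)
        : FooConnection(std::make_unique<OracleImpl>(), pars) {}
};
```

With this layout the client can still write
OracleFooConnection fooConnection(params); and nothing more, at the cost
of one extra class per database.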

For ways to achieve dynamic binding during initialization (DBDI) in C++,
which are also saner approaches in Java, see FAQ item 23.6.

I think of that as "my" FAQ item since I convinced Marshall to include
it, but the text and exposition are of course Marshall's.

Unfortunately this happened much later than the treatment of clone
functions, so we're stuck with the term "virtual construction", at least
in the FAQ, referring to cloning, and the acronym "DBDI" (Marshall's
invention) for the techniques discussed in 23.6.

Well, I have read that FAQ entry carefully and I cannot agree that it
suggests a clearly better solution to the original problem for which I
suggested that virtual calls in constructors would be useful. I stated
the problem in words at the beginning of my previous post, but for more
clarity let me illustrate it with a little client code snippet here.
Imagine that

#include <foo.h>

int main(int argc, char *argv[]) {
    ConnectionParameters params(argv[1]);
/* this time, please assume that ConnectionParameters can be built from
a string argv[1] or a text file with this name -- seems reasonable */
    OracleFooConnection fooConnection(params);
    fooConnection.doSomethingUseful();
    return 0;
}

is all our client is willing to write in his/her code. Fair and square.
His/her business requirements are this simple, and if a programming
language does not allow a similar level of simplicity in the solution, it
will not be chosen (and neither will the Foo library we happened to
write in that language; needless to say, we will not have the luxury of
programming in that language much longer, no matter how much we love
it).

With these requirements in mind, the FAQ's first approach (the first
"joe_user" function) does not work. It would also be less safe than my
code above: my init() function is protected, so it takes some creativity
from the client to call it accidentally (which is the only way to
"access a non-initialized object", even though there is nothing wrong
with calling init() on an uninitialized object), whereas the FAQ's
init() is public and open to any misuse (being called at the wrong time
or, which is worse in our case, being forgotten altogether).

The second variation of the first FAQ approach could be a little better,
but it leaves out too many details (most notably, how to initialize the
factory, why the Base class would have all the necessary parameters for
this without cooperation from the client code, and how bulky such
cooperation could be). In fact, it does not give any example of the
client code ("joe_func()" or whatever). It also assumes that the Base
class knows how to process *all* the information needed to create an
object of *any* derived class (in terms of the parameter types), and
this may easily lead to bad hacks in the future when a new type of
FooConnection comes along. Say, an LdapOracleFooConnection that takes a
handle to an LDAP connection as a parameter instead of
ConnectionParameters. The LdapOracleFooConnection constructor could then
build the ConnectionParameters object required by FooConnection's
constructor, but with the FAQ's first approach, second variation, this
cannot be done because the parameters of the Base constructor cannot
hold the connection handle.
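
For what it is worth, one conceivable shape for a factory that does not
require Base to know the derived classes' parameter types is a variadic
function template. The sketch below is not the FAQ's code: create(),
backend_, backend() and the single-string OracleFooConnection
constructor are all assumptions made for illustration, and
std::runtime_error again stands in for FooConnectionException. The
factory forwards whatever arguments the derived constructor takes and
calls the virtual init() only after the most derived object exists:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

class FooConnection {
public:
    virtual ~FooConnection() = default;

    // Hypothetical factory: build the most derived object first, then run
    // the virtual init() and the common validation on the finished object.
    template <class Derived, class... Args>
    static std::unique_ptr<Derived> create(Args&&... args) {
        // Plain new, because std::make_unique cannot reach a private ctor.
        std::unique_ptr<Derived> conn(new Derived(std::forward<Args>(args)...));
        FooConnection& base = *conn;
        base.init();                // dispatches to Derived::init()
        base.validateConnection();  // common code, written once
        return conn;
    }

    const std::string& backend() const { return backend_; }

protected:
    FooConnection() = default;
    virtual void init() = 0;
    std::string backend_;  // set by the derived init()

private:
    void validateConnection() const {
        if (backend_.empty())
            throw std::runtime_error("FooConnection: validation failed");
    }
};

class OracleFooConnection : public FooConnection {
    friend class FooConnection;  // lets create() call the private constructor

    explicit OracleFooConnection(std::string user) : user_(std::move(user)) {}

    void init() override {
        assert(!user_.empty() && "sketch-level precondition");
        backend_ = "oracle:" + user_;
    }

    std::string user_;
};
```

The client line becomes
auto conn = FooConnection::create<OracleFooConnection>("scott");
which is still short, but it hands out a heap-allocated object through
std::unique_ptr rather than a plain local variable, and every derived
class has to remember the friend declaration. Whether that trade is
acceptable is exactly the kind of judgment call discussed here.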

The second approach (a second class hierarchy) is essentially similar to
what I suggested in my post as a "cure" for the limitation, with the
difference that I would hide the helper objects inside the primary
objects -- maybe a matter of taste, but it seems more straightforward to
me than the "magic" constructor parameters. As I said, in some
situations the added complexity of the implementation can be justified.
Way too often, however, the inability to *easily* structure the
initialization code in the most rational way will lead to pretty ugly
design compromises. Some developers, even quite experienced ones, who
would come to terms with a single class hierarchy, will under time
pressure create a do-it-all FooConnection class that remembers whether
it is Oracle or MySql (with all the pleasantries of the corresponding
state management) instead of managing two parallel hierarchies. In a
different situation, when the parallel hierarchy approach is implemented
correctly, a new-to-the-project developer who is tasked with adding a
new BarFooConnection will still have a nice time working out what
exactly s/he should do and why a single responsibility (construction,
in our case) must be scattered across 2 classes.

I do not even want to discuss the second approach, stateless case,
because it is not applicable to my problem (which requires the
connection state), and even if it were, I would feel uneasy leaving this
mini-framework to my fellow developer to maintain and re-use. Function
pointers in constructor parameters would make him/her suspect I was
doing something terribly smart, solving some rocket-science problem of a
very dynamic nature -- which I was not, I was just trying to factor my
initialization code reasonably.

Again, I am not trying to say that parallel hierarchies are *always
unconditionally worse* than virtual calls in constructors would be. My
humble point is that it is at least debatable whether the benefits of
"disabling" (in quotes because of the possibility of casting) one
relatively rare language abuse (yes, even in Java it does not happen
often; it is just that the Java crowd is on average younger and likes
to make a big noise about every wheel they re-invent and every old snag
they hit) outweigh the drawbacks of the alternative solutions suggested
by your FAQ entry or by me.

I allow that there could be better alternative solutions (than both
those in your FAQ entry and the one I would use), but the very fact that
we do not know about them immediately is a good indication that the
language feature we discuss may have an issue. That said, I would
certainly like to hear about such a better solution.

Cheers, & hth.,

- Alf

Regards,
-Pavel