Re: Problem with inheritance and arbitrary "features" support (via templates).

From:

KRao78 <krunal.rao78@googlemail.com>

Newsgroups:

comp.lang.c++

Date:

Fri, 13 Nov 2009 03:38:10 -0800 (PST)

Message-ID:

<669957e1-5c1e-47f1-bf20-7f29206681ee@b2g2000yqi.googlegroups.com>

On 12 Nov, 17:08, "Alf P. Steinbach" <al...@start.no> wrote:

What's the point of hiding B::f in class D1 (which is still abstract)?

Hm, let's assume that was a typo.

But please, when posting code, copy and paste *working* code.

Yes, sorry, I apologise for the mistake (a const was missing).
The other code I posted was tested (the compiler did not complain and the
executable produced the expected result).

Example2:

class F {
public:
   virtual double f( double x ) = 0;
};

class G {
public:
   virtual double g( double x ) = 0;
};

class H {
public:
   virtual double h( double x ) = 0;
};

class N {};

template<class T1, class T2=N, class T3=N>
class Feature : public T1 , public T2 , public T3
{
};

Again, please copy and paste *working* code.

It works with the compiler I am using at the moment (Visual Studio
Professional 2008).
Moreover (at least to my understanding), it never inherits from the
same class more than once, as the specializations provided below are
used when only one or two template arguments are specified by the
user. When 3 template arguments are specified there is clearly no issue
(as long as the user does not use the same type twice, and this is
intended, as he should not do it).

template<class T1, class T2>
class Feature<T1,T2,N> : public T1, public T2
{};

template<class T1>
class Feature<T1,N,N> : public T1
{};

//Supp for Supports/Implements
class SuppFandG : public Feature<F,G>
{
public:
   double f( double x ) { return 0.0; }
   double g( double x ) { return 1.0; }
};

class SuppFandH : public Feature<F,H>
{
public:
   double f( double x ) { return 0.0; }
   double h( double x ) { return 1.0; }
};

class SuppFandGandH : public Feature<F,G,H>
{
public:
   double f( double x ) { return 0.0; }
   double g( double x ) { return 1.0; }
   double h( double x ) { return 2.0; }
};

Here you're into combinatorial nightmare.

But you probably know that.

Perhaps that is the question.

Yes I know that, how to solve the problem is the question in fact.
Or how to achieve a similar result with another design.

Why not just remove those featureThisAndThat classes?
[...]
What do you want the Feature... classes *for*?

I think it is probably better to explain the specific numerical
problem at hand to clarify why I do need these classes.
This part of the library deals with the generation of samples (or
vectors of samples) distributed according to some statistical
distribution.
Now, the way it usually works is that we have a random number
generator which only generates doubles between 0 and 1 from which, using
different algorithms, we obtain samples from arbitrary distributions.

Say I have some code that requires the generation of Exponential and
Gaussian samples.
Then, for the user, it makes perfect sense to be able to work with
objects of type
Rvg<Exponential,Gaussian>* rvgPtr (Rvg stands for Random Variate
Generator) and use it like:

// mu and sigma are doubles
double sample = rvgPtr->gaussian( mu , sigma );
double sample2 = rvgPtr->exponential( mu );

Conceptually we are dealing with a specific random variate generator
(an aggregate of algorithms and one random number generator), so
separating the Gaussian and Exponential "features" is confusing.
Moreover, as most applications need to work with multiple
distributions, passing around a lot of pointers is seriously
inconvenient. The previous version of the library did so, and I am
rewriting it for a good reason.
In fact most of the libraries available just use the huge abstract
base class paradigm, which supports everything that may be needed,
just to avoid having to pass around all these pointers.
For instance the GSL (GNU Scientific Library), written in C, uses only 1
pointer to pass around the random number generator and then defines
functions that generate (according to fixed algorithms selected by
function name) the samples from different distributions. Example:

// "gaussian" names the distribution, "boxmuller" the algorithm used
double rng_gaussian_boxmuller( gsl_rng* r , double mu , double sigma );

I wanted to achieve the same usable syntax while allowing for more
generality (changing the algorithm transparently to the code that uses
it) but without having to template all my code to use loosely defined
interfaces.

With this approach I still have the problems
1) Feature<F,G> is logically equivalent (for what I want to achieve)
to Feature<G,F> but their types are different.
This can however be solved by some fancy metaprogramming using the MPL
boost library (always "sort" the types), so for simplicity let's
assume this is not a problem.

2) Problem of multiple bases, and I want to avoid virtual inheritance
via virtual bases (performance penalty).
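Both points can be checked at compile time. The following is only a minimal sketch, assuming a C++11 compiler (for static_assert and <type_traits>); it repeats the F, G, H, N and Feature definitions from Example2 and adds nothing else to the design.

```cpp
#include <type_traits>

// Feature interfaces and the Feature template, as in Example2.
class F { public: virtual double f(double x) = 0; };
class G { public: virtual double g(double x) = 0; };
class H { public: virtual double h(double x) = 0; };
class N {};

template<class T1, class T2 = N, class T3 = N>
class Feature : public T1, public T2, public T3 {};

template<class T1, class T2>
class Feature<T1, T2, N> : public T1, public T2 {};

template<class T1>
class Feature<T1, N, N> : public T1 {};

// Point 1: same set of features, different order, two unrelated types.
static_assert(!std::is_same<Feature<F, G>, Feature<G, F> >::value,
              "Feature<F,G> and Feature<G,F> are distinct types");

// Point 2: without extra bases, a three-feature object cannot be passed
// where a two-feature one is expected.
static_assert(!std::is_convertible<Feature<F, G, H>*, Feature<F, G>*>::value,
              "Feature<F,G,H>* does not convert to Feature<F,G>*");

// It does still convert to each individual feature interface.
static_assert(std::is_convertible<Feature<F, G, H>*, F*>::value,
              "Feature<F,G,H>* converts to F*");
```

Making the second assertion pass by deriving Feature<F,G,H> from Feature<F,G>, Feature<F,H> and Feature<G,H> is exactly what would pull in F, G and H more than once.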

Don't think about performance.

Let the compiler do that.

It's not that it's very much smarter, it may even do plain stupid things, but
whatever it does, by deciding to trust that whatever it does is good enough
*you* will be working smarter. <g>

I am perfectly aware of the principle of writing good/elegant
(correct :P) code first and optimizing it later (if needed, and only
after profiling, etc.).
In fact I have done this: I have profiled the code from a previous
version of the library, and I have found this to be a critical point
where most of the run-time performance can be lost.
The reason is that random number generation has become really fast (see
for instance CUDA on the latest Nvidia GPU cards, we are talking of
hundreds of millions of random numbers/sec) and you do not want these
calls to be the bottleneck of your simulations.
So, given the requirement of interface/implementation separation, I need
to select the most efficient solution.

This is probably solvable by using directives inside the Feature
specializations.

Still I am not 100% sure I can make this work and it will not scale
well for a large number of features.
In fact the number of elements composing the hierarchy is given by a
sum of binomial coefficients, 2^n - 1 for n features, so it grows
exponentially:
F -> F (1)
F,G -> FG, F, G (3)
F,G,H -> FGH, FG, GH, FH, F, G, H (7)

Yes.

I would like to know if there is a solution to the design problem
which involves the following conditions:

1) The code should have a runtime performance equivalent to Example 1.

Then use example 1. After correcting it, of course.

I already explained that manually passing multiple pointers is doable
but inconvenient.

2) I want to be able to specify easily some set of features and being
able to "pass in" any pointers to objects that have this (and usually
extra) functionality.

Huh?

I think we agree that, "logically" speaking, you can expect something
that is able to do A, B and C to be able to do A and B too, right?

3) I want code that depends on features f() and g() not to require re-
compiling whenever I consider a new feature h() somewhere else.

4) I do not want to template everything that wants to use such features
(almost all the code). There should be some kind of "separation", see
point 3.

Looking in the (numerical) libraries I have usually found the two
approaches:

1) Define a huge abstract base class B that has f(),g(),h(),......
Problems: whenever I want to add a new feature z(), B has to be
modified, everything needs to be re-compiled (even if this code does
not care about z() at all), all the existing implementations D1,
D2,... of B need to be modified (usually by having them throw an
exception for z(), apart from the new implementation that supports z()).
The solution of enlarging B progressively when I need to add features
is not a good one for the problem at hand, as the features f() and g()
are really "as important" as h() and i(), and neither is "more basic"
than the others.

2) Separate all the functionalities and use one pointer for each
functionality.
However, this is cumbersome for the user (in most situations 4 or more
pointers would have to be carried around), and for the problem at hand
this approach is not optimal (here really it is 1 object that may or
may not do something, in fact calling f() will modify the result
obtained by g() and vice-versa).
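To make the cost of approach 1 concrete, here is a minimal sketch of the "huge base class" pattern. B, D1 and z() are the names used in the text; the helper zIsUnsupported and the choice of std::logic_error are assumptions made only for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Approach 1: one large interface listing every feature.
class B {
public:
    virtual ~B() {}
    virtual double f(double x) = 0;
    virtual double g(double x) = 0;
    virtual double z(double x) = 0;  // added later: every implementation must change
};

// An older implementation that has no meaningful z().
class D1 : public B {
public:
    double f(double) { return 0.0; }
    double g(double) { return 1.0; }
    double z(double) { throw std::logic_error("D1 does not support z()"); }
};

// Hypothetical helper: returns true if calling z() on b throws,
// i.e. the feature is missing.
bool zIsUnsupported(B& b) {
    try { b.z(0.0); }
    catch (const std::logic_error&) { return true; }
    return false;
}
```

The point of the sketch is that a missing feature is only discovered at run time, when the call throws, whereas with separate interfaces the compiler rejects the call.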

Thank you in advance for your help.

You're into bad design and should primarily look at that.

That is why I was asking for suggestions.
And I am trying to keep everything as simple as possible (for the
user).

I still think that having to deal with
Rvg<Poisson,Gaussian,Exponential>* is more intuitive (and convenient
and closer to what is really happening) than three separate pointers.

Given the further insight do you have a design suggestion?

At the moment I am considering working with a container storing one
pointer per feature, but making this invisible to the user.
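One way the container-of-pointers idea might look is sketched below. This is only an illustration under stated assumptions, not a finished design: the names FandG, useFandG and demo are hypothetical, the handle is written by hand for two features rather than generated by a template, and null pointers are not handled.

```cpp
#include <cassert>

class F { public: virtual double f(double x) = 0; };
class G { public: virtual double g(double x) = 0; };
class H { public: virtual double h(double x) = 0; };

// Hypothetical handle: stores one pointer per required feature, but the
// caller only ever passes a single object.
class FandG {
public:
    // Accepts a pointer to any type that derives from both F and G;
    // the conversions to F* and G* are checked at compile time.
    template<class T>
    FandG(T* obj) : f_(obj), g_(obj) {}

    double f(double x) const { return f_->f(x); }
    double g(double x) const { return g_->g(x); }

private:
    F* f_;
    G* g_;
};

// An implementation that happens to support an extra feature h().
class SuppFandGandH : public F, public G, public H {
public:
    double f(double) { return 0.0; }
    double g(double) { return 1.0; }
    double h(double) { return 2.0; }
};

// Non-template client code: it depends only on F and G, so it is not
// affected when a new feature is introduced elsewhere.
double useFandG(FandG rvg) { return rvg.f(0.5) + rvg.g(0.5); }

double demo() {
    SuppFandGandH impl;
    return useFandG(&impl);  // the implicit conversion builds the handle
}
```

Each call still costs one virtual dispatch, as with separate pointers, and no virtual bases are involved; the handle itself is two pointers wide, so whether this meets the performance requirement would have to be profiled.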

Cheers
KRao