Re: Multiple implementations of the same algorithm

From:

luca <luca.frammolini@gmail.com>

Newsgroups:

comp.lang.c++.moderated

Date:

Thu, 11 Aug 2011 16:01:23 CST

Message-ID:

<7a7c9ed6-9e40-4149-9033-e5b514fa27e1@f8g2000yqb.googlegroups.com>

Hi Daniel,

thank you for your reply!

On Aug 11, 11:03 am, Daniel Kr?gler <daniel.krueg...@googlemail.com>
wrote:

On 2011-08-10 22:15, luca wrote:

correct my analysis if i am wrong, please. I am writing a software in
the computer vision field. Now, i have several algorithms that can be
regarded as low-level and could be viewed as a single routine.

This description confuses me a bit. Do you mean that there exist some
functions alg_1(), alg_2(), ... that solve parts of an algorithmic
problem and that there is actually only one public function alg() that
is implemented in terms of above alg_i() similar to the template pattern
of the classic pattern book?

Yes. The "function" alg() is modeled as a class and will have more
complex behaviour of
a low-level algorithm. I have several of this classes, each one uses
one or more of those low-level
algorithms.

Is it a
wrong attitude to write a simple class with a single method that do
the job? Should i use a static method of a namespace or a class
containing static low-level routines? Consider that sometimes it could
be useful to create an object, because memory allocation could be done
only at construction (or better, initialization) time (and i could
reuse the object untile its destruction). The burden of handling such
memory would be demanded to the object itself, instead of having to
manage memory to pass to a static method.

I find it hard to provide a universal answer for this very general
question. "It depends" will fit, but does not help you much ;-) Usually
I tend to define a set of state-free functions and to minimize the
amount of necessary state. If still state is required it has to be
decided whether it can remain locally within a function or requires a
separate object. Often static member functions of an algorithmic class
type can help, if the class type needs to be defined in public headers.
If I can define my function in a completely hidden translation unit,
free functions with static linkage or within unnamed namespaces often
suffice to prevent accidental usage from "the outside".

This is something that always bother me, coming from C. I mean,
suppose
that you have a function that:

allocate X bytes of memory
process input parameters, using X like a buffer
de-allocate the X bytes and returns the output

Is this a state-free function? If this function is called often (and i
am talking
of real-time applications), and X is always the same (but decided at
run-time), i see two ways of avoiding the memory
allocation at every call:

1. the caller has to provide the memory buffer
2. i create a little class that allocates its own memory at
construction time and de-allocate it at destruction time.

Which method is better (more often used by professionals)?

Another question i have is the following: suppose i have three
different implementation of the same algorithm. One is written in
plain C++, one is written in OpenCL (for parallel processing, would
result in different C++ code, with special function to be called,
kernel programs that run on the device to be compiled and so on), one
is written in playin vertex/program shaders for GPU (call it GPU
implementation, it would use OpenGL for example). The algorithm do the
same job (for example, processing an image and returnings the a list
of coordinates of special pixels), but the method to call it and the
way it returns data can be very different.

You need to define a generally valid calling signature that can be used
for all implementations, otherwise you would break user-code when
switching between different implementations.

That is the hard part! I mean, i know how to implement the low-level
algorithms in C++, but i still don't know much of OpenCL (the other
way).
I have looked at the relevant documentation and basically i am ready
to find a way to
design small and little interfaces for switching between C++/OpenCL
without
too much problems (i think). In this way, if a platform support OpenCL
and the library
is compiled with OpenCL macro on, the switch can be made at run-time.
Otherwise, one should be able to compile the software without OpenCL
(just plain C++).

Consider that i want the client of the software to be able to:

1. use the high-level classes
2. use the low-level algorithms directly

Maybe i should drop 2., since the primary purpose is 1., but for
flexibility reasons (and number of features provided ;)
2. would be great.
Otherwise, i could just implement the low-level algorthms without
using a unifyed calling method. I could use the template pattern
to implement the high-level classes, handling variations in this way.
No interfaces for input/output parameters.

All this mess because of OpenCL, that is not still implemented for
some systems! Argh!!

The problem is the
following:

1.
I will drop C++ implementation of the low-level algorithms in future,
but for now, OpenCL is not so widespread, and i am forced to implement
things in different ways.

2.
C++ plain implementation results in code to be executed on the CPU.
Memory is passed/returned as usual (through pointers).

3.
OpenCL/GPU implementation: although they are different under every
aspect, share a common thing: memory must be transferred from main
system
to device memory for input parameters and the contrary for output
parameters.

So, 2 and 3 call for some sort of interface for input/output
parameters. Even the format of this parameters is important. For
example, a graph implemented using pointers would be not suited to be
passed to the device memory. But this aspect would be decided in
advance (whatever is the implementation of the low-level algorithm,
the format of the input parameters must be such that no pointers are
valid and so on...), so no problem.

Try to define some lightweight wrapper types that can hide these
differences from the pov of the public caller. The actual members of
these wrapper types would typically be references to the
implementation-specific data types.

But, one thing to keep in mind and which is important is the
following: low-level algorithms usually are called one after another,
that is, one run A1, followed by A2, followed by A3 and so on. Now, it
can happen that A_{i+1} uses the results provided by A_i and it would
be a waste of time to transfer memory from device to main system at
the end of A_i: just tell A_{i+1} that the parameter needed is already
on the device memory and has handle XXX.

Ok, it is clear that some form of interface for input/output
parameters is needed. This object would have a method Create( void
*memory, etc...) and would handle all of this.

Maybe.

So, i have an interface for memory object to be used as input/output
parameters for low-level algorithsm. Ok, but how would you create this
algorithms?
I mean, mixing CPU with GPU implementation would be a stupid thing to
do. OpenCL uses the concept of Context. A natural way would be to
create an interface IContext and having diffrerent context for
different implementation (CPU, OPENCL, OPENGL for example). Then i
would implement a factory for low-level algorithms. This factory
would create IAlgorithm objects (another interface, but could be a
void *) given:

1. the name of the algorithm (for example: ALGO_EDGEDETECTOR_SOBEL)
2. the actual context

Finally, who construct the context? The client of the software will
use a static method, for example:

IContext *ctx = ContextFactory::CreateContext( CTX_OPENCL );

Yes, the factory pattern is often used in this kind of use-case.

So, to sum up:

. a context concept, IContext, created by a context factory
. a way of handling input/output for low-level algorithm
(IMemoryObject could be the basic interafce)
. a factory for classes implementing low-level algorithms

Is this general desing a mess? What do you think? How would you handle
this situation?

I'm not sure about your concept of a IMemoryObject (isn't this an
allocator?), but generally the approach makes sense. It may also make
sense to use the bridge pattern to encapsulate the factory/multiple
implementation details, because I understood your description that
replacing the algorithm implementation is in your hands and should not
be a public customization point.

Could you explain in much detail where would you use the Brigde
pattern?
Luca

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]