Re: Lambda allocations
mark.st.lee@gmail.com (rgba32) wrote (abridged):
Lambdas are a new-ish topic for me.
Me too. I was slightly involved very early on, but lost track for a few
years. Only recently have I started using a compiler that supported them.
As I dive in an learn more about them, it occurred to me that
there may be some "hidden" memory allocations happening to account
for the cases where a closure and it's state need to outlive the
scope it was created in.
My current understanding is that that isn't lambdas, that's
tr1::function<>. They are often used together.
A lambda expression is a convenient way of writing a class. The type of
the expression is the type of the class, and only known to the compiler.
The class doesn't inherit from anything and doesn't offer any (run-time)
polymorphism. This means that at low level, lambdas are used with
templates or auto in ways that capture the full type. For example:
auto x = []( int z ) { return z+1; }
auto y = []( int z ) { return z+1; }
here x and y are different objects with different types. And because of
this, they don't need dynamic allocation. The size of a lambda is always
known at compile time and so can be allocated on the stack. If it
captures variables by value or by reference, the requisite size is also
known at compile-time so it still doesn't need heap allocation.
If you want to use lambdas dynamically, then you should probably use
tr1::function<>. Consider:
template<Fn>
int sum_s( int *first, int *last, Fn fn ) {
int sum = 0;
for (int i = first; i != last; ++i)
sum += fn( *i );
return sum;
}
int sum_d( int *first, int *last, tr1::function<void(int*)> fn ) {
int sum = 0;
for (int i = first; i != last; ++i)
sum += fn( *i );
return sum;
}
Both of these can be invoked with a lambda as the third argument. The
first uses static polymorphism, meaning each instantiation will make a
new specialised version of the function. That version can easily be
optimised by the compiler, to the point where we might reasonably hope
that it produces the same machine code as a hand-written for-loop at the
point of call. No heap allocations.
The second uses dynamic polymorphism, so all calls share the same code.
And that code is general, not specialised for its argument, and knows
very little about the lambda you pass to it; in particular, it doesn't
know its size. Therefore, using tr1::function<> can imply dynamic
allocation. It is, conceptually, less efficient than a lambda on its own.
In practice many libraries avoid the heap allocation by including some
buffer space in the size of tr1::function<> that can be used instead of
the heap. I had thought VC10's library did this. Older libraries didn't.
It takes a certain amount of thought to implement tr1::function<> at all,
and even more thought to implement it efficiently. (The obvious
implementation is to wrap the lambda with a templated class that derives
from a base class with a virtual function. Then the caller of
tr1::function<>() only needs to know about the base class. Some libraries
try to do better.)
The upshot is that you can do a lot more with tr1::function<> than with
lambdas. For example, it doesn't make sense to have an array of lambdas,
but arrays of tr1::function<> are both possible and useful. If you need
to write a function that returns a lambda, you will find it easier if it
returns a tr1::function<> instead. (I'm not even sure if the former is
possible because of the difficulty of naming the lambda's type.)
On the other hand, because lambdas don't do any of that dynamic stuff,
they are conceptually more efficient. You just need to be sure that the
code which accepts the lambda is written statically with templates, like
sum_s, rather than using tr1::function<>, like sum_d.
-- Dave Harris, Nottingham, UK.
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]