Re: Values vs. references for deep, nested structures.
On Aug 4, 4:19 pm, AJG <plus....@gmail.com> wrote:
Hi there,
I have a framework of classes that build a tree of nodes in various
combinations. My node type is something like:
struct Node : std::vector<Node> {
// a) Yes, I know inheriting from standard containers isn't recommended.
// b) Yes, I suspect Node is incomplete when passed to vector, but it
// does work in all compilers I've tested.
// c) Yes, I've tried other containers such as deque.
// Please disregard these issues as (I think) they are beside the point.
// ...
};
So as you can see, nodes are held "by value" within the tree. This is
very clean, convenient, and elegant, since you can pass values around
without paying too much attention to memory management. In particular
you can do things like:
typedef std::pair<bool, Node> Result;
Result doSomething() {
Node a, b, c;
b.push_back(c);
a.push_back(b);
return std::make_pair(true, a);
}
The main issue with this approach is the massive amount of copying
that can go on when trees get very big and are being passed around as
values. Indeed, profiling showed that most time was being spent within
the guts of vector.
So I tried using pointers instead. Now, raw pointers would probably be
faster, but they are a pain to manage, so I went with shared_ptr at
least for now. So we would change node to:
struct Node : std::vector<boost::shared_ptr<Node> > {
// ...
static boost::shared_ptr<Node> create() {
return boost::shared_ptr<Node>(new Node);
}
private:
Node() {}
};
The example code is slightly less elegant now:
Result doSomething() {
boost::shared_ptr<Node> a(Node::create());
boost::shared_ptr<Node> b(Node::create());
boost::shared_ptr<Node> c(Node::create());
b->push_back(c);
a->push_back(b);
return std::make_pair(true, a);
}
Rudimentary testing showed that my code does indeed run faster, but
not as much as I expected (probably a 25% or so speedup.) So now I am
kind of undecided about which way to proceed, and I would like to hear
some other opinions.
Some things to consider:
1) Relying on shared_ptr does add a boost/tr1 dependency, which in
some cases is nontrivial.
2) I suspect the speedup is not as great as I imagined, especially in
release mode (optimized), because the compiler is smart enough in a
lot of cases to elide unnecessary copying.
3) How do C++0x's R-value references / move semantics play into all of
this?
4) Any other generic tips for minimizing memory usage and time spent
copying, when using containers?
Thanks :)
Hi AJG,
First of all, I don't think I'm the right person who can answer your
questions although I wish if I could. I would rather ask you one thing
that confuses me.
I know you already mentioned that inheritance or the using an
incompleted data type are not the main point in this post but I'm just
curious about this so please forgive me about this question.
struct Node : std::vector<Node> {
...
};
Doesn't the structure raise some sort of recursive issue during the
compile time? I means Node is "a vector of Node" so when the compiler
instantiates a concrete type for vector<Node>. it will start
evaluating about the type Node which hasn't been completed the
definition yet... I just naively guess that this might cause the Node
class would have unnecessary inheritances...
I've never seen that kind of attempt and I actually really like it =)
Node a, b, c;
b.push_back(c);
a.push_back(b);
The codes above are really beautiful!
Since you started using boost::shared_ptr...
b->push_back(c);
a->push_back(b);
After the two lines of codes you provided, have you tried to add "c-
push_back(a);" at the end?
It is extremely interesting codes =) but sorry that I couldn't answer
to your questions.
Regards,
Alex Kim
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]