Re: Linearity of the STL containers and iterators (was: Re: std::string bad design????)

From:

"Le Chaud Lapin" <jaibuduvin@gmail.com>

Newsgroups:

comp.lang.c++.moderated

Date:

31 Dec 2006 20:59:33 -0500

Message-ID:

<1167596888.309229.174400@h40g2000cwb.googlegroups.com>

Seungbeom Kim wrote:

Le Chaud Lapin wrote:

You should try the iterators-state-in-the-container model. Everyone I
have met who has experience the pleasure of not having to create an
iterator to go with a container has raved about it, from students to
colleagues. It's addictive, and especially attractive when you are
iterating over a container that is 3-levels deep inside nested
containers.

So, what is it exactly, and what benefit does it give?

It relieves the programmer from the tedium of having to define
iterators for deeply-nested containers. Here is an ad-hoc data
structure for which I would not have to (nor would want to) define
iterators for:

AVL::Associative_Set<Gate_Tree::Path,
Red_Black::Associative_Set<String, AVL_Associative_Set<Instant,
Estimator> > > foo;

Suppose you know that the innermost Associative_Set maps an Instant
that you have in hand, and you want the corresponding Estimator. All
you wold have to do is:

foo.locate(x).locate(y).locate(z)......RHE();

Containers don't have to be linear, even though the current standard
library offers linear containers only. The STL style doesn't preclude
more complex, non-linear containers such as the graphs provided by the
Boost Graph Library.

I guess I misuse the term linear. When I read Stroustrup's book, I got
the impression that there was an implicit assumption that most of the
containers has some very basic things in common, with a few variations,
and if there were a container that had little to nothing in common with
the other containers, then it was not a container at all, and therefore
should not be included, as the idea of being able to mix algorithms
with containers would break down. Then I started wondering exactly
what was meant by the word "container", and then, after having to make
my own containers for dynamic programming and graph theoretic
algorithms, I discovered that the computer science community has done
very well with sets and lists of various kinds, but left the more
would-be-standard esoteric containers for ad-hoc implementation.

However, I believe iterators have to be linear, because every processing
is inherently linear at some level (without considering concurrency, of
course). In the sense that even if you do a breadth-first search on a
graph, you cannot visit two successors of a node at the same time but
you should visit one first and the other later. That can certainly
modelled as linear.

And don't forget that one container can provide many different kinds

iterators; you can have iterators for preorder traversal and postorder
traversal on a tree, for example.

template <typename T>
struct tree
{
T value;
std::vector< tree<T> > children; // or any other container
};
// I think it is probably an error to instantiate std::vector with
// an incomplete type, even though Comeau Online doesn't complain,
// but it shows the idea.

It is not a a syntax error, but, IMO, a conceptual one. We had this
conversation about 2 months ago with another poster, who finally
admitted that it does not work. I would never write code like that on
principle, simply because a tree cannot contain itself (in an abstract
sense), and that usage, IMO, abuses that principle. Yes, it is
syntactically acceptable on almost all major compilers due to the way
vector is defined internally. The problem happens when you have a tree
with 100,000 nodes, and you want to know how many items are in the
tree. You quickly realize that, unless you want to recursively count
the items each time on each invocation of the member function
population(), you will need an auxiliary variable in your tree, called
population, perhaps of type unsigned int. Once you realize this, all
that curiously-recurring template pattern stuff breaks down unless you
allow a dummy population field in every node except the root.

All functionality that can be added to this and is useful in general is
probably the preorder/postorder traversal iterators. It could indeed
have been done, without showing any defect in the design of STL (which I
don't believe there to be), but it hasn't, for some reason I don't know.
Maybe just the lack of time and/or consensus for defining "the standard"
tree; one could argue that the inorder traversal is useful, but it is
only defined in binary trees, so we should have another template class
binary_tree, and so on. The point is that it's not because of any defect
in the design of STL, it is possible, and you can.

I have what others would call a tree<>. I call mine Monarchy
(single-root) and Polyarchy(multi-root). O(N) is what one would
expect, mostly O(logn). There is family of them, perhaps 12-15 classes
(don't have access to code now). The member function set is rich.

Sometimes I wonder if people even know that a "tree" can easily be had.
It is a very simple concept. Even a 5-year-old understands the notion
of a hierarchy. And any taxonomist would just assume that computer
sciences work with trees. But we do not have one in STL, even though we
use file systems every day.

But again, if I were to get in a room with a bunch of C++ programmers
and computer scientist, the first order of business would be to declare
that the word "tree" cannot be be used during the discussion to design
that thing which we would normally call a "tree", so as to eliminate an
confusion that could arise.

-Le Chaud Lapin

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]