Re: Andrei's "iterators must go" presentation

From:

Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org>

Newsgroups:

comp.lang.c++.moderated

Date:

Thu, 21 May 2009 13:52:18 CST

Message-ID:

<KK03AE.1K09@beaver.cs.washington.edu>

Peter Dimov wrote:

On May 18, 5:43 pm, brang...@cix.co.uk (Dave Harris) wrote:

pdi...@gmail.com (Peter Dimov) wrote (abridged):

The slides were interesting, but left me with a question. std::find
returns an iterator, let's call it 'middle', from which one can
construct both [begin, middle) and [middle, end). But the
range-based find returns the second range, and I see no way
(from what has been shown in the slides) to derive [begin, middle)
from it.

We'd need functions to implement an algebra of ranges.

This is, more or less, my point; that the necessary algebra was not
shown in the slides.

Concatenating ranges is easy. Taking the first N items of a range is also

easy; [...]

Not if you stick to the range concepts in the slides (you can, in
principle, derive the first N items of a bidirectional multipass range
but it doesn't feel quite right).

The way to derive the first N items of a range is by creating a new kind
of range (called Take in D's std.algorithm) that takes another range and
a number. The Take range lazily iterates its host range until the number
gets exhausted. There's a convenience function called take() that gives
you type deduction. So you can write:

foreach (e; take(40, uniform(0, 100))
{
... use e ...
}

to get 40 random integers numbers in [0, 100). Lazy iteration is key for
composing infinite ranges, because obviously random numbers will never
finish so any eager operation on them will go on forever.

There's one interesting detail about Take alluded to in my slides. If
the host range offers a size() operation (length in D lingo), the Take
range also offers a size() operation which is obviously min(n,
host.size()). If the host range is infinite (a primitive concept in D's
ranges), then again Take has a size() computed as n. Finally, if the
range is finite but doesn't know its length, Take also doesn't know its
length. I think this is one pretty cool example on how range categories
interact.

If you have operator- for ranges, so that you can derive [begin,
middle) as [begin, end) - [middle, end), then taking the first N items
is also easy: R - advance(R, N). (It still requires a multipass range
though.)

I gave this some more thought and figured what the right design is.
find() must return the right-hand range, any other design would be too
limiting. However sometimes the left-hand range is needed, and that is
best done with an until function. until lazily iterates its input range
until it finds the sought object. So then you can iterate through a socket:

foreach (char c; until(socket, '\n')) { ... }

That loop will read a line. If you just want to copy that line, you'd write:

copy(until(socket, '\n'), appender(&someString));

(This hasn't been implemented yet, don't look it up.)

A function that returns [begin, middle) when passed [begin, end) and
[middle, end) seems harder to write in a generic way.

One can think of ranges ending at 'end' as isomorphic to iterators
from which 'end' is reachable.

I think they're isomorphic to iterators from which 'end' is reachable
and distinguishable.

[begin, end) is isomorphic to 'begin',
[middle, end) is isomorphic to 'middle', and the iterator pair <begin,
middle> can be said to be isomorphic to the range pair < [begin, end),
[middle, end) >. Which can now be convinced to model the forward range
concept... but not the bidirectional range concept because [middle,
end) cannot grow into [begin, end).

Is that a problem?

Not really, I was just wondering how the algebraic closure looks like,
as the concepts in the slides do not, in my opinion, match iterators
in expressive power.

Any higher-level abstraction will by necessity disable some of the
functionality that can be achieved with lower-level abstractions. The
key is to gain enough from the better abstraction to justify the lost
expressive power. After extensive experience with both iterators and
ranges, I can say that indeed there is loss (e.g. "I want a compact
index storing iterators in collections without random access") but that
is overwhelmed by the huge gains in power, simplicity, and generality.
I've implemented the STL with iterators and then a large superset of it
with ranges, in a much shorter time and with enormous gains in
generality. In wake of that experience, to me there's no looking back at
iterators.

Andrei

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]