Re: Bidirectional iterators: signed distance?

From:

Mark P <usenet@fall2005REMOVE.fastmailCAPS.fm>

Newsgroups:

comp.lang.c++

Date:

Wed, 05 Sep 2007 19:23:56 GMT

Message-ID:

<gnDDi.1813$FO2.1654@newssvr14.news.prodigy.net>

Victor Bazarov wrote:

Mark P wrote:

Victor Bazarov wrote:

nottheartistinquestion@hotmail.com wrote:

As an intellectual exercise, I've implemented an STL-esque List<>
and List<>::Iterator. Now, I would like a signed distance between
two iterators which corresponds to their relative position in the
list. For instance, if I did something like distance(list.end(),
list.begin()), I would get -list.size(). The STL's iterator
distance function amounts to something like this:

distance_type distance(Iterator first, Iterator last)
{
   distance_type n = 0;
   while (first != last) {
     ++first; ++n;
   }
   return n;
}

So, if my list has ten elements, and i4 is an iterator at the 4th
node and i2 is an iterator at the 2nd node, then std::distance(i4,
i2) is 9. That's meaningless to me. What I want is distance(i4, i2) ==
-2. So I implemented distance as a method of List<> like so:

int distance(Iterator i, Iterator j)
{
   Iterator tmp = i;
   int d = 0;
   while (tmp != j) {
       if (tmp == end()) {
           // oops, try the other way
           tmp = i; d = 0;
           while (tmp != j) {
               --tmp; --d;
           }
           return d;
       }
       ++tmp; ++d;
   }
   return d;
}

But this is butt-ugly and I don't like the constant checking for
tmp==end(). Can anyone think of a nicer way to implement this
functionality? Has it been done somewhere else that I can have a
look at?

Find the distance from each of the iterators to the beginning and
then subtract. I.e.

distance(a, b) == distance(begin(), b) - distance(begin(), a)

V

Nothing wrong with this solution but of course the worst case
performance is O(n) where n is the length of the list. You can
achieve O(d), where d is the distance between the two iterators, if
you alternately advance the two iterators until one reaches the
original position of the other. Naturally this won't be as compact
as Victor's solution.

I am by no means the master of the Big-O notation, but it seems that
O(n) and O(d) are the same terms. The worst case scenario with my

> original suggestion is when 'a'=='end()' && 'b'=='end()'. Then the
> number of comparisons is 2*N. The worst case scenario with counting
> in both directions is when 'a'=='end()' && 'b'=='begin()', i.e. the
> distance is -N, in which case the number of comparisons is N, still.
> So, you get about 50% savings on iterator comparisons with going in
> two directions.
>

This is true, but this is simply a reflection of the fact that d can be
as large as n. The flip side of the coin is that d may be much less
than n, which is where my algorithm pays off.

Turning first to the subject of big-Oh, one can be creative with the
terms used within these expressions. I could have "done you a favor"
and declared your algorithm to be O( distance( begin, a) + distance(
begin, b)) which is arguably better than O(n). But while this is
mathematically true, it's probably not very interesting. (Though it
could be if one is working in a regime where one expects the iterators
to typically point to the beginning of the list.)

However it's fairly conventional to specify to complexity in terms of
input size and output size (or value), which is why I draw the
distinction between O(n) and O(d). For example, a good algorithm for
determining all intersection points between n segments in the plane has
complexity O(n lg n + k) where k is the number of intersection points.
Yet if one only references the input size, one would have to quote the
complexity as O(n^2) since, in the worst case, there can be n^2
intersections between n segments. But this obscures the fact that the
algorithm is in fact much more clever than simple brute force pairwise
testing of all segments, which is also O(n^2).

Still your question underscores an important point, namely that what
makes an algorithm good depends very much on the context in which it
will be applied (which in turn dictates how one should characterize the
complexity).

Returning to the linked lists, here's another way to look at it.
Observe that my algorithm "dominates" yours in the sense that, for all
inputs, my run time is always less than a constant multiple of your run
time. However your run time is not always less than *any* constant
multiple of my run time. This is not surprising since |d| <= n and d
may be smaller than n by an arbitrarily large factor.

-Mark