Re: standard vs. hand crafted loops

From: "Andrei Polushin" <polushin@gmail.com>
Newsgroups: comp.lang.c++.moderated
Date: 16 May 2006 06:07:50 -0400
Message-ID: <1147732249.210743.233690@j73g2000cwa.googlegroups.com>

Daniel T. wrote:

> (a) Is total line count reduced? (Here I refer basically to the number
> of semicolons.)
> (b) Can a temporary be removed from the mainline function by using the
> algorithm/virtual?
> (c) Is the cyclomatic complexity of the program reduced?
> (d) Is the intent of the code better communicated?
>
> The above are from least important to most important. The point is to
> use whichever does the best job, not categorically decide that X is
> better than Y in all cases. Of note, "fewer functions" is *not* one of
> the deciding conditions I use, others apparently do. How about you?

As I've said in another post, complexity is the weighted sum of the
things you have to deal with at once. Thus you cannot say that "fewer
functions" is not a deciding condition, because when you are dealing
with more functions, you are processing a more complex system.

> Now, are you willing to assert that algorithms *never* reduce line
> count, *never* help remove temporaries, *never* reduce cyclomatic
> complexity, and *never* do a better job of communicating intent?

They reduce, remove, reduce, do. And they introduce other things,
such as more functions, more classes, more places to look,
more to study, more habits, and more philosophical discussions.

It might be better to compare by example.

Starting from your code and that of the original post, let us have

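     #include <cstddef>
     #include <vector>
     #include <numeric>
     // Stand-in for type deduction. Strictly, a program that includes
     // standard headers may not define a keyword as a macro; with the
     // type-deducing "auto" of C++11 this line can simply be dropped.
     #define auto vector<MyData>::const_iterator
     using namespace std;
     class MyData {
     public:
        size_t len() const { return 22; }
     };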

Now I will write two functions, called "loop" and "algorithm", then
measure their complexity using the sum of criteria.

First, the version with a loop:

     size_t loop(const vector<MyData>& vec) {
         size_t l = 0;
         for (auto it = vec.begin(); it < vec.end(); ++it) {
             l += it->len();
         }
         return l;
     }

  +1 the variable is introduced
  +0 the variable name is not contrived: "l" is not a real name
  +2 the loop, bad
  -1 the loop is written using conventional pattern
  +0 operator+= is used, so we should not bother about performance
  +0 all function calls are inlined, no performance loss anyway
  +3 for line count
==5 is the complexity

Next, the version that uses an algorithm:

     size_t add_length(size_t lhs, const MyData& rhs) {
        return lhs + rhs.len();
     }

     size_t algorithm(const vector<MyData>& vec)
     {
        // size_t(0), not 0: accumulate sums in the type of its initial value
        return accumulate(vec.begin(), vec.end(), size_t(0), &add_length);
     }

  +1 the function is introduced
  +1 the function name is contrived: "add_length"
  +1 the function is written separately from the other code
  +0 no loop
  -1 the intent to "accumulate" is stated clearly, good
  +0.5 operator+ might bother you about performance
  +0.5 taking the &add_length might cause it not to be inlined
  +3 for line count
==6 is the complexity

The performance weight may vary depending on your optimizer: it may be
a clincher if you know for certain that your compiler never inlines
functions passed by address. You may wish to avoid that in portable code.
Thus the task is complex for both humans and compilers, but feel free
to ignore this hard-edged argument for now.
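
One common way to soften the inlining concern is to pass a function
object instead of a function pointer. The following is only a sketch:
the names AddLength and algorithm_with_functor are made up for
illustration, and whether the call is actually inlined still depends on
the compiler.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Same stand-in class as in the post.
class MyData {
public:
    std::size_t len() const { return 22; }
};

// Hypothetical function object (the name is not from the post).
// Its type is known at the call site, so the call to operator() is a
// direct call rather than a call through a pointer, which compilers
// can usually inline.
struct AddLength {
    std::size_t operator()(std::size_t lhs, const MyData& rhs) const {
        return lhs + rhs.len();
    }
};

std::size_t algorithm_with_functor(const std::vector<MyData>& vec) {
    // The initial value is a size_t so that accumulate sums in size_t.
    return std::accumulate(vec.begin(), vec.end(), std::size_t(0),
                           AddLength());
}
```

On the scale used here this trades the +0.5 for the function address
against one more type to name and to find, so the total hardly moves.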

By my calculations, the complexity is not decreased.

Now you say that...

> With even moderate reuse, the total line count may be reduced. If
> every loop body in your program is different, then the total line count
> probably won't go down by using algorithms, but if even a few of them
> are the same, you might see significant line count reduction.

Let's try with algorithms:

     size_t add_length(size_t lhs, const MyData& rhs) {
        return lhs + rhs.len();
     }

     size_t algorithm1(const vector<MyData>& vec)
     {
        return accumulate(vec.begin(), vec.end(), size_t(0), &add_length);
     }

     size_t algorithm2(const vector<MyData>& vec)
     {
        return accumulate(vec.begin(), vec.end(), size_t(0), &add_length);
     }

  +6 was the original complexity
  -1 the intent to "accumulate" is stated clearly, good again
  +1 for line count
==6 is the resulting complexity, which is not changed.

In the case of loops, we will use "extract method" refactoring:

     size_t accumulate_by_length(const vector<MyData>& vec, size_t l) {
         for (auto it = vec.begin(); it < vec.end(); ++it) {
             l += it->len();
         }
         return l;
     }

     size_t loop1(const vector<MyData>& vec) {
         return accumulate_by_length(vec, 0);
     }

     size_t loop2(const vector<MyData>& vec) {
         return accumulate_by_length(vec, 0);
     }

  +5 was the original complexity
  +1 the function is introduced, when it became necessary
  +1 the function name is contrived, when it became necessary
  -1 the intent is stated clearly now
  +1 for line count
==7 is the resulting complexity, increased.

Thus things were kept simple while the situation was simple, and they
became more complex only when the situation became more complex.
Isn't that an adequate way to deal with things?

By the way, nobody told you that algorithms are bad, but the
sum of the benefits they provide is not so good in many cases.

                               ***

My personal opinion is that there should be lambda statements
similar to the conventional syntax of "if" and "while":

     void for_each_with_lambda(const vector<int>& vec) {
         for_each (int x : vec) {
             std::cout << x << ' ';
         }
     }

     void sort_with_lambda(vector<int>& vec) {
         sort (int a, int b : vec) {
             yield a < b;
         }
     }

Having that, we can express the sample above this way:

     size_t accumulate_with_lambda(const vector<MyData>& vec)
     {
         return accumulate(size_t l, const MyData& r : vec, 0) {
             yield l + r.len();
         }
     }

  +1 the anonymous function is introduced
  +0 the function is written with the other code
  +0 no loop
  -1 the intent to "accumulate" is stated clearly
  +0.5 operator+ might bother you in some cases
  +0 all function calls are to be inlined
  +2 for line count
==2.5 is the complexity

With that, we will be happy.
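
For comparison, here is a sketch of how close one can get with the
lambda expressions and range-based for that were later standardized in
C++11. The syntax differs from the proposal above ("return" instead of
"yield", and the lambda is passed as an ordinary argument), the function
names are made up for illustration, and a C++11 compiler is assumed.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <numeric>
#include <vector>

// Same stand-in class as in the post.
class MyData {
public:
    std::size_t len() const { return 22; }
};

// Range-based for plays the role of the proposed "for_each (int x : vec)".
void for_each_cxx11(const std::vector<int>& vec) {
    for (int x : vec) {
        std::cout << x << ' ';
    }
}

// The comparison is written in place as a lambda expression.
// The vector is taken by non-const reference because it is modified.
void sort_cxx11(std::vector<int>& vec) {
    std::sort(vec.begin(), vec.end(),
              [](int a, int b) { return a < b; });
}

// The accumulating step is written in place, next to the other code.
std::size_t accumulate_cxx11(const std::vector<MyData>& vec) {
    return std::accumulate(vec.begin(), vec.end(), std::size_t(0),
        [](std::size_t l, const MyData& r) { return l + r.len(); });
}
```

Scored the same way, this would land near the 2.5 above: no separate
named function, no hand-written loop, and the intent is still stated.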

--
Andrei Polushin
