Re: Job Interview, Did I Mess Up?

From:

"Daniel T." <daniel_t@earthlink.net>

Newsgroups:

comp.lang.c++.moderated

Date:

Thu, 4 Mar 2010 14:08:50 CST

Message-ID:

<daniel_t-B5CFF3.10271704032010@70-3-168-216.pools.spcsdns.net>

"Balog Pal" <pasa@lib.hu> wrote:

"Daniel T." <daniel_t@earthlink.net>

"Balog Pal" <pasa@lib.hu> wrote:

"Daniel T." <daniel_t@earthlink.net>

Balog Pal and I have a different idea of what constitutes DRY in
this case.

That should be weird...

It should be weird that programmers disagree about how to write
code? I think not.

No, that is pretty common. ;-) (And before we switch context to the
theory field please note my first comment on the first example, that
it is not a big deal -- when the item is so small, there is much
more leeway.)

But interpreting what "repeat" or "single" means should (IMO) not be
different. I.e., if you state that you like your solution better, and
don't care to apply DRY for this particular case, it is okay for me.
But stating it fits DRY, is not.

I think that having the extra variables creates two
representations for the piece of knowledge in question

There are no extra 'variables' but constants.

(tolower(a) and lo_a both represent the exact same thing.)

Huh? One is a transforming function and the other is a *result*.
The piece of code is supposed to work on that result.

IMHO, his solution breaks DRY.

By *not* repeating the calls, that is? Please explain in detail
how that happens. And how you get to the single point of truth by
pasting the algorithm.

I'm not pasting an algorithm, I'm calling a function to obtain a
property of a type, and calling a function more than once doesn't
break DRY.

tolower is IMO not a property and not of a type -- it is a mapping
function. The logic has two places to use the result of that
conversion. Those places are connected to each other, so should use
the same thing instead of a repeated call.

With so little code the difference may not be evident, but try to
imagine some evolution -- if some aspect changes in coding or
requirements, how many places need a change.

The DRY code philosophy is stated as "Every piece of knowledge must
have a single, unambiguous, authoritative representation within a
system."

So which is the "single, unambiguous, authoritative representation"
in this case, tolower(a) or lo_a? Whichever answer you give, you are
breaking DRY by having the other hanging around.

I don't get your reasoning. We have the original input character in
'a' and nowhere else. We then have 'lower cased a' in lo_a. And
nowhere else. (the content of 'a' might be lowercase by some accident
but that does not matter, it is still a different thing.)

Storing the result in a variable is a performance optimization, not
a violation of DRY (IMHO a premature and unnecessary performance
optimization in this case.)

tolower() being a pure and constexpr function seems to be confusing
the picture. Imagine it is more abstract, and we needed to call some
function in its place that may do unknown things in the
implementation.

In that case, I guess, wanting to call it twice would not be so
popular. And most people would prefer my formulation.

Since you said you don't get my reasoning, I will try again (hopefully
without repeating myself too much :-) In your example, you have two
constructs, "tolower(a)" and "lo_a", and the two of them are completely
interchangeable. Surely you see that. This interchangeability did not
come about by accident, they are interchangeable by design, they both
represent the same thing. The fact that tolower() is a "pure and
constexpr" function isn't confusing the picture, it is a big part of the
reason that the two constructs represent the same thing.

However you make an important point. If it were the case that
"tolower(a)" represented something other than simply "lower case a", if
tolower() had side effects for example, then we would be calling the
function, in part, to cause those side effects. If that were the case,
then tolower(a) and lo_a would *not* represent the same thing and you
wouldn't be violating DRY. That isn't the case, though.
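
To make the two positions concrete, here is a minimal sketch of both
forms. The original interview code is not quoted in this post, so the
function names and the hex-letter check are hypothetical stand-ins; only
tolower(a) and lo_a come from the discussion.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper (not from the thread): lower-case a char, going
// through unsigned char because std::tolower is undefined for negative
// values other than EOF.
inline char lower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Form A: call the mapping function at each point of use.
inline bool is_hex_letter_calls(char a) {
    return lower(a) >= 'a' && lower(a) <= 'f';
}

// Form B: name the result once and use the name at each point of use.
inline bool is_hex_letter_named(char a) {
    const char lo_a = lower(a);
    return lo_a >= 'a' && lo_a <= 'f';
}
```

Because lower() has no side effects, the two forms always return the
same answer; the disagreement is only about which one counts as the
single representation of "lower-cased a".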

I am pretty dogmatic about the single exit principle.

That would gain you another set of bad marks in a good C++
shop. SESE works fine for C, not as dogma but as a poor mitigation
for the lack of RAII. In C++ you shall use RAII for anything, and SESE
is nothing but an illusion anyway, as you expect exceptions thrown in
the code below the actual code to be 'transparent'.

And beyond being an old dogma grown in a different context, SESE has
no benefits, only drawbacks. Read Alexandrescu for more details.

Single exit supports DRY.

Hm? Those are orthogonal things, better not mixed.

With only one return there is only one (authoritative) place that
the postcondition of the block of code must be checked.

As you can't check anything in the function after the return, it hardly
has any value -- and if you want to check the postcondition it can be
done at a caller site (as normally done in unit tests) that cares
nothing about the number of exits.

Also, if you want to know the postcondition by just reading the code
(review), it is not at all easier with single exit for too many cases.
Especially if that postcondition has several cases by its nature.
Please take the time to look up AA's examples.

One classic example is searching for a thing in a collection. The SEME
solution returns right where it is found and has another exit where it
is not found. The SESE version uses a framework of flags or other
obfuscation measures. It has come up in countless debates and I am
still waiting for a reasonable explanation of how it is that good, and
especially how it is easier to read or verify.

I don't have multiple exits out of the middle of 'if' statements or
'while' loops, why should I have them in the middle of functions?

As I said the normal way is to exit right where you have reached the
postcondition, not a moment earlier or later. It may be a single point,
or two or whatever many, and they can happen to be inside an if or
while -- why on earth not?

And if they are naturally there, making the execution continue and
smuggling some results to a single point only increases the complexity
and adds opportunity for messing up instead of reducing it.

That said, many people have no problem with using break or continue
to jump out of the middle of a block of code and maybe you are one
of them.

Of course. :) And I could add that I'm not afraid of using goto either
if (if!!!) that comes up as the best solution for the case -- just it
is damn rare in C++.

Believe me, being dogmatic is a bad thing. To go back a little to the
original topic -- if you were in my interview, a statement like yours,
being dogmatic about SESE, would map to an automatic "no hire" (after
clarifying you actually mean it). Because I stick to wizard rule #6 --
my only sovereign is REASON. Dogmatism is the very opposite of reason.
I expect programmers to always be aware of what they do, and be able to
provide a rationale fit for the particular case. Guidelines (any
guidelines) are only as good as they are understood, and the wielder
should be able to trace them back to their roots. (See also "the five
whys" from lean.)

I consider it a style issue more than anything else. Some people
have no problem exiting blocks of code all over the place, I try not
to.

You skipped over my mentioning exceptions. As exceptions can happen,
it is no longer just a style issue, and anyone who thinks some layout
*has* the actual SESE property just by restricting it to a single
return is quite misled, and that poses a danger.

Here too, I will make another attempt at explaining myself. You have
brought up several points and I will attempt to cover them all, but the
last item is IMHO the most important so I will cover it first.

_Exceptions to the Rule_

Of course there are exceptions to every rule (including this one. :-) As
programmers we have lots of different design goals and many of them call
for mutually opposing solutions, coding is often about trade-offs. It is
not wrong of me to avoid multiple exits when there are no important
design or performance constraints requiring them. To put it another way,
there is no design rule that says "prefer multiple exits from blocks of
code," yet that is what you seem to be advocating. (If I am wrong here,
then we are probably not that far apart in our positions.)

I can see where this issue might have stemmed from the use of the word
"dogmatic." I said "pretty dogmatic" and you might have thought that I
meant that I insist on single returns no matter what the situation. That
is not exactly what I meant, rather I meant that I am more willing to
expend the extra effort to find a single return solution than most other
programmers.

Hopefully the above clears up this issue, but if not then continue
reading...

_Classic Example, Finding Something in a Collection_

Your "classic example" is IMHO a very good one. When we search for a
thing in a collection, should we prefer multiple exits? I think not.
This means approaching the problem in a different way, but it doesn't
necessarily entail what you called a "framework of flags or other
obfuscation measures." If we approach the problem from the assumption
that multiple exits are to be preferred, then we might write it like
this:

template <typename InIt, typename Tp>
InIt find(InIt first, InIt last, const Tp& val) {
    // Multiple exits: return as soon as a match is seen.
    for (InIt it = first; it != last; ++it)
        if (*it == val)
            return it;
    return last;  // no match
}

If we try to avoid multiple returns we might end up with something like
this (cribbed from GCC's library):

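template <typename InIt, typename Tp>
InIt find(InIt first, InIt last, const Tp& val) {
    // Single exit: advance until the range ends or a match is seen.
    while (first != last && !(*first == val))
        ++first;
    return first;  // equals last when there is no match
}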

Now you probably would not write the multiple return example exactly as
I did, but however you write it, I don't think that the multiple return
example should be preferred as a design rule. I am even willing to add a
"result" variable to avoid multiple returns, but I'm not going to throw
in a bunch of flags, that would be silly.
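
As a rough illustration of the "result" variable approach, here is a
sketch of a search that returns an index rather than an iterator. The
name index_of and the use of -1 for "not found" are assumptions made for
this example; they are not taken from GCC or from the earlier posts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: position of the first element equal to val,
// or -1 if there is none. One result variable, one return, no flags.
template <typename T>
std::ptrdiff_t index_of(const std::vector<T>& items, const T& val) {
    std::ptrdiff_t result = -1;  // "not found" until a match is seen
    for (std::size_t i = 0; i < items.size() && result == -1; ++i) {
        if (items[i] == val)
            result = static_cast<std::ptrdiff_t>(i);
    }
    return result;
}
```

The loop condition doubles as the stop-on-first-match test, so the
result variable is the only extra state the function carries.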

(As a side note, the GCC library has a separate 'find' for use with
random access iterators that does use multiple returns and looks
remarkably like a somewhat cleaner version of Duff's Device. This is
obviously a performance optimization, which is the one thing that trumps
design near the end of a product's construction.)

_Postconditions and Where to Check For Them_

Again we come back to not repeating ourselves. If the postcondition of
a function can be checked by code, then it should be. I think we can
agree on that, but where should we do the checking? Your suggestion was
that this checking should be done "at a caller site (as normally done in
unit tests)."

Most functions are complex enough that they require multiple tests, so
they are generally called from multiple sites. If we want to put the
postcondition checking code in only one place (so we don't repeat
ourselves), then the obvious place to put it is just before the return
within the function. If there is more than one return, then we still
must repeat ourselves, which is the one thing that we have both agreed
we should avoid.
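
A small sketch of what that looks like for the single-return find,
assuming the postcondition is "the returned iterator is either last or
points at an element equal to val". The name checked_find is made up for
this example, and the use of assert is an assumption about how one might
express the check.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variant of the single-return find with its postcondition
// stated once, immediately before the only return.
template <typename InIt, typename Tp>
InIt checked_find(InIt first, InIt last, const Tp& val) {
    while (first != last && !(*first == val))
        ++first;
    // Postcondition: nothing matched, or first points at a match.
    assert(first == last || *first == val);
    return first;
}
```

With the two-return version, the same assert would have to appear before
each return, or be moved out to every caller.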

Hopefully this post will clear up my thinking about these issues and
cover any concerns or confusion you may have.

--
      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated. First time posters: Do this! ]