Re: C++ Frequently Questioned Answers

From: Alex Shulgin <alex.shulgin@gmail.com>
Newsgroups: comp.lang.c++.moderated
Date: Tue, 6 Nov 2007 16:38:37 CST
Message-ID: <1194380975.155790.143280@q3g2000prf.googlegroups.com>

Yossi Kreinin wrote:

As to the "interface is easy / implementation is hard" issue - the FQA
gives examples where the interface is easy and the implementation is
hard.

Sorry, I haven't stated my point clearly enough: there are cases where
designing an interface is harder than (or of comparable complexity to)
providing an implementation, and in my experience they are the most
common cases. Of course, there are simple cases where this is not true,
and your FQA shows some of them.

As for a real-world example: suppose you need to design an abstract class
(or class hierarchy) to provide a file system API. I'd expect
concrete methods to be easier to implement since they would be most
probably unrelated to each other (or related only weakly), whereas the
interface(s) are sort of a bigger picture issue--you have to think
thoroughly how things interact and influence each other in order to
design it properly. Add the need to support various different real
world file systems (FAT, ext2, NTFS, etc.) plus yet-to-be-known
systems and the high costs of later changes and you'll see the result.
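
To make the file system example a bit more concrete, here is a minimal sketch of what such an abstract interface might look like, with one trivial implementation behind it. Every name in it (FileSystem, MemoryFileSystem, exists, read_all, write_all) is hypothetical and not taken from the thread. The implementation is a handful of unrelated one-liners, while the questions the sketch leaves open (directories, permissions, error reporting, what a FAT or NTFS back end cannot express) all belong to the interface.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical interface: every name here is an illustration, not a real API.
class FileSystem {
public:
    virtual ~FileSystem() {}
    virtual bool exists(const std::string& path) const = 0;
    virtual std::string read_all(const std::string& path) const = 0;
    virtual void write_all(const std::string& path,
                           const std::string& data) = 0;
};

// One trivial concrete implementation that keeps "files" in memory.
class MemoryFileSystem : public FileSystem {
public:
    bool exists(const std::string& path) const {
        return files_.find(path) != files_.end();
    }
    std::string read_all(const std::string& path) const {
        std::map<std::string, std::string>::const_iterator it =
            files_.find(path);
        // Precondition of this sketch: the file must exist.
        assert(it != files_.end());
        return it->second;
    }
    void write_all(const std::string& path, const std::string& data) {
        files_[path] = data;
    }
private:
    std::map<std::string, std::string> files_;
};
```

Callers would hold a FileSystem reference and never learn which back end sits behind it.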

"C++ is extremely unsafe because every pointer can be used to modify
every piece of memory from any point in code."

An example (i.e. some real code) would be very helpful.

Consider this snippet (I use iterators to skip the part where we talk
about the problems of C arrays, etc.):

template<class Iter> void inc_range(Iter b, Iter e) {
  for(; b!=e; ++b) {
    *b = *b + 1;
  }
}

Now, b & e could have been obtained with vec.begin() and vec.end(),
and vec could have been resized or cleared since then.

Please stop. I hope you are aware we are talking about undefined
behaviour here? If it's all about complaining that C++ is too low-level
("unmanaged"), then I do not see your point--isn't this what anyone
would expect?

inc_range()
will modify objects which happen to be allocated where the vector
storage used to be (or it may crash the program, but the first
possibility is the harder one to debug).
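
For readers following along, here is a minimal sketch of the safe ordering, assuming a hypothetical helper grow_then_increment() that is not from the thread. It reuses inc_range() as posted and takes begin()/end() only after the last operation that can change the vector's size. Using iterators obtained before the resize would be exactly the undefined behaviour under discussion, so the sketch deliberately never does that; it only reports, via capacity(), whether the storage moved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template<class Iter> void inc_range(Iter b, Iter e) {
    for (; b != e; ++b) {
        *b = *b + 1;
    }
}

// Hypothetical helper: grows the vector by `extra` value-initialized
// elements, then increments every element. Returns true if the growth
// reallocated the storage, i.e. if iterators taken before the growth
// would have pointed into freed memory.
inline bool grow_then_increment(std::vector<int>& vec, std::size_t extra) {
    const std::size_t old_size = vec.size();
    const std::size_t old_capacity = vec.capacity();
    vec.resize(old_size + extra);  // may reallocate
    assert(vec.size() == old_size + extra);
    const bool reallocated = vec.capacity() != old_capacity;
    // Obtain the iterators only now, after the last size change.
    inc_range(vec.begin(), vec.end());
    return reallocated;
}
```

The language does nothing to enforce this ordering; it is a convention the caller has to keep.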

[snip]

(this can happen in C++ with the
example in the next link; this is one case where I'd easily agree that
"seasoned" C++ programmers won't write this code - it's more of a
beginner's pitfall):

http://yosefk.com/c++fqa/web-vs-c++.html#misfeature-1

UB again.

In theory, a program that does things like out-of-bounds access is
malformed, so we don't care about it; in practice, /all/ software
shipped to customers or otherwise released to the world is malformed
(think about browsers and operating systems; everything I saw in this
department crashed and had security holes). And you have to debug the
problems, which is much, much easier when a program halts upon the
first violation of language rules compared to the case when it keeps
running and reading/modifying the wrong data, covers its traces by
deleting the objects involved in the error and possibly never crashes.

Yep. But in practice, crashes are not the only bugs that creep into
shipped programs and you have to debug them too.

And of course, I'd expect any seasoned programmer to use the
appropriate tool for the job, i.e. not to go too low-level where it's
unnecessary. However, if you need to go low-level, what language will
you use--C? How is it any different from C++ with respect to crashes
due to UB?

"C++ is extremely unsafe...", and so is C. And some tasks can be
done only in these languages today. So what's your point--do not go
too low-level if unnecessary? Thanks, I believe most of us have
learned this very well already.

Agreed, except for the "most" part... How do you know they really
are? ;-)

You mean "most" in "most classes are 'straight' C++ classes without
pimpls/ABCs/incomplete types involved"? Well, if they aren't, this
sort of proves my point about the problems with "straight" C++
classes... Of course I don't have numeric data here, either :) Do you
really think it's wrong though?

Nope. I just suggest using "I believe", "from my experience", "IMHO"
or some other hint to point out that it's your opinion and not common
knowledge or universal truth. ;-)

http://yosefk.com/c++fqa/defective.html#defect-4

Personally, I don't recall having that sort of problem... But maybe
you can clarify what exactly makes it harder compared to C
dumps? I can think of inlining and templates...

[snip]

But I meant another thing in that item - you don't know what objects
will look like, from the layout of classes with virtual functions to
the memory layout of std::map. Some debuggers will know how to display
those - unless too much memory is corrupted, at which point you have
to kick in. In C, a (custom) hashtable would look practically the same
in the memory of all targets; the standard C++ types look different
everywhere.

Anyway, that's quite natural and to be expected, since the language
itself is way more complex than C.

Ah, but that's my point. If the language is "managed" or mostly safe,
then I don't care about the complexity of internal representations
that much, since I don't get to shovel through them. If it's unsafe
though, I'd rather have it simple enough for me to understand what the
pieces mean - I mean the little pieces to which programs will
invariably break into from time to time :)

Point taken, but it's likely to be more closely related to the quality
of implementation: for example, VS2005 'knows' about standard
containers and shows their contents in the debugger in a readable way,
while VS2003 doesn't.

Well, it was never a problem for me, and I hope it never will be.
Is sticking to a single compiler a problem for anyone?

Of course it is - sometimes you have compiled third-party libraries,
and sometimes you ship your own libraries to someone using a compiler
which won't compile your C++ code at the front-end level

Point taken; however, with the source code at hand it turns into a
portability issue (see below). And no one forces you to use
proprietary third-party libraries, hopefully... :-)

[snip]

Sometimes this is "not a problem", except you have to deliver your own
libraries compiled and tested with 3 compilers, each with its own
front-end (grammar) and back-end (codegen) bugs.

So this is more of a portability issue than the real need to link
against code produced with another compiler.

What about when you fill your vector with objects obtained by calling
a method returning a const reference in a loop?

As others have noted it would be extremely nice to have an example for
this one. Let's get to some real code already. :-)

Here's real code from a social network back-end :)

class Lamer { public: const LameComments& getLameComments() const; };

LOL. I hope it's nothing personal; however, could you please drop
that kind of attitude? By this point you have done more than enough
to "prevent people from falling asleep".

void getTheMostLameComments(
  vector<const LameComments*>& comments, //which type would you use?
  const vector<Lamer>& lamers
)
{
  //or we could use iterators...
  for(int i=0; i<(int)lamers.size(); ++i)
  {
    const LameComments& lc = lamers[i].getLameComments();
    if(lc.areTotallyPointless()) {
      //lamers write lots of comments, better copy
      //by reference... Has to be const - can't modify
      //the lamer's precious comments, and we can't
      //have vectors of references, so it's either a dumb
      //const pointer or some non-standard smart pointer
      comments.push_back(&lc);
    }
  }
}

What's the problem with copying `LameComments' objects? Pardon my
indent style:

void
getTheMostLameComments(vector< LameComments >& comments,
     vector< Lamer > const& lamers)
{
     for (size_t i = 0; i < lamers.size(); ++i)
     {
         LameComments const& lc = lamers[i].getLameComments();
         if (lc.areTotallyPointless())
             comments.push_back(lc);
     }
}

This way you do not need `const*' _and_ have less error-prone code.
Please note that in your code the caller must ensure that:

1. The lifetime of the `comments' vector does not exceed that of the
`lamers' vector.
2. The lifetime of every element of the `comments' vector does not
exceed the lifetime of the corresponding `Lamer' object from which the
LameComments were obtained.

I think this is a pretty common pattern with code massaging data
structures with even minor levels of nesting/indirection.

I personally do not think so--I'd use a function template anyway.

What if at some point you decide to change the container type from
`vector' to `list'? What if you need to work with both types of
containers? See, here you store Lamers in a vector, and there in a
list. Would you end up copying elements from the list to a vector and
then calling the function?
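
A minimal sketch of what such a function template could look like, written against iterators so that the same code serves both a vector and a list. The definitions of LameComments and Lamer below are stand-ins (the thread never shows the real ones), the "pointless if empty" rule is an arbitrary assumption, and example_usage() is a hypothetical name.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Stand-in types: the thread never shows the real definitions.
struct LameComments {
    std::string text;
    // Assumption for the sketch: empty comments count as pointless.
    bool areTotallyPointless() const { return text.empty(); }
};

class Lamer {
public:
    explicit Lamer(const std::string& text) { comments_.text = text; }
    const LameComments& getLameComments() const { return comments_; }
private:
    LameComments comments_;
};

// Works with any input range of Lamer and any output iterator accepting
// LameComments. The comments are copied, so the result does not depend
// on the lifetime of the Lamer objects.
template<class InIter, class OutIter>
OutIter getTheMostLameComments(InIter first, InIter last, OutIter out) {
    for (; first != last; ++first) {
        const LameComments& lc = first->getLameComments();
        if (lc.areTotallyPointless()) {
            *out++ = lc;
        }
    }
    return out;
}

// Usage sketch: the same template serves a vector and a list.
inline void example_usage() {
    std::vector<Lamer> in_vector;
    in_vector.push_back(Lamer(""));
    in_vector.push_back(Lamer("first post!"));
    std::list<Lamer> in_list(in_vector.begin(), in_vector.end());

    std::vector<LameComments> from_vector;
    std::list<LameComments> from_list;
    getTheMostLameComments(in_vector.begin(), in_vector.end(),
                           std::back_inserter(from_vector));
    getTheMostLameComments(in_list.begin(), in_list.end(),
                           std::back_inserter(from_list));
    assert(from_vector.size() == 1 && from_list.size() == 1);
}
```

The price is one copy per selected element; whether that matters depends on how heavy the real LameComments is.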

What about when you need a subset of objects kept in std::vector or
std::list, and the owning collection is about to be deallocated,
because you only need some of the objects, but not all of them
anymore?

http://yosefk.com/c++fqa/dtor.html#fqa-11.1

Another example please? From what I can see it's all about storing
raw pointers in the containers... very suspicious.

Do you know how much time the previous real example took me to invent?

So perhaps this is a sign of how far that example is from the
real world? ;-)

What do you want me to do - grab a complete snapshot of an app I
worked on with memory management bugs in it and post it to Usenet? :)

Of course not, however, you can boil your code down to the essential
part.

What's wrong with the "English" example? You iterate over the nodes of
a 3D model, wishing to carve out the interesting ones and dispose of
the model. With garbage collection, the unused parts of the model
would become garbage; with RAII, you'd have to ask the model to
"forget" that it owns the objects. Some "owners" (like std::list) can
do it, some can't. Or you could use reference counting; simple B-rep
3D models are not unlikely to have cyclic references in them.
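
As an illustration of an owner that can "forget" its objects, here is a minimal sketch using std::list::splice, which relinks list nodes from one list into another without copying the elements. Node and carve_interesting() are hypothetical names, and the `interesting' flag is only an assumption standing in for whatever the real selection criterion would be.

```cpp
#include <cassert>
#include <list>

// Hypothetical node type; "interesting" is just a flag for the sketch.
struct Node {
    int id;
    bool interesting;
    Node(int i, bool keep) : id(i), interesting(keep) {}
};

// Moves the interesting nodes out of `model` and appends them to `kept`.
// splice() relinks the list nodes, so no Node is copied and pointers or
// references to the moved elements stay valid.
inline void carve_interesting(std::list<Node>& model, std::list<Node>& kept) {
    assert(&model != &kept);  // the two lists must be distinct
    for (std::list<Node>::iterator it = model.begin(); it != model.end(); ) {
        std::list<Node>::iterator next = it;
        ++next;  // remember the successor before `it` leaves `model`
        if (it->interesting) {
            kept.splice(kept.end(), model, it);
        }
        it = next;
    }
}
```

After the call, `model` can be cleared or destroyed and `kept` still owns the carved-out nodes. std::vector has no equivalent operation, since its elements live in one contiguous block.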

As for me, when it comes to code, it's much easier to spot weak points
and propose alternatives. See, English is not my native language (nor
is C++, however).

From what I can tell--again, what's wrong with simply copying
interesting nodes somewhere and then destroying the original model?

I've written a (lame) PE executable parser in D a couple of days ago,
where you carve out sections of the byte array; in D, you do it with
slicing, and don't think about the life cycles. With std::vector, you
have to explicitly keep the full vector around so that the references
to sub-chunks won't become dangling references, and you can't use
std::vector to represent the sub-chunks - you need a different,
custom, non-owning container; or you have to copy from the large
vector into smaller ones. It's a really tame example in terms of data
structure complexity, the only good thing about it is that it is real
and I might publish it someday soon :)
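
A minimal sketch of the kind of custom, non-owning container described above, assuming hypothetical names ByteSlice and slice(). It stores only a pointer and a length, so it is valid only while the owning std::vector is alive and has not reallocated; nothing in the language enforces that. (C++20 later added std::span for this role.)

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view over part of a byte buffer.
// The owner must outlive the view and must not reallocate meanwhile.
class ByteSlice {
public:
    ByteSlice(const unsigned char* data, std::size_t size)
        : data_(data), size_(size) {}
    std::size_t size() const { return size_; }
    unsigned char operator[](std::size_t i) const {
        assert(i < size_);  // bounds check in debug builds only
        return data_[i];
    }
    const unsigned char* begin() const { return data_; }
    const unsigned char* end() const { return data_ + size_; }
private:
    const unsigned char* data_;
    std::size_t size_;
};

// Returns a view of bytes [offset, offset + count) of `buffer`.
inline ByteSlice slice(const std::vector<unsigned char>& buffer,
                       std::size_t offset, std::size_t count) {
    assert(offset <= buffer.size() && count <= buffer.size() - offset);
    // An empty vector has no element 0 to take the address of.
    return ByteSlice(buffer.empty() ? 0 : &buffer[0] + offset, count);
}
```

Carving a section out of a parsed file would then be a call like slice(bytes, section_offset, section_size), with the full `bytes' vector kept around for as long as any slice is in use.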

Very interesting, please let us know if you do. :-)

--
Cheers,
Alex Shulgin
