Re: A simple unit test framework

From: James Kanze <james.kanze@gmail.com>
Newsgroups: comp.lang.c++
Date: 7 May 2007 02:36:11 -0700
Message-ID: <1178530570.990720.36170@q75g2000hsh.googlegroups.com>

On May 7, 10:55 am, Gianni Mariani <gi3nos...@mariani.ws> wrote:

James Kanze wrote:

On May 7, 2:02 am, Branimir Maksimovic <b...@hotmail.com> wrote:

On May 6, 3:03 am, Gianni Mariani <gi3nos...@mariani.ws> wrote:

[...]

Even when you know exactly where the error is, it's sometimes
impossible to write code which reliably triggers it. Consider
std::string, in g++: if you have an std::string object shared by
two threads, and one thread copies it, and the other does [] or
grabs an iterator at exactly the same time, you can end up with
a dangling pointer---an std::string object whose implementation
memory has been freed. The probability of doing so, however, is
very small, and even knowing exactly where the error is, I've
yet to be able to write a program which will reliably trigger
it.

Are you sure? I ran into that problem in an earlier version of gcc's
std::string. I have never seen it since and I did review the
std::string code at 3.0 and I was satisfied that all was well.

I'm sure. I wouldn't swear that it hasn't been corrected in the
latest version, but it was definitely present in 3.4.3. The bug
report is still open; from what I understand, a totally new
string class is in the works, so the current one will not be
corrected.

std::string is not thread safe

Everything in the standard library in g++, from 3.0 on, is
supposed to be thread safe. There is some uncertainty, for some
classes, as to how this is defined, and I hesitated to post the
bug, because I wasn't sure that std::string was supposed to meet
the Posix requirements (although all of the other g++ containers
meet them). That is, however, a bit irrelevant to the
discussion here. Posix compliant thread safety is a reasonable
choice, the current implementation of std::string in g++ doesn't
meet it, and I cannot imagine any possible test which would
display this defect in a reliable fashion.

but this should work:

    std::string global1( "A" );
    std::string global2( global1 );

    void thread1()
    {
        global1 += "1";
    }

    void thread2()
    {
        global2 += "A";
    }

That definitely should (and as far as I know, does) work. The
problem is more along the lines of:

    std::string global( "a" ) ;

    void thread1()
    {
        std::string s1( global ) ;
    }

    void thread2()
    {
        std::string s2( global.begin(), global.end() ) ;
    }

Since I'm not modifying global, Posix says that the above should
work (or rather, that it should work if the involved types were
known to Posix). Code review shows that it can fail in the
current implementation in G++. I would very much like to see a
test program where it reliably fails, however.
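
For illustration, the kind of stress test one might attempt looks
roughly like the sketch below. It is written with C++11 std::thread,
which postdates this discussion, and the helper name
count_bad_copies is hypothetical. It simply runs the two accesses
shown above side by side, many times, and counts the rounds in which
a copy came out wrong.

```cpp
#include <cstddef>
#include <string>
#include <thread>

// Hypothetical helper (not from the post): runs the two accesses from the
// example concurrently, `iterations` times, and counts the rounds in which
// either copy differs from the original contents.
std::size_t count_bad_copies(std::size_t iterations)
{
    std::size_t bad = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        std::string global("a");   // fresh shared object for each round
        std::string s1, s2;

        // Thread 1 copies the shared string.
        std::thread t1([&] { s1 = std::string(global); });
        // Thread 2 reads it through begin()/end() on the non-const object.
        std::thread t2([&] { s2 = std::string(global.begin(), global.end()); });

        t1.join();
        t2.join();

        // Results are inspected only after both threads have finished.
        if (s1 != "a" || s2 != "a") {
            ++bad;
        }
    }
    return bad;
}
```

Calling count_bad_copies(100000) and getting zero shows very little:
the window in which the two accesses can collide is tiny, so a clean
run is what one would expect even from an implementation with the
defect. That is the sense in which no test displays it reliably.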

Consider some variations on DCL. In at least one case, you only
get into trouble if one of the processors has read memory in the
same cache line as the pointer just before executing the
critical code. Which means that the function can work perfectly
in one application, and fail when you link it into another
application.
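
For readers who have not met it, DCL is double-checked locking. The
sketch below is not the variant referred to above; it is a minimal
version written with C++11 atomics (which postdate this discussion),
and the names Config, g_instance, g_mutex and instance are
hypothetical. The classic broken form uses a plain pointer for the
first check, so nothing orders the construction of the object
against the store of the pointer as seen from another processor.

```cpp
#include <atomic>
#include <mutex>

struct Config { int value = 42; };   // hypothetical payload

std::atomic<Config*> g_instance{nullptr};   // shared pointer to the singleton
std::mutex g_mutex;                         // serializes first-time creation

Config* instance()
{
    // First check, without the lock. The acquire load pairs with the
    // release store below, so a non-null pointer implies that the
    // constructed Config is visible to this thread.
    Config* p = g_instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> lock(g_mutex);
        // Second check, under the lock: another thread may have won the race.
        p = g_instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config;   // intentionally never deleted in this sketch
            g_instance.store(p, std::memory_order_release);
        }
    }
    return p;
}
```

With a plain Config* in place of the atomic, the same function can
pass every test on one machine, or in one program, and still be
wrong, since whether the failure shows up depends on hardware and on
what else the processors happen to have touched.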

Yes, but it's exactly this type of test that finds bugs like that.

What type of test? You mean that for each test, you wrap the
code in a different environment, recompile, and relink?

As I mentioned in a response to another poster, it may just
depend on what you consider acceptable quality. I've worked on
critical systems a lot in the past. With contractual penalties
for downtime. So I tend to set my standards high.
Interestingly enough, however, it turns out that developing code
to such high standards is actually cheaper than just churning it
out.

Yes. Better tested code is easier to develop with.

The issue hasn't been raised to date, but...

Good code is easy to understand. Code that isn't easy to
understand fails code review. How does testing verify this?

(In some ways, easy to understand is more important than an
absence of errors.)

... As a rough, back of the envelope figure: correcting an
error found in code review costs one tenth of correcting the
same error found in unit tests,

What are you smoking? Sure you can find some obvious bugs in
a code review, but I would have already run the unit tests
before I do the code review.

That's doing things the hard way, and is hardly cost effective.
In code review, you are told that you forgot to initialize
variable i in line 1234 of abc.cc. The unit test tells you that
tests 25 through 30 all fail. Which information makes it easier
to find and fix the error?

And a well run code review doesn't find just the obvious bugs.
It also finds those which no test will find. (I found the bug
in the g++ implementation of std::string by reviewing the code.)

... which costs one tenth of
correcting the same error found in integration tests, which
costs one tenth of correcting the same error found in the field.

We can haggle on how many orders of magnitude field errors
cost but I'll agree it's at least 2 orders above unit tests.

My figures are "back of the envelope". The actual values will
vary enormously. The goal is only to give a rough idea.

[...]

It does require a true multi processor system to test adequately.

Not just mp system, but *all* possible mp systems, which is
of course impossible.

It's perhaps worth pointing out that most current low-end MP
systems use a more or less synchronized memory model. This is
not true, however, for Alpha processors, nor the Itanium, nor, I
think, top of the line Sparcs, nor, probably future desktop
processors. That means that code working on a current
four-processor system based on 32-bit Intel says nothing
about how it will behave on future processors.

Only in low level code. Code written above the iron layer should work
just fine.

Dream on.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34