Re: Compatible codes for both Visual Studio 2005 and gcc

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Mon, 24 Sep 2007 13:42:15 -0000

Message-ID:

<1190641335.956242.187470@r29g2000hsg.googlegroups.com>

On Sep 24, 11:31 am, Ian Collins <ian-n...@hotmail.com> wrote:

James Kanze wrote:

And not really related to this thread, but... Since you seem to
use unit tests even more intensively than I do, how do you
handle the case where the full unit tests take minutes, or even
hours? I have the case in some of my UTF-8 code, where for
practical reasons related to the internal implementation, I want
to hit at least one character in every block of 64 before
releasing the code for other components to use. On one of the
machines I use, this results in a unit test of several hours,
and of course, in my personal iterations, internal to the
component, I probably don't need to be this thorough: one
character of each length would suffice, with a few added limit
cases. So how do you handle a case like this? Because it's
driving me up the wall.

TDD probably wouldn't result in unit tests of that nature; that
sounds more like a higher level test.

It's really rather irrelevant what TDD would result in, since I
don't use it. I do use extensive unit tests, and I want them to
be as complete as possible. Testing every Unicode value is
perhaps a little too exhaustive, but the implementation works in
blocks of 64 characters, and testing one in each block isn't
that excessive for unit tests which are only run when the
component is exported (which shouldn't be that often). But it
still takes too much time for the iterative phase of
development.
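To make the two levels of thoroughness concrete, here is a minimal sketch of how the sample code points might be chosen. Everything in it is an assumption for illustration: the names encodeUtf8, exportSamples and quickSamples are hypothetical and are not part of the SetOfCharacter code, and the encoder does no validation of its argument.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode one Unicode scalar value as UTF-8.
// Assumes cp <= 0x10FFFF and that cp is not a surrogate.
std::string encodeUtf8(std::uint32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Surrogates have no UTF-8 encoding, so they are skipped.
bool isSurrogate(std::uint32_t cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}

// Thorough sample for the export run: the first code point of
// every block of 64.
std::vector<std::uint32_t> exportSamples()
{
    std::vector<std::uint32_t> result;
    for (std::uint32_t cp = 0; cp <= 0x10FFFF; cp += 64) {
        if (!isSurrogate(cp)) {
            result.push_back(cp);
        }
    }
    return result;
}

// Quick sample for the iterative phase: the smallest and largest
// code point of each encoded length (1 to 4 bytes).
std::vector<std::uint32_t> quickSamples()
{
    return { 0x00, 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF };
}
```

With these definitions the export run visits 17,376 code points (17,408 blocks, less the 32 blocks of surrogates), against eight for the quick run.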

TDD tests drive the design, so they tend to be short logic
proving tests, stepping through a sequence of events.

If you're claiming that TDD uses incomplete tests, that doesn't
address my issue. Of course, all tests are incomplete, in one
way or another. (Although in this case, I almost could do
exhaustive testing---there are only about a million possible
inputs. But it would take too much time on most of the target
platforms.)

Perhaps a different question would be: what is the minimum test
set which would give me a reasonable amount of assurance. The
problem is expanding my SetOfCharacter class (see
http://kanze.james.neuf.fr/doc/en/Text/html/index.html) to
support UTF-8. Where the current version takes a char/unsigned
char/int as argument, the new one has a template function taking
two iterators and a function taking a special type UTF8Char,
which encapsulates a single UTF-8 character. In the current version, I
check every bit in the resulting set. I'm currently doing this
in the UTF-8 version as well, and that's what's expensive.

(It would actually be interesting to see how TDD handles this
sort of problem. The original "requirement" from the user is
simple: do the same thing as the existing class, but handle
UTF-8. Of course, that means that it doesn't do the same thing,
so some refining is necessary. I came up with something which
replaced e.g.:
    //! Adds a character to the set.
    //!
    //! \param ch
    //! The character to be added.
    //!
    //! \post
    //! <tt>contains( ch )</tt>
    // -----------------------------------------------------------------------
    void add( char ch ) ;

    //! Adds a character to the set.
    //!
    //! \param ch
    //! The character to be added.
    //!
    //! \post
    //! <tt>contains( ch )</tt>
    // -----------------------------------------------------------------------
    void add( unsigned char ch ) ;

    //! Adds a character to the set.
    //!
    //! \param ch
    //! The character to be added.
    //!
    //! \pre
    //! <tt>ch >= 0 && ch <= UCHAR_MAX</tt>
    //!
    //! \post
    //! <tt>contains( ch )</tt>
    // -----------------------------------------------------------------------
    void add( int ch ) ;

    //! Adds all of the characters from a string to the set.
    //!
    //! \param s
    //! A string containing the characters to be added.
    //!
    //! \post
    //! \code
    //! for ( int i = 0 ; i < s.size() ; ++ i )
    //! contains( s[ i ] ) ;
    //! \endcode
    // -----------------------------------------------------------------------
    void add( std::string const& s ) ;

    //! Adds all of the characters in a sequence to the set.
    //!
    //! \param begin
    //! Begin iterator of the characters to be added.
    //!
    //! \param end
    //! End iterator of the characters to be added.
    //!
    //! \post
    //! \code
    //! while ( begin != end )
    //! contains( *begin ++ ) ;
    //! \endcode
    // -----------------------------------------------------------------------
    template< typename FwdIter >
    void add( FwdIter begin, FwdIter end ) ;

    //! Adds all of the characters in the domain to the set.
    //!
    //! \post
    //! For all <tt>ch</tt> in the domain:
    //! <tt>contains( ch )</tt>.
    // -----------------------------------------------------------------------
    void add() ;
with:
    //! Adds one or more characters to the set.
    //!
    //! \param begin
    //! Iterator designating the first byte of the first
    //! character.
    //!
    //! \param end
    //! Iterator designating one past the last byte of the
    //! available characters.
    //!
    //! \param maxCharCount
    //! The maximum number of characters to be processed in
    //! the sequence. By default, this is <tt>INT_MAX</tt>
    //! (practically speaking: infinity), which means that the
    //! sequence will be processed completely, as a string.
    //! It is possible, however, to limit the number of
    //! characters processed; by specifying <tt>1</tt>, for
    //! example, the constructed set will consist only of the
    //! single character at the start of the sequence.
    //!
    //! \post
    //! For all characters <tt>ch</tt>: <tt>contains(ch)</tt>
    //! is <tt>true</tt> if <tt>ch</tt> is one of the first
    //! <tt>maxCharCount</tt> characters in the sequence.
    // -----------------------------------------------------------------------
    template< typename FwdIter >
    void add(
                            FwdIter begin,
                            FwdIter end,
                            size_t maxCharCount = infinity ) ;
    void add( UTF8Char const& ch ) ;

(You'll note that the second function isn't documented. It was
originally meant to be private, used by the public function.
But it turned out that it was generally useful, so I moved it up
to public. It is, of course, where all of the actual behavior
is situated, since I definitely don't want any real behavior,
which might need corrections, in a template.)
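As a rough sketch of that split between a thin template and a non-template function holding the behavior: the class below is purely hypothetical. SketchSet, addEncoded and isContinuation are invented names, a std::string stands in for UTF8Char, and malformed input is not checked at all.

```cpp
#include <cstddef>
#include <limits>
#include <set>
#include <string>

// Hypothetical illustration only; not the real SetOfCharacter.
class SketchSet
{
public:
    // Thin template: it only cuts the byte sequence into encoded
    // characters and forwards each one. Assumes well-formed UTF-8.
    template< typename FwdIter >
    void add( FwdIter begin,
              FwdIter end,
              std::size_t maxCharCount
                  = std::numeric_limits< std::size_t >::max() )
    {
        while ( begin != end && maxCharCount > 0 ) {
            std::string encoded( 1, static_cast< char >( *begin ) ) ;
            ++ begin ;
            while ( begin != end && isContinuation( *begin ) ) {
                encoded += static_cast< char >( *begin ) ;
                ++ begin ;
            }
            addEncoded( encoded ) ;
            -- maxCharCount ;
        }
    }

    // Non-template: the place where the real work would live.
    void addEncoded( std::string const& encoded )
    {
        myMembers.insert( encoded ) ;
    }

    bool contains( std::string const& encoded ) const
    {
        return myMembers.count( encoded ) != 0 ;
    }

private:
    // A continuation byte has the bit pattern 10xxxxxx.
    static bool isContinuation( char byte )
    {
        return ( static_cast< unsigned char >( byte ) & 0xC0 ) == 0x80 ;
    }

    std::set< std::string > myMembers ;
} ;
```

Calling add( s.begin(), s.end(), 2 ) on the bytes of "a", U+00E9 and "z" would put only the first two characters in the set, which is the maxCharCount behavior described in the documentation above.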

What kind of tests would you write that would generate design
here? And what kind of tests are necessary to ensure that the
results really are correct; i.e. that any character that was
previously a member of the set is a member after this function,
and that any character in the sequence which was passed in is
also a member of the set after the function. And that any
character which wasn't in either isn't in the results. An
exhaustive test is, quite clearly, impossible. (There are
roughly 2^1000000 different possible initial values for set, and
the sequence passed as an argument is conceptually limited only
by the size of the memory.) So what do you have to test to 1)
define a design, and 2) be relatively sure that if the test
passes, you haven't accidentally introduced any new errors?
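One possible answer to 2), sketched under heavy assumptions: compare the set against a trivially correct model (a std::set), and probe only the code points that were touched plus their neighbours and the edges of their blocks of 64. ToyCharSet is a hypothetical stand-in, not the real class, and probesFor and addKeepsSetConsistent are invented names. This shows the shape of such a check; it does not claim that this probe set is sufficient.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-in for the class under test: one 64-bit word
// per block of 64 code points.
class ToyCharSet
{
public:
    void add(std::uint32_t cp)
    {
        myBlocks[cp / 64] |= std::uint64_t(1) << (cp % 64);
    }
    bool contains(std::uint32_t cp) const
    {
        auto it = myBlocks.find(cp / 64);
        return it != myBlocks.end() && ((it->second >> (cp % 64)) & 1) != 0;
    }
private:
    std::map<std::uint32_t, std::uint64_t> myBlocks;
};

// Probe points: each touched code point, its two neighbours, both
// ends of its block, and the adjacent ends of the neighbouring blocks.
std::set<std::uint32_t> probesFor(std::vector<std::uint32_t> const& touched)
{
    std::set<std::uint32_t> probes;
    for (std::uint32_t cp : touched) {
        std::uint32_t blockStart = cp - cp % 64;
        probes.insert(cp);
        if (cp > 0) probes.insert(cp - 1);
        if (cp < 0x10FFFF) probes.insert(cp + 1);
        probes.insert(blockStart);
        probes.insert(blockStart + 63);
        if (blockStart >= 64) probes.insert(blockStart - 1);
        if (blockStart + 64 <= 0x10FFFF) probes.insert(blockStart + 64);
    }
    return probes;
}

// Checks, at every probe point, that membership after the adds is
// exactly "was in the initial set, or was in the added sequence".
bool addKeepsSetConsistent(std::vector<std::uint32_t> const& initial,
                           std::vector<std::uint32_t> const& added)
{
    ToyCharSet set;
    std::set<std::uint32_t> model;
    for (std::uint32_t cp : initial) { set.add(cp); model.insert(cp); }
    for (std::uint32_t cp : added)   { set.add(cp); model.insert(cp); }

    std::vector<std::uint32_t> touched(initial);
    touched.insert(touched.end(), added.begin(), added.end());
    for (std::uint32_t cp : probesFor(touched)) {
        if (set.contains(cp) != (model.count(cp) != 0)) {
            return false;
        }
    }
    return true;
}
```

The cost of such a check grows with the number of characters touched rather than with the size of the domain, which is what matters for the iterative phase.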

For example, I've just added an all-modules-failed flag to a
system; to add this I used the following simple tests:

testHaveAllRegisteredModulesFailedFalseWithNoModules();
testHaveAllRegisteredModulesFailedFalseWithGoodModules();
testHaveAllRegisteredModulesFailedFalseWithOneGoodAndOneBadModule();
testHaveAllRegisteredModulesFailedTrueWithAllBadModules();

The application has 1100 tests which run in about 50 seconds
(unoptimised) on my machine. If the test time gets above 60
seconds, we profile and optimise.

I've actually done some profiling, and have doubled the speed.
On one of my machines, that meant that it went from six hours,
to three. Obviously, I'm testing too much (and maybe not enough
as well).

The test framework (CppUnit) makes it easy to run tests for
one module, which we usually do if we are only working in one
area.

This is a single, very low level module.

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34