Re: Can I avoid the use of arrays in this situation or do I have to use them?

From:

"Alf P. Steinbach" <alfps@start.no>

Newsgroups:

comp.lang.c++

Date:

Sat, 24 Nov 2007 09:03:15 +0100

Message-ID:

<13kfmm87nd5s0e@corp.supernews.com>

* mike3:

On Nov 23, 3:02 pm, "Alf P. Steinbach" <al...@start.no> wrote:
<snip>

Why do you have the length arguments?

A std::vector has a length, accessible via size() and resize().

I thought resize() is to do just that, resize the vector,
not get it's length.

It sets the length. Plus allocates if necessary, and initializes
elements that become accessible.

Also, I have then length arguments
so I can use lengths different from those of the vector
if one needs to do that for some reason.

Yeah, but /do/ you need that?

Without needing it the passed lengths are just one more thing that can
go wrong.

{
       /* Set up temporary digit buffer */
       Digit tmpbuf[2*MAX_PRECISION]; // use array instead of vector
to
                                      // maximize performance. Vector
                                      // takes forever to constantly
de/
                                      // reallocate.

Have you measured this?

Sure did. Took ~34 secs, for 100,000,000 calls, with all
compiler optimizations turned on.

I just reduced the routine to this:

FG3DError FG3DDigitVector_Mul(std::vector<Digit> &vec0,
                         const std::vector<Digit> &vec1,
                         const std::vector<Digit> &vec2,
                         int length1, int length2)
{
     std::vector<Digit> tmpbuf(length1 + length2);
     return(FG3DError(FG3D_SUCCESS));
}

and just ran it over and over 100,000,000 times to
see how long that allocate/free takes and I was
stunned to find it wastes 34 seconds, more time than
the entire multiplication loop!

Well, it comes down to usage pattern.

If the out-parameter is generally an empty vector or of less capacity
than required for the result, then there will be an allocation anyway in
most cases. And then the most efficient will generally be to declare
the temporary as a vector, and swap at the end.

On the other hand, if the out-parameter is generally of sufficient
capacity, then from a local point of view the most efficient will
generally be to reuse that capacity, i.e. to use a raw array for the
temporary or to build the result right into the out-parameter.

However, reuse of capacity runs the risk of "leaking" memory in the
sense that the capacity is never lowered, only increasing.

So measuring for actual usage, typical usage pattern, is critical.

[snip]

Ensure class invariants in operations so that you don't have to pepper
the code with checks all over the place.

In other words,

   class Bignum
   {
   private:
       std::vector<Digit> myDigits;
   public:
       // Whatever, all operations assume a max length
       // and ensure that max length.
   };

The problem is I need to be able to work on pieces of the
digit string as well as the entire string with some
operations (addition, subtraction).

Sounds like slices.

The standard library isn't very well equipped when it comes to slices:
you may have to build that yourself.

There is std::valarray, which I think supports slices, but that's a
not-quite-finished piece of the library. If it works, great. If not...

And if the operations
ensure the max length does not that require those if()
checks in them to make sure the length parameter passed
does not exceed the limit?

No, the point is to not pass separate length parameters, that whatever
object you're passing knows its length.

But again, it sounds as if you're dealing with slices.

Then, if so, then Bignum (or whatever) needs to support slices,
efficiently. In the operation that creates a logical slice there will
be a length check. But that will then be the only one, centralized.

[snip]

       /* Do the multiplication */
       for(int i(0);i<length1;i++,++v1i)
       {
          TwoDigit carry;
          carry = 0;
          v2i = v2init;
          for(int j(0);j<length2;j++,++v2i)
          {
             TwoDigit
tmp((static_cast<TwoDigit>(*v1i)*static_cast<TwoDigit>(*v2i))
                          +static_cast<TwoDigit>(tmpbuf[i+j])+carry);

This looks rather suspicious.

However, you don't provide definitions of Digit and TwoDigit, so
difficult to say.

TwoDigit can be any unsigned type that has twice the bitlength
of Digit. You could use unsigned int (32 bit) for TwoDigit,
unsigned short (16 bit) for Digit, for example.

Well, the ordinary arithmetic promotions were designed to handle this
kind of stuff.

It's enough that one operand of an operation is of type TwoDigit in
order to force the other operand to be promoted.

This is one case where I'd forsake the newfangled casts for clarity,

TwoDigit tmp = TwoDigit(*v1i)*(*v2i) + tmpbuf[i+j] + carry;

"TwoDigit(*v1i)" is a syntax that was introduced in C++ to make it look
like a constructor call, but is defined for a primitive type as
equivalent to a C style cast "((TwoDigit)*v1i)". I.e. in some other
context it could invoke reinterpret_cast or const_cast or even cast to
inaccessible base. I don't know why it wasn't defined as static_cast.
Probably an oversight or misunderstanding: they bungled it.

             carry = tmp>>DIGIT_SIZE;
             tmp &= MAX_DIGIT;
             tmpbuf[i+j] = tmp;
          }
          tmpbuf[i+length2] = carry;
       }
       /* Copy result back */
       vec0.reserve(length1 + length2);
       std::copy(tmpbuf, tmpbuf + length1 + length2, vec0.begin());

Don't modify out-arguments directly.

Create the result in a local variable, then at the end swap().

For exception safety.

The result is already being built in the temp buffer, that's the
point. And does this mean one cannot resize the output vector if
it cannot hold the full size product? Or will swap() do that
automatically? Is swap() slower than the above?

You can always resize. Whether swap is more or less efficient depends
on the context, in effect on the usage pattern for the calls of this
function. However, given your measurements I think what I wrote above
is too categorical: you'll need to try out both approaches (array temp +
resize + copy, or vector temp + swap) in the actual typical usage,
measuring.

Happily there's not much difference regarding the rest of the code.

Cheers, & hth,.

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?