Re: Which is a better way to implement this algorithm?
On Apr 20, 1:43 am, "Ivan Vecerina"
<_INVALID_use_webfo...@ivan.vecerina.com> wrote:
"mike3" <mike4...@yahoo.com> wrote in message
<snip>
Assuming no truncation (because of adequate pre-allocation
of result digits), the same algorithm can be written as:
for( unsigned aPos = 0 ; aPos<aLength ; ++aPos )
{
Digit carry = 0;
TwoDigit const aDig = a.digits[aPos]; //NB: cast "hidden" here
for( unsigned bPos = 0 ; bPos<bLength ; ++bPos )
{
TwoDigit mul = aDig * b.digits[bPos]
+ this->digits[aPos+bPos]
+ carry;
this->digits[aPos+bPos] = mul & digitBitMask;
carry = ( mul >> dig=
itBitCount );
}
this->digits[aPos+bPos] = carry;
}
There are many correct ways to write this, and some are
probably better is some ways than this example. But I
hope that you will find it useful.
In any case, given the very acceptable complexity of this loop,
I would not bother breaking it up or adding layers of
encapsulation.
Unfortunately, I noticed this last implementation where the
cast is "hidden" seems to lose performance. My benchmark
returned 17 seconds for this vs 12 for a similar routine where the
only difference is the cast is not "hidden" by assigning the digit
to a TwoDigit64.
I'm wondering if maybe this is because the compiler has done
a 32x32 multiply when using the one-liner with internal cast, and
returns 64-bit result, but when you try to "hide" the cast it tries
to do a 64x32 or even 64x64 multiply since it sees one 64-bit
operand coming in, which wastes time. I guess it depends on the
optimizer. In the one with the cast the optimizer can "see" that
both operands are 32-bit, so it could "know" to only do 32x32->64
instead of 64x32->64 or 64x64->64 (mod mul by 2^64).
Is this a good theory? Should I go with the casts?