Re: question regarding java puzzlers #2
<blmblm@myrealbox.com> wrote:
> I would have guessed [...] it would be possible to
> choose a value decimalNum in that range that could be expressed
> in base 10 with a number of significant digits appropriate for the
> number of bits of mantissa, and that converting a string representation
> of decimalNum to floating point would yield floatNum again.
It is possible to define such a reversible conversion. In fact, the
methods Double.toString and Double.parseDouble are exactly such a pair
of reversible transformations between double and String, with no loss of
information. The fact that the conversion is reversible doesn't mean it
is correct, or that it's a good idea to use it for intermediate results.
Adding one to integers is also a reversible transformation, but it still
gives you a wrong answer.
In essence, there are several distinct, but similar, scenarios where you
want to convert something to decimal. Different techniques are
appropriate for different scenarios.
Scenario I: You need a lossless conversion back and forth between double
and a decimal type. This is what
Double.toString and Double.parseDouble are designed for. BigDecimal's
valueOf(double) also does it.
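For example, a minimal sketch of that round trip (not from the original
exchange; the class name and the choice of Math.sqrt(2) as a test value
are just illustration):

public class RoundTripDemo
{
    public static void main(String[] args)
    {
        double original = Math.sqrt(2);   // an arbitrary "ugly" double

        // Double.toString picks a short decimal string that still maps
        // back to exactly the same double, so parsing loses nothing.
        String text = Double.toString(original);
        double restored = Double.parseDouble(text);

        System.out.println(text);
        System.out.println(restored == original);   // prints true
    }
}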
Scenario II: You need a conversion to decimal for values that are read
from a user or from some device that supplies its input as short decimal
numbers. If the values are stored in the double data type (which is
only appropriate if their original precision is much less than that of
a double), and you later want to convert them
to BigDecimal or String, then BigDecimal.valueOf or Double.toString are
the way to go, because the bias toward short decimal representations
actually helps you recover the intended value.
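A small sketch of that bias (again not from the original post; "0.1"
stands in for whatever short decimal the user typed):

import java.math.BigDecimal;

public class UserInputDemo
{
    public static void main(String[] args)
    {
        // Suppose the user typed "0.1"; the double holds only the
        // nearest binary value, not 0.1 exactly.
        double typed = Double.parseDouble("0.1");

        // valueOf goes through Double.toString, so it recovers the
        // short decimal the user intended.
        System.out.println(BigDecimal.valueOf(typed));
        // prints 0.1

        // new BigDecimal shows the exact binary value that was stored.
        System.out.println(new BigDecimal(typed));
        // prints 0.1000000000000000055511151231257827021181583404541015625
    }
}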
Scenario III: You are working with values from a source that's not
biased toward round numbers in decimal, and you need to communicate
them to another system that isn't aware of your internal data types,
so a completely reversible conversion isn't feasible.
This is the tougher case. There are trade-offs between communication
bandwidth/size and accuracy, but on average it's better for accuracy to
use the full decimal value of your double -- that is, new BigDecimal(val),
or if you need a string, new BigDecimal(val).toString(). The reason this
is better "on average" rather than "all the time" is that, as you noted,
the double already contains binary rounding error. It's
possible that you'll get lucky, and the decimal rounding error will be
in the opposite direction from the binary rounding error, so that they
cancel each other out -- but smart money is on the rounding error
accumulating instead.
If you want to make appropriate trade-offs between size and accuracy,
you can do it explicitly using setScale.
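A sketch of what that can look like (the scale of 20 below is an
arbitrary choice, not a recommendation):

import java.math.BigDecimal;
import java.math.RoundingMode;

public class FullValueDemo
{
    public static void main(String[] args)
    {
        double dRoot = Math.sqrt(2);

        // Keep every bit of the double: the exact decimal expansion of
        // the binary value, with no second rounding step.
        BigDecimal full = new BigDecimal(dRoot);
        System.out.println(full);

        // Trade size against accuracy explicitly by choosing a scale.
        BigDecimal trimmed = full.setScale(20, RoundingMode.HALF_EVEN);
        System.out.println(trimmed);
    }
}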
You have appropriate choices for all of these scenarios. If you're
uncomfortable with the behavior of BigDecimal when it's constructed with
a double parameter, that's understandable: converting double to
BigDecimal is not a common need, and when it does come up it's most often
in Scenario II, where rounding the number in a reversible way is the
appropriate choice; in fact, I've only ever used new BigDecimal(double)
for demonstrating things on newsgroups. But it does have a specific
well-defined behavior that can be valuable, and other valuable behaviors
can be achieved in other ways.
> Your experiment below, though, suggests that maybe this isn't possible,
> or that it's trickier than it seems.
Patricia's experiment suggests that rounding to the nearest even decimal
number is likely to harm the accuracy of a resulting calculation when
you didn't have any reason to expect the right answer to have a short
decimal representation to begin with. It doesn't guarantee that
reversing the rounding is impossible (in fact, it is possible, as you
could see by using the doubleValue method on the result).
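For instance (a sketch reusing the variable names from the test below),
the rounded value still converts back to the exact double it came from:

import java.math.BigDecimal;

public class ReverseDemo
{
    public static void main(String[] args)
    {
        double dRoot = Math.sqrt(2);
        BigDecimal roundedBigRoot = BigDecimal.valueOf(dRoot);

        // The decimal rounding is reversible: doubleValue recovers the
        // exact double we started with.
        System.out.println(roundedBigRoot.doubleValue() == dRoot);
        // prints true
    }
}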
Actually, Patricia's result was somewhat fortuitous. By rounding to 17
places, she got an answer that was further off. If she'd rounded to 16
or 18 places (or used BigDecimal.valueOf), she'd have gotten something
closer. That's a fluke of the test case she used, though. Here's a
modified test that demonstrates that on average, using BigDecimal's
valueOf, which does what you expected, gives poorer accuracy on square
roots of integers than new BigDecimal does. In other words, if you
introduce new decimal rounding inaccuracies on top of the already-
suffered binary rounding inaccuracies, you will occasionally luck out
and round in the correct direction; but on average, you'll end up
considerably worse off.
import java.math.BigDecimal;

public class Test
{
    public static void main(String[] args)
    {
        // Running total of (unrounded error - rounded error); it goes
        // negative when the second, decimal rounding step is doing harm.
        BigDecimal error = BigDecimal.ZERO;
        for (int square = 1; square < 500; square++)
        {
            BigDecimal bigSquare = BigDecimal.valueOf(square);
            double dRoot = Math.sqrt(square);

            // Exact decimal expansion of the double square root...
            BigDecimal bigRoot = new BigDecimal(dRoot);
            // ...and the same value after a second, decimal rounding.
            BigDecimal roundedBigRoot = BigDecimal.valueOf(dRoot);

            // Distance of each version's square from the true square.
            BigDecimal rawError = bigSquare.subtract(
                    bigRoot.multiply(bigRoot));
            BigDecimal roundedError = bigSquare.subtract(
                    roundedBigRoot.multiply(roundedBigRoot));

            error = error.add(rawError.abs());
            error = error.subtract(roundedError.abs());
            System.out.println("Error: " + error);
        }
    }
}
Although the first few integers end up closer with the rounded version,
the extra rounding eventually does more harm than good. By the time you
reach 500, the accumulated error from rounding a second time is larger
by something on the order of 10^-12.
--
Chris Smith