Re: Converting float to long bits

From:

James Kanze <james.kanze@gmail.com>

Newsgroups:

comp.lang.c++

Date:

Thu, 10 Jun 2010 01:46:03 -0700 (PDT)

Message-ID:

<6153e970-752a-4dd8-b1ed-df05128fb243@q12g2000yqj.googlegroups.com>

On Jun 9, 9:05 pm, Marcel Müller <news.5.ma...@spamgourmet.com> wrote:

James Kanze wrote:

Apparently, the compiler has inlined the code for the function.
Putting it in a separate translation unit should solve this
problem, most of the time.

As long as the function implementation does not share a similar problem.
In fact it does not.

[...]

given toI32Bits, defined as above, the only reason it might not
work is because the compiler authors are being perverse, and are
trying intentionally to trip you up, even at the expense of
ignoring the intent of the standard.

:-)

A bug report would not be that bad.

The behavior is intentional.

[...]

The C functions frexp and ldexp provide a defined way to
operate with floating point values.

If you want to be really, really portable, such functions are
the only way. But suppose you need to output floats in IEEE
binary format (e.g. for XDR), and your portability requirements
only include Windows and the major Unix platforms. All of which
use IEEE internally. Some ugly type punning, like the above,
can be significantly faster (and results in a lot shorter code).
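For comparison, here is a minimal sketch of the fully portable route via frexp and ldexp. The function name toIeeeBitsPortable is hypothetical (it does not appear in the thread), and the sketch assumes the value is zero or a finite number that is a normal IEEE single; infinities, NaNs and denormals would need extra handling. It uses std::signbit, so it needs C++11.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: builds the IEEE 754 single-precision bit pattern
// arithmetically, without looking at the object representation.
// Assumption: value is zero or a finite, normal IEEE single.
std::uint32_t toIeeeBitsPortable(float value)
{
    std::uint32_t sign = std::signbit(value) ? 0x80000000u : 0u;
    if (value == 0.0f) {
        return sign;                    // +0.0 or -0.0
    }
    int exp = 0;
    // frexp returns mant in [0.5, 1) with fabs(value) == mant * 2^exp.
    float mant = std::frexp(std::fabs(value), &exp);
    // IEEE stores 1.fraction * 2^(exp - 1), with an exponent bias of 127.
    std::uint32_t biased = static_cast<std::uint32_t>(exp - 1 + 127);
    // Scale the mantissa to a 24-bit integer and drop the implicit leading 1.
    std::uint32_t frac =
        static_cast<std::uint32_t>(std::ldexp(mant, 24)) & 0x007FFFFFu;
    return sign | (biased << 23) | frac;
}
```

This is noticeably longer than the type punning versions, which is the trade-off described above.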

Anyway, one safe way of doing it is by using memcpy; the
compiler isn't allowed to mangle that one, since the integer and
the float are in fact two different variables, and the copy uses
void*, which the compiler must consider as a possible alias,
regardless of the other type.
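A minimal sketch of the memcpy approach follows. The name floatToBits is hypothetical, and the code assumes float is 32 bits wide (checked at compile time with static_assert, which needs C++11); the meaning of the resulting bits still depends on the platform using IEEE format.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the object representation of a float
// into a 32-bit unsigned integer. The float and the integer are two
// distinct objects, so no aliasing rule is violated.
std::uint32_t floatToBits(float value)
{
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

On a platform that uses IEEE single precision, floatToBits(1.0f) yields 0x3F800000. Compilers commonly reduce a fixed-size memcpy like this to a single register move, so it need not cost more than the cast.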

A union should do the job as well. And it results in more or less
the same UB as before.

There's a long history in this. Historically, before ISO C,
a union was the "approved" way of doing such type punning
(although I've used pre-standard compilers on which it didn't
work). For some reason, the ISO C committee more or less
blessed the pointer cast approach---still undefined behavior,
but the intent was that the behavior be what someone familiar
with the architecture would expect. The idea was, I think, to
allow some sort of "discriminating" implementation of a union,
for debugging; one which would crash the program if you accessed
a different member than the last one assigned to.

Practically, the actual wording was such that it guarantees
things I don't think the committee meant to guarantee, e.g.:

    int f(float* f, int* i)
    {
        int retval = *i;    // read through the int pointer first
        *f = 3.14159;       // then write through the float pointer
        return retval;
    }

and in a different translation unit:

    union U { float f; int i; };
    U u;
    u.i = 42;
    printf( "%d", f( &u.f, &u.i ) );

As currently worded, both C and C++ claim that this is
guaranteed to work. G++ breaks it, and practically speaking, if
the non-aliasing guarantees are to be of any use at all, it
should be legally broken; in this case, I'd go with g++, and say
that it is the standard that is broken in requiring the above to
work.

    long toI32Bits(float value)
    {
        union
        {
            float f;
            long l;
        } data;
        data.f = value;
        return data.l;
    }

does the job. Even when the function is inlined. Of course,
the code is a bit less efficient.

It shouldn't be, once the compiler gets through with it. It
offers no more guarantees than the original version; in fact, it
offers fewer, in the sense that there isn't even an intent in the
standard to support it. G++ does guarantee it (provided all of
the accesses to the union are in the same function, or some
similar restriction), but I've used other compilers where it
didn't work, and the cast did.

From a practical point of view, a responsible compiler writer
will make both the cast and the union work, *provided* all use
is local to the function, where the compiler can easily detect
that the aliasing guarantees are broken.

--
James Kanze