Re: write binary representation to output

From: "Alf P. Steinbach" <alfps@start.no>
Newsgroups: comp.lang.c++
Date: Sun, 20 Apr 2008 00:51:12 +0200
Message-ID: <M6SdnZltsaj_6pfVnZ2dnUVZ_r2nnZ2d@comnet>

* James Kanze:

On 19 Apr, 10:51, "Alf P. Steinbach" <al...@start.no> wrote:

* wongjoek...@yahoo.com:

I was wondering what the C++ code would look like if I
want to write the binary notation of an unsigned char, or of
any other basic data type such as int, unsigned int, float
and so forth, to standard output.

E.g.

<code>
#include <iostream> // std::cout, std::ostream
#include <ostream> // operator<<, std::endl
#include <bitset> // std::bitset
#include <climits> // CHAR_BIT

static unsigned const bitsPerByte = CHAR_BIT;

template< typename T >
struct BitSize
{
enum { value = bitsPerByte*sizeof( T ) };
};

Just curious, but do you really need this?

It's a templated compile time constant.

std::bitset requires a compile time constant as template parameter.

It's a silly design, and I'd use something else if the standard library offered
something more reasonable (constrained to only valid data) than std::string.

template< typename T >
std::bitset< BitSize<T>::value > bitsetFrom( T const& v )
{
     typedef std::bitset< BitSize<T>::value > BitSet;

     BitSet result;
     unsigned char const* p = reinterpret_cast<unsigned char const*>( & v );

And why this?

Mostly in order to deal with 'float', as the OP requested.

Otherwise std::bitset can be constructed directly from the value.

// Uses little-endian convention for bit numbering.

Which I think the standard requires.

I'm not sure about that, but it would be nice if the standard had requirements
that mean a direct construction of bitset from e.g. int produces the same result
as this function. My intention was to not violate such requirements if they exist.

The problem is that you're
also assuming little-endian for the byte order, which is the
exception, not the rule (in terms of number of architectures,
not number of machines).

Uhm, sorry, there is no such thing as little-endian with some other byte order.

Little endian means bit numbering increases in same direction as memory
addresses, for any size unit.

     for( size_t i = sizeof(T)-1; i != size_t(-1); --i )
     {
         result <<= bitsPerByte;
         result |= BitSet( p[i] );
     }
     return result;
}
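Since the function above is split up by the interleaved comments, here is a consolidated sketch of it in one piece. It assumes a C++11 or later compiler (for the argument-less to_string()), and bitStringFrom is a hypothetical convenience wrapper that is not part of the original post. As discussed above, the byte at the lowest address is treated as the least significant one, so for multi-byte types the result matches the value only on machines that store the low-order byte first.

```cpp
#include <bitset>   // std::bitset
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t
#include <string>   // std::string

template< typename T >
struct BitSize
{
    enum { value = CHAR_BIT*sizeof( T ) };
};

// Copies the object representation of v into a bitset, treating the byte
// at the lowest address as the least significant one.
template< typename T >
std::bitset< BitSize<T>::value > bitsetFrom( T const& v )
{
    typedef std::bitset< BitSize<T>::value > BitSet;

    BitSet result;
    unsigned char const* p = reinterpret_cast<unsigned char const*>( &v );

    // Walk from the highest address down, so p[0] lands in the lowest bits.
    for( std::size_t i = sizeof( T ); i-- > 0; )
    {
        result <<= CHAR_BIT;
        result |= BitSet( p[i] );
    }
    return result;
}

// Hypothetical wrapper (not in the original post): the bits as text.
template< typename T >
std::string bitStringFrom( T const& v )
{
    return bitsetFrom( v ).to_string();
}
```

For an unsigned char holding 0x1C, bitStringFrom yields "00011100" where CHAR_BIT is 8. For float and double the output is the raw object representation, so it depends on the platform's floating point format and byte order.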

Maybe I'm misunderstanding something, but when someone says
something like "binary notation to standard out", I imagine
something like "00011100" (for 0x1C).

Yes?

I'm not really sure what
he's looking for when he mentions float, but for the unsigned
integral types, something like the following should do as a
first approximation:

    template< typename T >
    class Binary
    {
    public:
        explicit Binary( T value )
            : myValue( value )
        {
        }
        friend std::ostream&operator<<(
            std::ostream& dest,
            Binary< T > const& value )
        {
            T tmp = value.myValue ;
            std::string s ;
            do {
                s += '0' + (tmp & 1) ;
                tmp >>= 1 ;
            } while ( tmp != 0 ) ;
            std::reverse( s.begin(), s.end() ) ; // needs <algorithm> and <string>
            dest << s ;
            return dest ;
        }
    private:
        T myValue ;
    } ;

    template< typename T >
    inline Binary< T >
    binary( T value )
    {
        return Binary< T >( value ) ;
    }

Well, see below: the above doesn't really require a class or anything. But I'm
less concerned about that than about the fact that both your and Juha's solutions
mix data representation and i/o. I'd prefer a function that returns a pure
data representation of the binary (I used a bitset, but a string, although not
ideal in the sense of constraints on value, would be acceptable).

[Usage]:

unsigned i = 42 ;
std::cout << binary( i ) << std::endl ;

The code I posted earlier mainly tackles float and double in addition to integrals.

For the example above you don't need such code, because you can just do

unsigned const i = 42;
std::cout << std::bitset<CHAR_BIT*sizeof(i)>( i ) << std::endl;

Note the reduction in number of lines, to just 2 (no support class). :-)
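To illustrate keeping the data representation separate from the i/o, the same bitset construction can be wrapped so that it returns a string instead of writing to a stream. This is only a sketch: binaryText is a hypothetical name, and the argument-less to_string() assumes C++11 or later.

```cpp
#include <bitset>   // std::bitset
#include <climits>  // CHAR_BIT
#include <string>   // std::string

// Hypothetical helper (not in the original post): the full-width binary
// text of an unsigned value, with leading zeros and without any i/o.
inline std::string binaryText( unsigned value )
{
    return std::bitset< CHAR_BIT*sizeof( unsigned ) >( value ).to_string();
}
```

The caller then decides what to do with the text, e.g. std::cout << binaryText( 42 ) << std::endl. Unlike the digit loop quoted earlier, the result is always padded to the full width of the type.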

Most of the formatting flags (e.g. width) are handled correctly,
by the << operator for string.
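A small sketch of what that means in practice: because the digits are written as one std::string, width and fill set on the stream apply to the whole number. The Binary class is repeated here, with the member declaration and headers it needs, so that the sketch compiles on its own; paddedBinary is a hypothetical helper added only for illustration.

```cpp
#include <algorithm> // std::reverse
#include <iomanip>   // std::setw, std::setfill
#include <ostream>   // std::ostream
#include <sstream>   // std::ostringstream
#include <string>    // std::string

template< typename T >
class Binary
{
public:
    explicit Binary( T value ) : myValue( value ) {}

    friend std::ostream& operator<<( std::ostream& dest, Binary const& value )
    {
        T tmp = value.myValue;
        std::string s;
        do {                                    // least significant digit first
            s += static_cast<char>( '0' + (tmp & 1) );
            tmp >>= 1;
        } while ( tmp != 0 );
        std::reverse( s.begin(), s.end() );
        return dest << s;                       // string insertion honours width/fill
    }

private:
    T myValue;
};

template< typename T >
inline Binary< T > binary( T value )
{
    return Binary< T >( value );
}

// Hypothetical helper: formats via a string stream with the given width and fill.
inline std::string paddedBinary( unsigned value, int width, char fill )
{
    std::ostringstream out;
    out << std::setfill( fill ) << std::setw( width ) << binary( value );
    return out.str();
}
```

With this, paddedBinary( 5, 8, '.' ) gives ".....101". As in the original, the class is only meant for unsigned integral types.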

Handling signed values is a bit more tricky, because you need
the unsigned equivalent for the tmp, or else some special code
for handling the fact that on most machines, there is one
negative value which doesn't have a positive counterpart. (If
you're lucky enough to be working on a 1's complement machine or
a signed magnitude machine, there's no problem.)

Well, I'm not sure that that's really a problem with your code.

I think your code would work just fine (for integral types).
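For signed types, one way to get the "unsigned equivalent" mentioned above is to convert to the corresponding unsigned type before shifting. The following is a sketch under assumptions, not code from this thread: it relies on std::make_unsigned from C++11's <type_traits>, and binaryDigits is a hypothetical name. The conversion to unsigned is modular, so right-shifting never involves a negative value.

```cpp
#include <algorithm>    // std::reverse
#include <string>       // std::string
#include <type_traits>  // std::make_unsigned (C++11)

// Hypothetical helper: binary digits of any integral value, without
// leading zeros. Negative values show the bit pattern of their
// unsigned (modulo 2^N) equivalent.
template< typename T >
std::string binaryDigits( T value )
{
    typedef typename std::make_unsigned< T >::type U;

    U tmp = static_cast< U >( value );  // well-defined modular conversion
    std::string s;
    do {
        s += static_cast<char>( '0' + (tmp & 1) );
        tmp >>= 1;                      // unsigned shift: zeros come in from the left
    } while ( tmp != 0 );
    std::reverse( s.begin(), s.end() );
    return s;
}
```

For example, binaryDigits( 5 ) gives "101", and a signed char holding -1 gives "11111111" where CHAR_BIT is 8.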

Anyway, my code works just fine. :-)

Cheers,

- Alf