Re: bytes to unsigned long

From: James Kanze <james.kanze@gmail.com>
Newsgroups: comp.lang.c++
Date: 11 May 2007 00:38:54 -0700
Message-ID: <1178869134.694610.113770@l77g2000hsb.googlegroups.com>

On May 10, 10:48 pm, Gianni Mariani <gi3nos...@mariani.ws> wrote:

On May 10, 6:56 pm, James Kanze <james.ka...@gmail.com> wrote:

On May 9, 9:35 am, Gianni Mariani <gi3nos...@mariani.ws> wrote:

moumita wrote:

I need to convert 4 bytes to an unsigned long.
Suppose I have an array like unsigned char buff[4]. I need to convert
these 4 bytes into a single
unsigned long. Is the following piece of code right? Or is it the
right approach to do that?
unsigned long temp;
temp = (unsigned long) buff[3];
temp |= ((unsigned long) buff[2]) << 8;
temp |= ((unsigned long) buff[1]) << 16;
temp |= ((unsigned long) buff[0]) << 24;
Waiting for your suggestions.

You may need to worry about endianness...

His code handles endianness transparently. That's why he wrote
it like that.

Are you sure it should not be 0,1,2,3 instead of 3,2,1,0? I.e., is
the wire order b/e or l/e?

It depends on the protocol. Presumably, his code is specific to
the protocol. His code implements big endian, which is correct
for all of the Internet protocols, for fixed width integers in
BER, and for most other protocols. (FWIW: I don't know of a
little endian protocol.)
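
To make the point concrete, here is a minimal sketch of the same shift-and-or technique wrapped in a function. The name decode_be32 is hypothetical and does not come from the thread. Because it only does arithmetic on the byte values and never looks at how the result is stored, it gives the same answer whatever the internal ordering of the machine.

```cpp
// Hypothetical helper (the name is an assumption, not from the thread):
// assemble a 32-bit value from four bytes received in big endian order.
// Only shifts and ors on values are used, so the byte order of the
// machine running the code never enters into it.
inline unsigned long decode_be32(const unsigned char* buf)
{
    return (static_cast<unsigned long>(buf[0]) << 24)
         | (static_cast<unsigned long>(buf[1]) << 16)
         | (static_cast<unsigned long>(buf[2]) <<  8)
         |  static_cast<unsigned long>(buf[3]);
}
```

Called as decode_be32(buff), it yields the same value as the four statements in the original question.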

The only choice we need to make in the
NetworkOrder class is whether a true or a false is needed. The
NetworkOrder class may have many other issues (it's not really
copyable, but you never really should copy it; it's strictly UB, but
it works and will need to continue to work (due to ABI issues) for a
very long time),

Your code assumed two possible orders, both for the line and for
the internal representation. In practice, there is only one for
the line, except perhaps for some special in house protocols.
On the other hand, I've actually seen 3 different internal
orders (not just 2). His code is transparent to the internal
ordering.

I attached an example of how you can do it. It's kind of the whole hog;
it allows you to simply reinterpret_cast and read the value in the
correct byte order.
[xx_endian.cpp]

template <class base_type, bool wire_is_big_endian = true >

Question: we're talking about a four byte entity here. There
are 24 different byte orders possible. I've actually seen at
least three. How do you represent this with a bool?

I have only seen 2 endiannesses that *I* have ever needed to support.
If someone cares about different orders, they're welcome to extend the
class.

Fine. I've actually seen and used three different internal
orderings. All on very widely used machines---nothing exotic.
(But you've probably never heard of MS-DOS, or PDP-11's. All
the world is Windows.)
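
As an illustration of what "internal ordering" means here, the following is a minimal sketch, not code from the thread. It classifies how the 32-bit value 0x01020304 is laid out in memory: big endian, little endian, or the PDP-11's mixed order (high 16-bit word first, bytes swapped within each word). The names ByteLayout, classify_layout and host_layout are hypothetical, and taking these to be the three orderings meant above is an assumption.

```cpp
#include <cstdint>
#include <cstring>

enum ByteLayout { layout_big, layout_little, layout_pdp, layout_other };

// Classify the in-memory image of the value 0x01020304.
// b must point to the four bytes of that image.
inline ByteLayout classify_layout(const unsigned char* b)
{
    static const unsigned char big[4]    = { 1, 2, 3, 4 };
    static const unsigned char little[4] = { 4, 3, 2, 1 };
    static const unsigned char pdp[4]    = { 2, 1, 4, 3 };
    if (std::memcmp(b, big, 4) == 0)    return layout_big;
    if (std::memcmp(b, little, 4) == 0) return layout_little;
    if (std::memcmp(b, pdp, 4) == 0)    return layout_pdp;
    return layout_other;   // any of the remaining permutations
}

// Inspect the machine the code is running on.
// Assumes std::uint32_t occupies exactly four 8-bit bytes.
inline ByteLayout host_layout()
{
    const std::uint32_t probe = 0x01020304;
    unsigned char bytes[4];
    std::memcpy(bytes, &probe, 4);   // copy the object representation
    return classify_layout(bytes);
}
```

Code written with shifts, as in the original question, never needs such a test. The sketch only shows why a single bool cannot describe every case.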

His original code was much cleaner, easier to understand, and
far more portable.

You know better than to say that to me.

Why? Because you know it all, and won't listen, even to people
who have considerably more experience than you. (The code you
posted is what I would consider amateurish, and would certainly
fail code review anywhere I've worked.)

The "Mariani Minimum Complexity Proposition" suggests that any
complexity you can place in a library is better than complexity placed
in all the other locations in the code. Why and/or when is
"std::string" better than "char *"?

So what does that have to do with anything here? You've got a
block of extremely hard to read, hard to modify, overly complex
code which doesn't handle as many real cases as the original.

i.e.

unsigned long val = wire_buffer.val;

and

wire_buffer.val = val;

is a whole lot easier to write and maintain than:

unsigned long temp;
temp = (unsigned long) buff[3];
temp |= ((unsigned long) buff[2]) << 8;
temp |= ((unsigned long) buff[1]) << 16;
temp |= ((unsigned long) buff[0]) << 24;

Obviously, this is in a library somewhere. The use is (almost)
exactly the same. (Actually, my own code for this is in an
ixdrstream/oxdrstream class, using the iostream idiom. So you
write:

source >> val1 >> val2 ...

where source is an ixdrstream, using a streambuf connected to
the socket.)
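
To show what that idiom can look like, here is a minimal sketch. It is not the actual ixdrstream class, whose code does not appear in this thread. The class name BeInput and its good() member are hypothetical. It reads four bytes at a time from a std::streambuf and assembles them in big endian order with the same shifts as above. A std::stringbuf can stand in for the socket streambuf when trying it out.

```cpp
#include <sstream>     // std::stringbuf, handy as a stand-in for a socket
#include <streambuf>
#include <string>

// Hypothetical, much simplified stand-in for an ixdrstream-like class.
class BeInput
{
public:
    explicit BeInput(std::streambuf* sb) : sb_(sb), ok_(sb != 0) {}

    // Read one 32-bit big endian value. On end of input the stream is
    // marked bad and value is left untouched.
    BeInput& operator>>(unsigned long& value)
    {
        typedef std::streambuf::traits_type traits;
        unsigned long result = 0;
        for (int i = 0; ok_ && i < 4; ++i) {
            const traits::int_type c = sb_->sbumpc();
            if (traits::eq_int_type(c, traits::eof())) {
                ok_ = false;
            } else {
                const unsigned char byte =
                    static_cast<unsigned char>(traits::to_char_type(c));
                result = (result << 8) | byte;
            }
        }
        if (ok_) {
            value = result;
        }
        return *this;
    }

    bool good() const { return ok_; }

private:
    std::streambuf* sb_;
    bool ok_;
};
```

With a BeInput named source, the client code is the same source >> val1 >> val2 shown above. The byte handling lives in one place, and it still does not depend on the internal ordering.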

We're talking here about the code you put into the library, not
about the interface of the library.

... the other 6 lines of code for writing it.

Oh - and if you ever need to support one of those other 22 endian
types, all the code is in one place to fix that.

The trick is, of course, that his code handles the internal
representation transparently, regardless of what it is. Neither
yours nor his (nor mine) handles "exotic" representations,
however. Some of which (e.g. variable length ints in BER) are
fairly widespread.
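
For completeness, the writing direction mentioned earlier can be handled the same way. This is a minimal sketch under the same assumptions as the reading code (a fixed width 32-bit value, big endian on the wire). The name encode_be32 is hypothetical.

```cpp
// Hypothetical counterpart to the reading code: store the low 32 bits of
// value into buf[0..3] in big endian order. Only shifts and masks on the
// value are used, so the internal ordering of the machine is irrelevant.
inline void encode_be32(unsigned long value, unsigned char* buf)
{
    buf[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    buf[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    buf[2] = static_cast<unsigned char>((value >>  8) & 0xFF);
    buf[3] = static_cast<unsigned char>( value        & 0xFF);
}
```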

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34