Re: Preventing Denial of Service Attack In IPC Serialization

From: Le Chaud Lapin <jaibuduvin@gmail.com>
Newsgroups: comp.lang.c++.moderated
Date: Mon, 4 Jun 2007 15:35:45 CST
Message-ID: <1180984736.931497.89410@k79g2000hse.googlegroups.com>

On Jun 4, 6:16 am, jlind...@hotmail.com wrote:

> You're making a mountain out of a molehill here, Mr Rabbit :)
>
> IPC systems commonly use a concept of a message, with a header and a
> payload. Among other things, the header would contain the length of
> the payload. When the server receives a message from a client, it
> reads the header, and checks the payload length against a preset
> limit. After that, it proceeds with deserialization of the payload,
> and because it already knows the length of the payload, eg 4196 bytes,
> it knows that it should not accept eg a single byte string claiming a
> length of 5000, or a double byte string claiming a length of 2500,
> etc.

That's not really the problem. You are thinking of a single packet
representing information.
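
For reference, the per-message check described in the quoted text might
look roughly like the sketch below. MessageHeader, kMaxPayload, and both
function names are invented for illustration, and the 64 KB cap is an
arbitrary assumption, not a value anyone in this thread proposed.

```cpp
#include <cstdint>

// Hypothetical wire header: only the payload length matters here.
struct MessageHeader {
    std::uint32_t payload_length;
};

// Assumed per-message cap; the value is illustrative only.
constexpr std::uint32_t kMaxPayload = 64 * 1024;

// Accept a message only if its declared payload is within the preset limit.
bool header_is_acceptable(const MessageHeader& h) {
    return h.payload_length <= kMaxPayload;
}

// Accept a string only if its claimed size fits in what is left of the payload.
bool string_claim_fits(std::uint32_t claimed_chars,
                       std::uint32_t bytes_per_char,
                       std::uint32_t bytes_remaining) {
    if (bytes_per_char == 0) return false;
    return claimed_chars <= bytes_remaining / bytes_per_char;
}
```

With a 4196-byte payload this rejects a single-byte string claiming
5000 characters and a double-byte string claiming 2500. Note that it
only bounds one message; it says nothing about an object that spans
many messages.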

Let's have a more concrete example. Let's say a C++ programmer has
the task of defining serialization for a String class. This
programmer is told that the type used to represent the length of the
string is 'unsigned int'. We all know that, on many computers,
'unsigned int' is 32 bits, so that's 4,294,967,296 states, for a
maximum length of around 4.3 billion characters. Now technically, a
program running on a single computer would be able to allocate strings
of this size and use them with no problem, but 4.3 billion is quite a
lot, so let's reduce the amount that we might "typically" use in a
program to 1 million characters. Still extremely long, yes, but not so
long as to be inconceivable, at least in some context.
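
The arithmetic can be checked with a few lines, assuming a 32-bit
length field (std::uint32_t is used here because the standard does not
promise that 'unsigned int' is 32 bits). The function names are made up
for this sketch.

```cpp
#include <cstdint>
#include <limits>

// Largest length a 32-bit unsigned field can hold: 4,294,967,295.
constexpr std::uint64_t max_length_u32() {
    return std::numeric_limits<std::uint32_t>::max();
}

// Number of distinct values of a 32-bit unsigned field: 2^32.
constexpr std::uint64_t states_of_u32() {
    return max_length_u32() + 1;
}
```

So the field has 4,294,967,296 states and a maximum length of roughly
4.3 billion characters.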

In the end, once the entire string is serialized from one machine to
another, the string at the target node *will* have 1 million
characters allocated to it, no matter what intermediate steps were
used to encode the length, number of characters in a particular
packet, etc. For those that keep saying "just use realloc" - please
reconsider. Realloc does not get rid of the problem. The problem is
that, as Lorens Veen pointed out, at the end of the day, there will be
a 1-million-byte string at the target computer, or there will not be.

Now, if you nit-pick at every single serialization that involves
memory consumption on the target machine at a primitive-by-primitive
level, you _might_ be able to finally come up with a...ahem...solution
that does not look like two cats fighting.

But that is not the point of serialization. The point of
serialization has been...

"If you have a string s, and you want to serialize it from Node A to
Node B...write.."

Socket << s;
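
As a rough sketch of what a library author might put behind that one
line, assuming a 32-bit big-endian length prefix: the Socket type below
is a hypothetical in-memory stand-in, not a real socket API, and
peek_length is an invented helper.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a socket: it just collects bytes in memory.
struct Socket {
    std::vector<unsigned char> bytes;
};

// Writes a 32-bit big-endian length prefix followed by the characters.
Socket& operator<<(Socket& sock, const std::string& s) {
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    for (int shift = 24; shift >= 0; shift -= 8) {
        sock.bytes.push_back(static_cast<unsigned char>((len >> shift) & 0xFF));
    }
    sock.bytes.insert(sock.bytes.end(), s.begin(), s.end());
    return sock;
}

// Reads back the length the sender claims; returns 0 if the prefix is missing.
std::uint32_t peek_length(const Socket& sock) {
    if (sock.bytes.size() < 4) return 0;
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
        len = (len << 8) | sock.bytes[i];
    }
    return len;
}
```

Nothing in the operator knows what length is reasonable for the caller.
The receiving side only sees whatever number the sender chose to write.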

> Now if your serialization code blindly allocates buffers of arbitrary
> size, then you obviously have a problem in your serialization code.
> You need to improve it to be aware of the payload length of the
> current message being processed. I'm curious as to why you think that
> is a big deal?

Someone who writes the serialization function for a string will do
exactly that. They will blindly allocate little tiny 16-byte buffers
one by one until the entire 1-megabyte string is sent, using realloc
as some have suggested (...as if that would help).
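
To make that concrete, here is a minimal simulation, assuming a
receiver that grows its string one small chunk at a time.
receive_in_chunks is an invented name, and the 'x' bytes stand in for
data read off the wire.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Simulates receiving `total` characters in chunks of `chunk` bytes,
// growing `out` as each chunk arrives. Returns the largest size `out` reached.
std::size_t receive_in_chunks(std::size_t total, std::size_t chunk,
                              std::string& out) {
    if (chunk == 0) return out.size();  // nothing can ever arrive
    std::size_t peak = out.size();
    std::size_t received = 0;
    while (received < total) {
        const std::size_t n = std::min(chunk, total - received);
        out.append(n, 'x');  // stand-in for bytes read off the wire
        received += n;
        peak = std::max(peak, out.size());
    }
    return peak;
}
```

Calling receive_in_chunks(1000000, 16, s) on an empty string leaves s
holding all 1,000,000 characters. The chunk size changes how the memory
is acquired, not how much of it ends up allocated.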

If this is not clear, I suggest you think about how you would write
the serialization function that is to later be used by 100,000
programmers, and ask yourself if you will not have the problem
described here. Then, after that, imagine doing the same thing for
200+ basic C++ classes, and think about what the code will look like.

It will look like nothing, because there will not be any code. You
will not know what limits to place on length of string, number of
elements in a list<>, number of elements in matrix<>, number of
elements in a nonce<>...

-Le Chaud Lapin-
