Re: NIO multiplexing + thread pooling
On 27.09.2011 21:52, Tom Anderson wrote:
On Sun, 25 Sep 2011, Robert Klemme wrote:
On 09/24/2011 09:46 PM, Tom Anderson wrote:
On Sat, 24 Sep 2011, Giovanni Azua wrote:
would it increase performance to set up a Thread Pool and associate one
SelectionKey to one specific Thread in the pool so that all read/write
operations for each specific channel go through one specific Thread
from the Pool?
That would not be a thread pool. That would be a thread-per-connection
model. I can't say that the performance would be worse than with a real
thread pool, but I doubt it would be better than using blocking IO.
As far as I can see Giovanni did not say that there was a 1:1
relationship between threads and channels.
I think I read "associate one SelectionKey to one specific Thread" as
meaning that, but evidently I read it wrongly.
The approach does actually make sense if there is a fixed limit on the
number of channels one thread is responsible for. The idea is that a
single thread can only pull off so much IO, and if the number of
clients is not limited, a single thread may indeed become the
bottleneck. On the other hand, this approach creates fewer threads than
the thread-per-connection model with blocking IO.
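To make that concrete, here is a minimal sketch of such a worker (the
class name ChannelGroupWorker, the hand-over queue and the read-only
interest are my assumptions, not something from this thread): each
worker owns its own Selector, accepts at most a fixed number of
channels, and does both the selecting and the handling on its own
thread, so once a channel has been assigned no synchronization with
other workers is needed.

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

class ChannelGroupWorker implements Runnable {
    private final Selector selector;
    private final int maxChannels;
    private final AtomicInteger assigned = new AtomicInteger();
    // new channels are handed over here and registered by this worker's own thread
    private final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();

    ChannelGroupWorker(int maxChannels) throws IOException {
        this.selector = Selector.open();
        this.maxChannels = maxChannels;
    }

    /** Try to assign a channel to this worker; returns false if it is already full. */
    boolean offer(SocketChannel ch) {
        if (assigned.incrementAndGet() > maxChannels) {  // reserve a slot
            assigned.decrementAndGet();
            return false;
        }
        pending.add(ch);
        selector.wakeup();       // break out of select() so the channel gets registered
        return true;
    }

    @Override
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(8192);
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();
                for (SocketChannel ch; (ch = pending.poll()) != null; ) {
                    ch.configureBlocking(false);
                    ch.register(selector, SelectionKey.OP_READ);
                }
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isValid() && key.isReadable()) {
                        SocketChannel ch = (SocketChannel) key.channel();
                        buf.clear();
                        if (ch.read(buf) < 0) {          // peer closed the connection
                            key.cancel();
                            ch.close();
                            assigned.decrementAndGet();
                        }
                        // ... decode and process buf here, in this same thread
                    }
                }
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            // a real implementation would log and clean up here
        }
    }
}

A pool of such workers covers N channels with roughly N/limit threads,
and offer() is the only point where another thread ever touches a
worker's state.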
Hold on - there are two things here.
One, an N:M threading model, where you have N connections being served
by M threads, where N >> M. That, I readily agree, could offer better
performance than N:N or N:1 models.
OK.
But two, the idea that the M threads should have affinity for
connections, so that, in effect, we would have an M*(N/M:1) model (I'm
sure that notation is crystal clear!). That, I am skeptical about.
Why?
If you're processing the channels in the selector thread (in which
case it's not really a selector thread), then it could be a
bottleneck. So, you can have several of them; Selector.select() is
thread-safe, so with a little attention to locking, you can have many
threads selecting and then working. This is called the
leader/followers model.
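For comparison, here is a rough leader/followers sketch over a single
shared Selector (the class name, the ReentrantLock hand-off and the
OP_READ-only interest are my choices, not something from the thread):
all pool threads run the same loop, the lock makes exactly one of them
the leader at a time, the leader selects, claims one ready key, disarms
it, releases the lock so a follower can become the new leader, and then
handles the I/O itself.

import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.concurrent.locks.ReentrantLock;

class LeaderFollowerWorker implements Runnable {
    private final Selector selector;        // shared by all pool threads
    private final ReentrantLock leaderLock; // whoever holds it is the leader

    LeaderFollowerWorker(Selector selector, ReentrantLock leaderLock) {
        this.selector = selector;
        this.leaderLock = leaderLock;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            SelectionKey key = null;
            leaderLock.lock();                         // become the leader
            try {
                while (key == null) {
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    if (!it.hasNext()) {
                        selector.select();             // wait for readiness
                        it = selector.selectedKeys().iterator();
                    }
                    if (it.hasNext()) {
                        SelectionKey k = it.next();
                        it.remove();
                        if (k.isValid()) {
                            k.interestOps(0);          // disarm: no later leader re-selects it
                            key = k;
                        }
                    }
                }
            } catch (IOException e) {
                return;                                // sketch: give up on selector failure
            } finally {
                leaderLock.unlock();                   // promote the next follower to leader
            }
            handle(key);                               // do the actual I/O as a worker
            if (key.isValid()) {
                key.interestOps(SelectionKey.OP_READ); // re-arm once we are done
                selector.wakeup();                     // let the current leader notice it
            }
        }
    }

    private void handle(SelectionKey key) {
        // read from (SocketChannel) key.channel(), decode, process, write reply ...
    }
}

Disarming the key before releasing the lock is what keeps the next
leader from selecting the same readiness again; the re-arm plus wakeup
at the end is part of the per-event synchronization cost this model
pays.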
I think the approach with fixed responsibilities of selector threads
to multiple channels scales better than using a single Selector for
all channels from multiple threads.
Yes. The latter is sort of N:1:M, and that has twice as many colons,
where colons are communication overhead.
More importantly, colons are _synchronization overhead_!
But I don't see why the thread-affine M*(N/M:1) model would be any
faster than the simpler N:M (leader/followers) model.
Because you reduce synchronization overhead. If you have one thread (or
a small number X of threads) doing the selecting and distributing the
work over M >> X handler threads for N >> M channels, then you have a
much higher potential for collisions than if you have X * (1 selector
for M/X handler threads). Basically, this is a way to partition the
application into independent parts.
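A rough sketch of that partitioning (the names, the hash-based
assignment and the per-group pool sizes are mine, purely illustrative):
X independent groups, each owning one Selector thread plus a small
handler pool, with every channel pinned to exactly one group.

import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PartitionedReactor {
    private final Group[] groups;

    PartitionedReactor(int groupCount, int handlersPerGroup) throws IOException {
        groups = new Group[groupCount];
        for (int i = 0; i < groupCount; i++) {
            groups[i] = new Group(handlersPerGroup);
            new Thread(groups[i], "selector-group-" + i).start();
        }
    }

    /** Pin a new connection to one group for its whole lifetime. */
    void assign(SocketChannel ch) {
        groups[Math.floorMod(ch.hashCode(), groups.length)].add(ch);
    }

    static final class Group implements Runnable {
        final Selector selector;
        final ExecutorService handlers;      // the M/X handler threads of this group
        final Queue<SocketChannel> pending = new ConcurrentLinkedQueue<>();

        Group(int handlerThreads) throws IOException {
            selector = Selector.open();
            handlers = Executors.newFixedThreadPool(handlerThreads);
        }

        void add(SocketChannel ch) {
            pending.add(ch);
            selector.wakeup();               // let the selector thread register it
        }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    selector.select();
                    for (SocketChannel ch; (ch = pending.poll()) != null; ) {
                        ch.configureBlocking(false);
                        ch.register(selector, SelectionKey.OP_READ);
                    }
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (!key.isValid()) continue;
                        key.interestOps(0);                   // disarm while being handled
                        handlers.execute(() -> handle(key));  // contention stays inside this group
                    }
                }
            } catch (IOException e) {
                // a real implementation would log and shut the group down here
            }
        }

        void handle(SelectionKey key) {
            // read from (SocketChannel) key.channel(), decode, process, write ...
            if (key.isValid()) {
                key.interestOps(SelectionKey.OP_READ);        // re-arm when done
                key.selector().wakeup();
            }
        }
    }
}

The collision argument then falls out directly: a selected key is only
ever touched by the one selector thread and the few handler threads of
its own group, so the contention does not grow with the total number of
threads in the application.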
The point at which this effect starts to have a negative impact depends,
of course, on the characteristics of the communication, which I have
mentioned elsewhere IIRC: the number of channels, the message rate, and
the message size per channel. In other words, the higher the throughput
per channel and the higher the number of channels, the earlier you will
see the negative effects of too many threads having to go through the
synchronization necessary to distribute the work.
It will be tricky, though, to deal with changes in the number of open
channels in a way that neither violates the partitioning model too much
nor wastes resources (threads) to the point where there is just one
handler thread per channel - which could certainly happen without
rebalancing (e.g. all but one of the channels a thread is responsible
for have died and the remaining channel is not reassigned to another
group).
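Purely as an illustration of such a rebalancing step (reusing the
hypothetical Group from the sketch above; nothing in the thread
specifies how this should work): the nearly empty group's selector
thread cancels the key and hands the channel to a fuller group exactly
as if it were a new connection, so the invariant that only the owning
group touches a channel still holds after the move.

import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

final class Rebalancer {
    /**
     * Called from the source group's selector thread, e.g. when its channel
     * count has dropped far below that of the other groups. A real
     * implementation would first make sure no handler of the old group is
     * still working on the channel and would migrate any per-channel state
     * (partially read buffers etc.) along with it.
     */
    static void migrate(SelectionKey key, PartitionedReactor.Group target) {
        SocketChannel ch = (SocketChannel) key.channel();
        key.cancel();      // detach the channel from the old group's Selector
        target.add(ch);    // re-register via the target group's pending queue
    }
}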
Interesting stuff. :-)
Kind regards
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/