Re: NIO multiplexing + thread pooling

From: Robert Klemme <shortcutter@googlemail.com>
Newsgroups: comp.lang.java.programmer
Date: Sun, 25 Sep 2011 11:33:02 +0200
Message-ID: <9e8aqfFnorU1@mid.individual.net>

On 09/24/2011 09:46 PM, Tom Anderson wrote:

On Sat, 24 Sep 2011, Giovanni Azua wrote:

would it increase performance to setup a Thread Pool and associate one
SelectionKey to one specific Thread in the pool so that all read/write
operations for each specific channel go through one specific Thread
from the Pool?

That would not be a thread pool. That would be a thread-per-connection
model. I can't say that the performance would be worse than with a real
thread pool, but I doubt it would be better than using blocking IO.

As far as I can see Giovanni did not say that there was a 1:1
relationship between threads and channels. The approach does actually
make sense if there is a fixed limit on the number of channels one
thread is responsible for. The idea is that a single thread can pull
off only so much IO, so if the number of clients is not limited, a
single thread may indeed be the bottleneck. On the other hand, this
approach creates fewer threads than the thread-per-connection model
with blocking IO. Whether it is actually worthwhile, considering the
higher implementation effort of this scheme vs. the simplest approach
(thread per connection with blocking IO), is another question. If the
number of channels is limited in some way I'd rather go for the simple
implementation with blocking IO.
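
To make the "fixed limit per thread" idea concrete, here is a minimal
sketch, not a definitive implementation. The class BoundedSelectorLoop
and its methods tryAdopt, release and load are hypothetical names made
up for illustration; only the java.nio and java.util.concurrent calls
are real API. It assumes every adopted channel supports OP_READ, and
that the handler calls release() itself when it closes a channel.

```java
import java.io.IOException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical selector thread that serves at most maxChannels channels. */
final class BoundedSelectorLoop implements Runnable {
    private final Selector selector;
    private final int maxChannels;
    private final Consumer<SelectionKey> onReadable;
    private final Queue<SelectableChannel> pending = new ConcurrentLinkedQueue<>();
    private final AtomicInteger reserved = new AtomicInteger();

    BoundedSelectorLoop(int maxChannels, Consumer<SelectionKey> onReadable) throws IOException {
        this.selector = Selector.open();
        this.maxChannels = maxChannels;
        this.onReadable = onReadable;
    }

    /** Reserves a slot and queues the channel; returns false when this loop is full. */
    boolean tryAdopt(SelectableChannel channel) {
        int n;
        do {
            n = reserved.get();
            if (n >= maxChannels) return false;
        } while (!reserved.compareAndSet(n, n + 1));
        pending.add(channel);
        selector.wakeup(); // registration happens on the loop's own thread
        return true;
    }

    /** Frees one slot; the handler is expected to call this after closing a channel. */
    void release() { reserved.decrementAndGet(); }

    int load() { return reserved.get(); }

    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();
                // Register channels queued by tryAdopt() since the last pass.
                for (SelectableChannel ch; (ch = pending.poll()) != null; ) {
                    try {
                        ch.configureBlocking(false);
                        ch.register(selector, SelectionKey.OP_READ);
                    } catch (IOException e) {
                        release(); // channel unusable: give the slot back
                    }
                }
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isValid() && key.isReadable()) onReadable.accept(key);
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // selector failed or was closed: the loop ends
        }
    }
}
```

An acceptor would try each loop in turn and start a new loop (or refuse
the connection) once tryAdopt returns false everywhere.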

Some more reading:
http://www.scs.stanford.edu/~dm/home/papers/dabek:event.pdf
http://www.eecs.harvard.edu/~mdw/papers/events.pdf

There is another paper which I cannot find at the moment and which
demonstrates advantages of a thread based implementation with a
threading library which has rather low overhead. I'll try to dig it up
after the weekend.

From all code examples/tutorials/books I have reviewed online the
"Selector Thread" seems like a bottleneck to me.

If the selector thread's only job is to identify channels which are
ready to be worked on, handing them off to worker threads to be
processed, then it doesn't really have a lot of work of its own to do,
and so it's unlikely to be a bottleneck.
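
A minimal sketch of such a hand-off, under stated assumptions: the
class name Dispatcher and the onData callback are hypothetical, the
channels are already registered for OP_READ, and each worker performs
a single read per readiness event. The one non-obvious detail is that
the selector thread clears the key's interest set before handing it
off, otherwise the next select() would report the same channel again
while a worker is still reading it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

/** Hypothetical dispatcher: the selector thread detects readiness, workers do the reads. */
final class Dispatcher implements Runnable {
    private final Selector selector;
    private final Executor workers;
    private final Consumer<ByteBuffer> onData;

    Dispatcher(Selector selector, Executor workers, Consumer<ByteBuffer> onData) {
        this.selector = selector;
        this.workers = workers;
        this.onData = onData;
    }

    @Override public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid() || !key.isReadable()) continue;
                    key.interestOps(0);                   // a worker owns the key from here on
                    workers.execute(() -> readOnce(key)); // the selector thread does no IO itself
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // selector failed or was closed: stop dispatching
        }
    }

    private void readOnce(SelectionKey key) {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        try {
            int n = ((ReadableByteChannel) key.channel()).read(buf);
            if (n < 0) {                 // peer closed the connection
                key.channel().close();
                return;
            }
            if (n > 0) {
                buf.flip();
                onData.accept(buf);
            }
            key.interestOps(SelectionKey.OP_READ); // hand the key back
            selector.wakeup();                     // so select() picks up the change
        } catch (IOException | CancelledKeyException e) {
            try { key.channel().close(); } catch (IOException ignored) { }
        }
    }
}
```

The selector thread's per-event work is then limited to one interestOps
call and one queue insertion, which is why it rarely becomes the
bottleneck in this arrangement.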

If you're processing the channels in the selector thread (in which case
it's not really a selector thread), then it could be a bottleneck. So,
you can have several of them; Selector.select() is threadsafe, so with a
little attention to locking, you can have many threads selecting and
then working. This is called the leader/followers model.

I think the approach with fixed responsibilities of selector threads to
multiple channels scales better than using a single Selector for all
channels from multiple threads. But that may depend on the use case.
Especially in light of short-lived connections I can see some overhead
for the assignment of channels to threads. Also, rebalancing might be
an issue in case one thread has significantly fewer channels to handle
than the others.
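
The assignment bookkeeping can be sketched roughly as follows. The
class LoadTable and its methods are hypothetical, and the sketch
assumes that assign() is only called from one acceptor thread, since
choosing and incrementing are not one atomic step. It only decides
where new channels go; it does not move existing channels between
threads.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

/** Hypothetical bookkeeping: how many channels each selector thread currently owns. */
final class LoadTable {
    private final AtomicIntegerArray counts;

    LoadTable(int selectorThreads) {
        counts = new AtomicIntegerArray(selectorThreads);
    }

    /** Picks the selector thread with the fewest channels and counts the new channel against it. */
    int assign() {
        int best = 0;
        for (int i = 1; i < counts.length(); i++) {
            if (counts.get(i) < counts.get(best)) best = i;
        }
        counts.incrementAndGet(best);
        return best;
    }

    /** Called when a channel owned by the given selector thread is closed. */
    void release(int selectorThread) {
        counts.decrementAndGet(selectorThread);
    }

    /** Largest minus smallest count: a crude signal that rebalancing might be worthwhile. */
    int imbalance() {
        int min = counts.get(0), max = counts.get(0);
        for (int i = 1; i < counts.length(); i++) {
            min = Math.min(min, counts.get(i));
            max = Math.max(max, counts.get(i));
        }
        return max - min;
    }
}
```

With short-lived connections, the assign/release pair runs once per
connection, which is the overhead mentioned above.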

One thing you could consider, if using a single selector thread and a
pool of worker threads, is trying to create some affinity between
channels and threads, so that the same thread always handles the work on
a particular channel (and on several channels); on a multiprocessor
machine, this should increase cache performance, but is not otherwise
particularly valuable. This is not entirely straightforward, because of
thread and channel starvation issues; you should read up on
work-stealing if you want to do that.
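
The simplest form of such affinity can be sketched like this. The
class AffinityPool and its methods are hypothetical, and the sketch
deliberately uses static hashing rather than work-stealing: each
channel always maps to the same single-threaded lane, so a slow task
on one lane delays every channel hashed to it. That is the starvation
problem mentioned above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical worker pool in which each channel is always served by the same thread. */
final class AffinityPool implements AutoCloseable {
    private final ExecutorService[] lanes;

    AffinityPool(int threads) {
        lanes = new ExecutorService[threads];
        for (int i = 0; i < threads; i++) {
            lanes[i] = Executors.newSingleThreadExecutor(); // one thread per lane
        }
    }

    /** Maps a channel (or any stable key for it) to a lane index. */
    int laneOf(Object channel) {
        return Math.floorMod(channel.hashCode(), lanes.length);
    }

    /** Runs the work on the lane that owns this channel, in submission order. */
    Future<?> submit(Object channel, Runnable work) {
        return lanes[laneOf(channel)].submit(work);
    }

    @Override public void close() {
        for (ExecutorService lane : lanes) lane.shutdown();
    }
}
```

A side effect of one thread per lane is that work for a given channel
never runs concurrently with itself, so per-channel state needs no
extra locking.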

It's an interesting topic and getting it right (high throughput, low
resource usage) is certainly not easy.

Kind regards

robert