Re: An Executor-like structure providing more than threads
On Sat, 16 Jan 2010, Peter Duniho wrote:
Tom Anderson wrote:
[...]
I could have written this with the Downloaders being the active party,
going to a queue of URLs, pulling one off, downloading it, then going back
for another. Or i could have put Downloaders in a pool, and had the
submission mechanism pull them out and hand URLs to them. But i really
wanted to be able to use all the cool stuff in ExecutorService, like
getting Futures and having orderly shutdown, and a properly controllable
thread pool and so on. So what i did was subclass Thread to add the other
bits (the HttpClient and so on), and then, in the tasks, do something like:
((DownloadThread)Thread.currentThread()).getHttpClient()
And so on. I thought that was quite clever, although it is clearly also
entirely bletcherous.
I don't see anything fundamentally wrong with the idea. So long as you
really have a one-to-one relationship between the "downloader" and the
"thread" parts, maybe it does make sense to have a "downloader thread".
As far as the specific line of code you posted, the one change I would have
made is to have an appropriate static method in DownloadThread:
class DownloadThread extends Thread
{
public static DownloadThread currentThread() throws UnsupportedOperationException
{
Thread thread = Thread.currentThread();
if (thread instanceof DownloadThread)
{
return (DownloadThread)thread;
}
// Alternative to the below would be to just make the
// cast and let the bad cast exception propagate up
throw new UnsupportedOperationException("current thread is not a DownloadThread");
}
}
That way you don't have a bunch of explicit casting all over the place.
That's certainly an improvement.
Though, that said it seems to me that most if not all of the time, you should
not need the currentThread() method anyway, because you should be executing
methods within the DownloadThread instance already.
Ah, no, because the idea is that the DownloadThread is infrastructure to
support all sorts of generic downloading tasks. One might be to get a file
from a given URL; another might be to get a series of files from the same
server, yet another might download one thing, parse it to find the URL to
another thing, then download that, etc. Moreover, a single DownloadThread
might do different tasks in succession. That means having code outside the
DownloadThread itself.
Although i admit that in this app, there was only one type of task. I
would like to turn this into a generic utility, though.
Cases where client code needs to get specific output of the
DownloadThread class would be handled instead via some mechanism where
you don't need to concern yourself with the current thread. For
example, a listener/event API, or even just a simple completion
callback, either approach in a Future implementation where the relevant
instance of the DownloadThread class is delivered via some mechanism
other than checking the current Thread instance.
Any thoughts? What's the right way to do this?
I guess the more I think about it, the more I wonder why you should ever need
to make the assumption that the DownloadThread object of interest is the same
as the current Thread object.
I also think that you could implement a Downloader pool that is not so
explicitly tied to a Thread pool. It seems to me that the Thread pool
dependency is an implementation detail, and should be abstracted such that
the client of the Downloader pool should not need to know or concern their
self with that detail.
I would think that the Downloader pool client would simply submit the URLs,
and receive notification via some mechanism of completion defined by the
Downloader pool. That the Downloader pool is itself depending on a Thread
pool behind the scenes should be irrelevant and unknowable from the API
alone. To make the Downloader and Thread one and the same object seems like
a leaky abstraction to me.
Could be. If by 'thread pool', you mean something like an executor, where
there is explicitly a pool of threads, then i can't see how you'd
implement the Downloader pool - would Downloaders fetch threads from the
pool when they need to run or something? Store the download task in
themselves, then submit themselves to the executor, having a run method
which runs the download task? Yes, that would work.
I wonder if there's a design pattern there - objects which submit
themselves to be executed to service a request from another object,
converting method parameters into instance fields. Sort of a variation of
Active Object. I haven't seen it enough times to say so, really, but it
has the feel of a pattern, somehow.
Thinking about it, i could probably do away with pools altogether and just
have active threads pulling tasks from a queue and executing them; these
tasks could have a wider interface than Runnable, through which a
HttpClient and buffer could be passed:
interface Download {
public void perform(HttpClient client, byte[] buffer);
}
class DownloadWorker implements Runnable {
private BlockingQueue<Download> downloads; // shared
private HttpClient client;
private byte[] buffer;
public DownloadWorker(BlockingQueue<Download> downloads, HttpConnectionManager connMgr) {
this.downloads = downloads;
client = new HttpClient(connMgr);
buffer = new byte[1024];
}
public void run() {
// exception handling not shown
while (true) {
Download dl = downloads.take();
dl.perform(client, buffer);
}
}
}
BlockingQueue<Download> downloads;
HttpConnectionManager connMgr;
// pool management also not shown
for (int i = 0; i < numThreads; ++i) {
new Thread(new DownloadWorker(downloads, connMgr)).start();
}
Look ma, no executor!
That said, I try to be pragmatic. There are certainly others who are
more OOP-Nazi than I am (and I preemptively reject the rampant
over-application of Godwin's law that goes on in this newsgroup) and who
might insist against an unnecessary dependency in the object hierarchy
like that.
But while I personally probably wouldn't implement it that way, I'm not
comfortable asserting that your own choice is wrong. Especially not
having a full view of the entire system you're implementing, and
especially having experience with your other contributions to the
newsgroup (granted, sometimes having a positive reputation can interfere
with your ability to receive good, critical feedback because people make
too broad an assumption that you know what you're doing :) ). Maybe for
your particular system, this really is a good design.
I refer you to the timestamp on my original message - it was getting on
for one in the morning when i wrote that. I wouldn't for a second dream of
trying to defend that code!
tom
--
Why did one straw break the camel's back? Here's the secret: the million
other straws underneath it - it's all mathematics. -- Mos Def