Re: IO blocking reads: how do you deal with them ?
 
On Mon, 25 May 2009, Sébastien de Mapias wrote:
> Anybody already addressed blocking IO-reads in an application ?
> We suspect we've met the bug described here:
> http://www.mail-archive.com/commons-httpclient-dev@jakarta.apache.org/msg06704.html
> but we're not sure... We unfortunately can't change the reference
> to the Apache's commons-httpclient.jar we use.
> Has someone devised a mechanism to deal with blocking read calls to
> enable the process to carry on whenever it happens ? (call the
> HttpClient.handle()

HttpClient.handle()? I don't believe there's any such method in the public 
API of HttpClient.

> -that eventually calls HttpParser.readRawLine ()- inside its own thread
> I guess ?) We use version 1.4.

Calling it in its own thread would be the obvious way to do it. In fact, 
rather than spawning threads, i'd use an ExecutorService, which also gives 
you ways to manage the task. Thus:
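import java.io.*;
import java.util.concurrent.*;
import org.apache.commons.httpclient.*;

public class SomeData {
 	// holds the information you want from the http server
 	public static SomeData extract(InputStream in) {
 		// somehow extracts the data
 		// NB i wouldn't actually make this a static factory method here, this is just for exegesis
 		return new SomeData(); // placeholder so the sketch compiles
 	}
}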
public class HttpClientTask implements Callable<SomeData> {
 	private final HttpClient client;
 	private final HttpMethod method;
 	public HttpClientTask(HttpClient client, HttpMethod method) {
 		this.client = client;
 		this.method = method;
 	}
 	public SomeData call() {
 		try {
 			client.executeMethod(method);
 			InputStream in = method.getResponseBodyAsStream();
 			SomeData data = SomeData.extract(in);
 			return data;
 		}
 		// this call() declares no checked exceptions (although Callable.call() itself allows 'throws Exception'), so wrap them ...
 		catch (HttpException e) {
 			throw new HttpClientTaskException(e);
 		}
 		catch (IOException e) {
 			throw new HttpClientTaskException(e);
 		}
 		finally {
 			method.releaseConnection();
 		}
 	}
}
public class HttpClientTaskException extends RuntimeException {
 	public HttpClientTaskException(Exception cause) {
 		super(cause);
 	}
}
public class DataGetter { // terrible name, change it
 	private static final long TIMEOUT = 5000; // in ms
 	private final ExecutorService executor = Executors.newCachedThreadPool(); // any ExecutorService will do
 	public SomeData getData(HttpClient client, HttpMethod method) throws HttpException, IOException, TimeoutException, InterruptedException {
 		Callable<SomeData> task = new HttpClientTask(client, method);
 		Future<SomeData> dataFuture = executor.submit(task);
 		try {
 			return dataFuture.get(TIMEOUT, TimeUnit.MILLISECONDS);
 		}
 		catch (TimeoutException e) {
 			dataFuture.cancel(true);
 			throw e;
 		}
 		// you also have to deal with ExecutionException; here's my suggestion ...
 		catch (ExecutionException e) {
 			Throwable cause = e.getCause();
 			if (cause instanceof HttpClientTaskException) {
 				Throwable trueCause = cause.getCause();
 				if (trueCause instanceof HttpException) throw (HttpException)trueCause;
 				else if (trueCause instanceof IOException) throw (IOException)trueCause;
 				else throw (HttpClientTaskException)cause; // NB this should be unreachable - HttpClientTaskException is only thrown for HttpException or IOException
 			}
 			else if (cause instanceof RuntimeException) throw (RuntimeException)cause;
 			else if (cause instanceof Error) throw (Error)cause;
 			else throw new RuntimeException("impossible exception", cause); // NB this is impossible - call() cannot throw checked exceptions
 		}
 	}
}
Apologies for the long-winded exception handling, but that's the way it 
has to be done, sadly. The idea there is that if the HttpClient methods 
throw exceptions, they should be propagated to the caller of getData - to 
make it look as much as possible like getData called them directly, so 
that the whole thread monkey business is transparent.
Another, perhaps simpler, but nastier, way of doing it would be to spawn a 
watchdog thread:
public class Watchdog extends Thread {
 	private final Thread victim;
 	private final long timeout;
 	public Watchdog(Thread victim, long timeout) {
 		super("watchdog for " + victim.getName());
 		this.victim = victim;
 		this.timeout = timeout;
 	}
 	public Watchdog(long timeout) {
 		this(Thread.currentThread(), timeout);
 	}
 	public void run() {
 		try {
 			sleep(timeout);
 			victim.interrupt();
 		}
 		catch (InterruptedException e) {
 			// cancelled!
 		}
 	}
 	public void cancel() {
 		interrupt();
 	}
}
HttpClient client;
HttpMethod method;
Watchdog w = new Watchdog(5000);
w.start();
client.executeMethod(method);
InputStream in = method.getResponseBodyAsStream();
SomeData data = SomeData.extract(in);
w.cancel();
return data;
However, there are a number of potential bugs in that code. Stefan, if 
you're looking to sharpen your threading skills, see if you can identify 
them, and suggest fixes! [1]
> And by the way in such a case how can one cleanly terminate the thread
> that performs the read call likely to block everything ?

Generally, you call interrupt() on the thread, and hope for the best. If 
all the code that the thread is running handles InterruptedException 
properly (i.e. cleans up quickly and propagates the exception), then this 
will cause it to stop as quickly as possible.
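
As a rough illustration of what "handles InterruptedException properly" might mean, here is a minimal sketch. PoliteWorker and its fields are hypothetical names, and the sleep merely stands in for some interruptible blocking call; nothing here comes from HttpClient.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical worker: stops promptly when interrupted and still runs its cleanup.
class PoliteWorker implements Runnable {
    final CountDownLatch started = new CountDownLatch(1);
    volatile boolean sawInterrupt = false;
    volatile boolean cleanedUp = false;

    public void run() {
        started.countDown();
        try {
            while (true) {
                Thread.sleep(60000); // stands in for an interruptible blocking call
            }
        } catch (InterruptedException e) {
            sawInterrupt = true;
            // Restore the flag so code further up the stack can also see the interruption.
            Thread.currentThread().interrupt();
        } finally {
            cleanedUp = true; // release connections, streams etc. here
        }
    }
}
```

Calling interrupt() on the thread running this worker makes the sleep throw InterruptedException, so run() returns after doing its cleanup.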
However, there is a *huge* caveat to this, which is that interruption does 
not reliably break IO blocks across all platforms. I believe it works on 
Solaris, but there are some kinds of IO which are not interrupted on 
Windows, including socket IO. I know this was the case a few years ago; i 
have no idea if it's changed. I also have no idea what the situation is on 
other unixes.
Note that even if IO cannot be interrupted, the approach above using an 
Executor will still work, but won't stop the task thread. The watchdog 
approach just won't work at all.
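
To make that concrete, here is a small sketch in which the task deliberately swallows interrupts for two seconds, as a stand-in for IO that cannot be interrupted. StubbornTaskDemo and callerEscapes are made-up names for illustration. The caller still gets control back at the timeout, while the worker thread carries on until it finishes on its own.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class StubbornTaskDemo {
    // Returns true if the caller regained control through a timeout.
    static boolean callerEscapes(long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        final long deadline = System.currentTimeMillis() + 2000;
        Future<String> future = executor.submit(new Callable<String>() {
            public String call() {
                // Simulates uninterruptible IO: ignore interrupts until the deadline.
                while (System.currentTimeMillis() < deadline) {
                    try {
                        Thread.sleep(50);
                    } catch (InterruptedException ignored) {
                        // deliberately swallowed
                    }
                }
                return "late";
            }
        });
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker, which ignores it
            return true;
        } finally {
            executor.shutdown(); // no new tasks; the running one is left to finish
        }
    }
}
```

The cost is a leaked thread for as long as the task stays blocked, so a pool fed with many hung tasks can still run out of threads.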
The official workaround for this is that instead of interrupting the 
thread, you should close the socket it's blocking on, but there's no 
straightforward way to do that when it's wrapped in an HttpClient. I don't 
think there's a close method on HttpClient or HttpMethod; if you could get 
hold of the underlying HttpConnection, you could close that, but i don't 
see any way to do so. You could close the InputStream, but i don't know if 
that will work, and in any case, that won't break a block that happens in 
executeMethod or getResponseBodyAsStream. If there was something you could 
close, then the watchdog approach could easily be modified to close it. 
The executor approach would be harder - i think you'd have to write a new 
Executor subclass that used a new subclass of FutureTask which handled 
cancellation differently, by passing on a message to the callable to 
cancel itself. I've had a look at the implementation of FutureTask, and it 
looks like good old Doug Lea has made this a bit of a pain to do.
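
If there were something closeable to hand, a closing watchdog might look roughly like the sketch below. ClosingWatchdog is a hypothetical class, not part of HttpClient, and it assumes the victim implements java.io.Closeable (java.net.Socket does on Java 7 and later). It uses a scheduler rather than a hand-rolled thread, and creates one scheduler per watchdog purely to keep the sketch short.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical watchdog that closes a resource instead of interrupting a thread.
class ClosingWatchdog {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final ScheduledFuture<?> pending;

    ClosingWatchdog(final Closeable victim, long timeoutMillis) {
        pending = scheduler.schedule(new Runnable() {
            public void run() {
                try {
                    victim.close(); // a read blocked on a closed socket fails with an exception
                } catch (IOException ignored) {
                    // nothing useful to do if close itself fails
                } finally {
                    scheduler.shutdown(); // let the scheduler thread exit
                }
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Call once the blocking work has finished in time.
    void cancel() {
        pending.cancel(false);
        scheduler.shutdown();
    }
}
```

Usage would be along the lines of: create the watchdog around the socket just before the blocking read, and call cancel() in a finally block after it. Closing a java.net.Socket from another thread is documented to make a thread blocked in IO on it throw SocketException.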
So, there is yet another thing you could do. Rather than going in from the 
front end, you sneak in round the back. You write a new HttpClient 
HttpConnectionManager which wraps another HttpConnectionManager instance 
and forwards all calls to it, except that it does some cleverness to allow 
you to deal with the hangs. The simplest thing it could do would be, in 
the getConnection methods, to record the returned HttpConnection against 
the calling thread. A plain ThreadLocal won't quite do, since a watchdog 
running in another thread can't read it; a map keyed by Thread would. You 
could then set up a watchdog, as above, that instead of interrupting the 
main thread, would look up that thread's HttpConnection and call close() 
on it. Alternatively, you could have the HttpConnectionManager wrapper 
return an HttpConnection wrapper that, when getResponseInputStream() was 
called on it, returned an InputStream wrapper that did read timeouts 
(somehow).
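
One wrinkle with the per-thread bookkeeping: a java.lang.ThreadLocal can only be read by the thread that set it, so a watchdog in another thread needs something like a map keyed by Thread. A minimal sketch of such a registry follows. PerThreadResources is a hypothetical helper, and it tracks plain Closeables rather than HttpConnection so that it stands alone; a connection-manager wrapper would need a small adapter that calls close() on the connection.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry: remembers which resource each worker thread is currently using.
class PerThreadResources {
    private final ConcurrentMap<Thread, Closeable> current =
            new ConcurrentHashMap<Thread, Closeable>();

    // Called by the worker thread when it obtains a resource.
    void register(Closeable resource) {
        current.put(Thread.currentThread(), resource);
    }

    // Called by the worker thread when it is done with the resource.
    void unregister() {
        current.remove(Thread.currentThread());
    }

    // Called by the watchdog thread; returns true if a resource was found and closed.
    boolean closeFor(Thread worker) {
        Closeable resource = current.remove(worker);
        if (resource == null) {
            return false;
        }
        try {
            resource.close();
        } catch (IOException ignored) {
            // nothing useful to do if close itself fails
        }
        return true;
    }
}
```

The worker would call register() where the wrapper hands out a connection and unregister() where it is released; the watchdog calls closeFor(workerThread) when the timeout expires.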
An even better way (maybe) would be to write a java SocketFactory that 
created sockets with a read timeout, but i can't see any way to get that 
factory into HttpClient, or to make it a VM-wide default.
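
For what it's worth, the factory itself is easy enough to write. The sketch below is a plain javax.net.SocketFactory subclass (TimeoutSocketFactory is a made-up name) that sets SO_TIMEOUT on every socket it creates. How, or whether, it can be handed to a given HttpClient version is not addressed here.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.SocketFactory;

// Hypothetical factory: every socket it returns has a read timeout, so a stalled
// read throws java.net.SocketTimeoutException instead of blocking forever.
class TimeoutSocketFactory extends SocketFactory {
    private final int readTimeoutMillis;

    TimeoutSocketFactory(int readTimeoutMillis) {
        this.readTimeoutMillis = readTimeoutMillis;
    }

    private Socket configure(Socket socket) throws IOException {
        socket.setSoTimeout(readTimeoutMillis);
        return socket;
    }

    public Socket createSocket() throws IOException {
        return configure(new Socket());
    }

    public Socket createSocket(String host, int port) throws IOException {
        return configure(new Socket(host, port));
    }

    public Socket createSocket(String host, int port,
                               InetAddress localHost, int localPort) throws IOException {
        return configure(new Socket(host, port, localHost, localPort));
    }

    public Socket createSocket(InetAddress host, int port) throws IOException {
        return configure(new Socket(host, port));
    }

    public Socket createSocket(InetAddress address, int port,
                               InetAddress localAddress, int localPort) throws IOException {
        return configure(new Socket(address, port, localAddress, localPort));
    }
}
```

Note that SO_TIMEOUT only bounds each individual read, not the whole response, and it does not cover the connect itself.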
Hope this helps!
tom
[1] In case i forget them, the two that are obvious, in rot13:
Svefgyl, gur jngpuqbt guernq vfa'g thnenagrrq gb fgneg rkrphgvat 
vzzrqvngryl nsgre fgneg(), fb gurer pbhyq or na neovgenevyl ybat gvzr 
orgjrra orvat fgnegrq naq uvggvat gur fyrrc pnyy. Guhf, gur gvzr orgjrra 
gur znva guernq pnyyvat fgneg() naq orvat vagreehcgrq pbhyq or neovgenevyl 
terngre guna gur fcrpvsvrq gvzrbhg. Bs pbhefr, fyrrc() vfa'g thnenagrrq gb 
or cresrpgyl npphengr naljnl. V pna'g guvax bs n jnl gb cebcreyl fbyir 
guvf, ohg n fgvpxvat-cynfgre jbhyq or gb bireevqr Guernq.fgneg gb qb 
fhcre.fgneg() naq gura guvf.jnvg(), naq punatr eha gb qb n guvf.abgvsl() 
vzzrqvngryl orsber gur fyrrc pnyy (jvgu nccebcevngr flapuebavmngvba ba gur 
jnvg naq abgvsl pnyyf). Gung jbhyq zrna gung gur znva guernq jbhyq oybpx 
va fgneg() hagvy gur guernq unq npghnyyl fgnegrq, juvpu zvtug uryc. Be 
zvtug abg, fvapr gur znva guernq pbhyq gura jnvg na neovgenel nzbhag bs 
gvzr nsgre orvat abgvsvrq orsber orvat fpurqhyrq ntnva!
Frpbaqyl, gurer'f n enpr pbaqvgvba orgjrra n znva guernq pnapryyvat n 
jngpuqbt naq gur jngpuqbt guernq pbzvat bhg bs vgf fyrrc - vs gur 
vagreehcgvba neevirf nsgre fyrrc unf svavfurq ohg orsber 
ivpgvz.vagreehcg() vf pnyyrq, gur ivpgvz jvyy or vagreehcgrq rira gubhtu 
vg pnapryyrq gur jngpuqbt. V guvax gur fbyhgvba vf gb thneq obgu 
vagreehcg() pnyyf jvgu n grfg bs !Guernq.pheeragGuernq().vagreehcgrq(), 
jvgu gur grfg naq vgf thneqrq vagreehcgvba orvat jenccrq va n flapuebavmrq 
oybpx gb znxr gurz ngbzvp.
Abgr gung lbh ernyyl, ernyyl qba'g jnag gb or vagreehcgvat gur znva guernq 
nsgre vg'f pbzr bhg bs vgf oybpxvat jbex. Qbvat gung jvyy abg nssrpg vgf 
rkrphgvba vzzrqvngryl, ohg jvyy frg vgf vagreehcgrq synt, juvpu zrnaf gung 
gur arkg gvzr vg gevrf gb jnvg be qb oybpxvat VB, cbgragvnyyl fbzrjurer 
pbzcyrgryl qvssrerag va gur pbqr, vg jvyy trg na Vagreehcgrq[VB]Rkprcgvba 
vzzrqvngryl, juvpu jvyy or ernyyl pbashfvat naq uneq gb qroht. V'z cerggl 
fher gur frpbaq svk nobir nqqerffrf guvf: vs gur znva guernq unf orra 
vagreehcgrq nsgre svavfuvat vgf oybpxvat, gura gur pnyy gb vagreehcgrq() 
va pnapry() jvyy pyrne gur synt, naq v qba'g *guvax* gurer'f nal jnl gur 
jngpuqbt pna vagreehcg vg bapr vg'f pbzr bhg bs gung flapuebavmrq oybpx, 
orpnhfr gur jngpuqbt vgfrys jvyy unir orra vagreehcgrq, naq fb jvyy abg 
vagreehcg gur znva guernq. Evtug?