Re: problems getting stderr and stdout from process object
On Mon, 28 Sep 2009, Daniel Pitts wrote:
Peter Duniho wrote:
On Thu, 24 Sep 2009 11:52:19 -0700, Daniel Pitts
Actually, what may be happening is that the process itself isn't finishing
because it is blocking on something else altogether. It is also possible
that the stderr buffer is full, so the process blocks until *it* is drained.
In general, you will need to read the stdout and stderr from separate
threads. This is a flaw in the API in my opinion.
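For illustration, a minimal sketch of the usual two-thread workaround. Everything here is an assumed example, not a quote from the thread; in particular the `sh -c` command is just a stand-in for any POSIX child process that writes to both streams:

```java
import java.io.*;

// One "gobbler" thread per stream drains the child's output so the
// child never blocks on a full pipe buffer.
class StreamGobbler extends Thread {
    private final InputStream in;
    private final StringBuilder sink = new StringBuilder();

    StreamGobbler(InputStream in) {
        this.in = in;
    }

    @Override
    public void run() {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = r.readLine()) != null) {
                sink.append(line).append('\n');
            }
        } catch (IOException e) {
            // A pipe closing when the child exits is normal; real code
            // might log anything else.
        }
    }

    String contents() {
        return sink.toString();   // safe to call after join()
    }
}

public class TwoThreadRead {
    public static void main(String[] args) throws Exception {
        Process p = Runtime.getRuntime().exec(
                new String[] {"sh", "-c", "echo out; echo err 1>&2"});
        StreamGobbler out = new StreamGobbler(p.getInputStream());
        StreamGobbler err = new StreamGobbler(p.getErrorStream());
        out.start();
        err.start();
        p.waitFor();   // safe: both pipes are being drained concurrently
        out.join();
        err.join();
        System.out.print("stdout: " + out.contents());
        System.out.print("stderr: " + err.contents());
    }
}
```

The point of the sketch is only that neither pipe can back up while you wait, which is exactly the property the single-threaded naive version lacks.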
I'm curious: how would you fix it?
You cannot buffer the output for either stdout or stderr indefinitely. It
would not even be practical or robust to buffer until a memory allocation
simply fails. So the question becomes: what do you do when the buffer
fills up? You have two obvious choices: discard data, or block output until
room is made.
Do you see some other practical alternative?
The problem I have isn't with the blocking-on-buffer-fill behavior, but with
the fact that you *must* have two threads running to use this API.
I can think of three alternatives off the top of my head:
1. Don't use two InputStream instances, but instead use a new kind of IO
class designed to handle interleaved data. It would allow better correlation
between events in each "stream", and it would allow you to read the streams
in the current thread, without spawning a new one.
Does available() work on the stdout from a child process? If so, I think
you could implement single-thread interleaved (or at least interleave-ish)
IO on top of the current API.
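As a sketch of that idea, assuming available() does report something useful for the Process pipe streams (the InputStream contract only promises an estimate, so this is best-effort polling, not a real select()); the `sh -c` command is again an assumed stand-in:

```java
import java.io.*;

// Single-thread, interleave-ish draining of a child's stdout and stderr
// by polling available() on each stream in turn.
public class PollBothStreams {
    // Copy whatever available() says can be read without blocking.
    static int drain(InputStream in, OutputStream sink) throws IOException {
        int n = in.available();
        if (n > 0) {
            byte[] buf = new byte[n];
            int read = in.read(buf);
            if (read > 0) sink.write(buf, 0, read);
            return read;
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        Process p = Runtime.getRuntime().exec(
                new String[] {"sh", "-c", "echo out; echo err 1>&2"});
        InputStream out = p.getInputStream();
        InputStream err = p.getErrorStream();
        ByteArrayOutputStream outBuf = new ByteArrayOutputStream();
        ByteArrayOutputStream errBuf = new ByteArrayOutputStream();
        boolean running = true;
        while (running || out.available() > 0 || err.available() > 0) {
            drain(out, outBuf);
            drain(err, errBuf);
            try {
                p.exitValue();      // throws while the child is still alive
                running = false;
            } catch (IllegalThreadStateException alive) {
                Thread.sleep(10);   // avoid a busy spin
            }
        }
        System.out.print("stdout: " + outBuf);
        System.out.print("stderr: " + errBuf);
    }
}
```

The sleep-and-poll loop is the price of not having a real readiness API; it trades a thread for latency.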
2. Offer some sort of "select()" based waiting for the streams. This allows
one thread to handle multiple streams.
This would be really useful.
3. 1 and 2 combined.
What would also be useful would be a way to redirect the child's input or
output from or to a file (or /dev/null or its equivalent). You could then
run a process, feed it input, wait for it to finish, then read the output
file. Only one thread needed, and no IO weirdness.
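For what it's worth, this redirect-to-file idea did eventually land in the platform: since Java 7, ProcessBuilder can wire the child's streams to files directly. A minimal sketch, with an assumed `sh -c` child:

```java
import java.io.*;
import java.nio.file.*;

// Redirect the child's output to a file, so a single thread can simply
// waitFor() the child and read the file afterwards -- no pipe draining.
public class RedirectToFile {
    public static void main(String[] args) throws Exception {
        File outFile = File.createTempFile("child", ".out");
        ProcessBuilder pb = new ProcessBuilder("sh", "-c", "echo hello");
        pb.redirectOutput(outFile);        // Java 7+
        pb.redirectErrorStream(true);      // fold stderr into the same file
        Process p = pb.start();
        p.waitFor();                       // only one thread, no IO weirdness
        System.out.print(new String(Files.readAllBytes(outFile.toPath())));
        outFile.delete();
    }
}
```

There is also Redirect.INHERIT and, on POSIX systems, redirecting to /dev/null by pointing the redirect at that file.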
It would be important for the select() to also support "OutputStream"
readiness, because you could otherwise end up with a deadlock (the
process is supposed to receive more input, but the output buffer is
full, so the whole thing might block).
As far as those two choices go, I'm quite pleased that the design choice
made was to block output, rather than to discard data.
I suppose that now, with NIO (which came well after Process), the API could
provide a SelectableChannel implementation, allowing a single thread to
process more than one stream. But, the main motivation for that NIO
feature is to avoid the creation of thousands of threads when you have that
many streams to deal with; a process is only going to have at most these
two output streams, so all the work to implement a SelectableChannel just
to avoid the creation of one extra thread seems like overkill to me.
Creating one thread is more than just run-time overhead. There is a
development cost to multi-threading. You are more prone to deadlocks,
synchronization problems, and much more, when you create a new Thread.
Yes, those problems can be avoided, but it's *much* more to think about.
If all your thread is doing is pumping data into the child process, there
is nothing more to think about. The above paragraph looks mostly like FUD
to me - no offence, I just think this is a manifestation of the
superstitious fear of threads that is commonplace in the Java world.
If a scientist were to cut his ear off, no one would take it as evidence
of heightened sensibility -- Peter Medawar