Re: Get performance statistics?
Robert Klemme wrote:
On 27.11.2006 17:40, Patricia Shanahan wrote:
Robert Klemme wrote:
On 27.11.2006 17:06, Patricia Shanahan wrote:
Robert Klemme wrote:
On 27.11.2006 08:17, Daniel Pitts wrote:
Patricia Shanahan wrote:
I would like to collect, inside a Java application, statistics such as
the amount of CPU time used. Any idea how?
I can, of course, measure the elapsed time, but that does not tell me
how much time was spent actually computing vs. waiting for disk.
Patricia
A quick Google search leads me to believe you might need to use JNI (and
therefore have a platform-specific solution):
<http://www.google.com/search?q=java+system+monitoring>
Example for CPU on Win32
<http://www.javaworld.com/javaworld/javaqa/2002-11/01-qa-1108-cpu.html>
JVMTI might also help:
http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/index.html
http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/jvmti.html#timers
A low level solution would be to create a TimedInputStream and
TimedOutputStream which measure and sum up the time spent in write()
and read(). You could then subtract that from the wall clock time for this
thread. If you want to get more fancy those streams could register
themselves with some thread global counter so you automatically get
all IO timings if you make sure every stream is replaced (not easy
though with 3rd party libs like JDBC drivers). It depends on what
you actually want to measure and to what level of detail.
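A minimal sketch of the TimedInputStream idea described above, assuming a plain FilterInputStream wrapper is enough. The class and method names (TimedInputStream, getNanosInRead) are hypothetical, and the thread-global counter is left out. It only captures time spent in explicit read() calls, so it would not show delays caused by paging.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: accumulates wall-clock time spent inside read().
class TimedInputStream extends FilterInputStream {
    private long nanosInRead = 0;

    TimedInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        long start = System.nanoTime();
        try {
            return in.read();
        } finally {
            nanosInRead += System.nanoTime() - start;
        }
    }

    // FilterInputStream.read(byte[]) delegates here, so bulk reads are timed too.
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        long start = System.nanoTime();
        try {
            return in.read(b, off, len);
        } finally {
            nanosInRead += System.nanoTime() - start;
        }
    }

    // Total nanoseconds spent blocked in read() so far.
    long getNanosInRead() {
        return nanosInRead;
    }
}
```

A TimedOutputStream would follow the same pattern around write(). Subtracting getNanosInRead() from the thread's elapsed time gives a rough estimate of the non-I/O time.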
JVMTI looks interesting. The disk accesses that I'm worried about are
due to paging, not explicit requests, but JVMTI does have CPU time
collection.
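For per-thread CPU time without native code, one option not mentioned in this thread is the java.lang.management API added in Java 5. The following is a sketch under that assumption, not a JVMTI example; the class name CpuTimeProbe and the loop used as a workload are made up for illustration, and CPU time support depends on the JVM and platform.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper comparing CPU time with elapsed time for the current thread.
class CpuTimeProbe {
    // CPU nanoseconds used by the current thread, or -1 if unsupported or disabled.
    static long currentThreadCpuNanos() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported()) {
            return -1L;
        }
        return bean.getCurrentThreadCpuTime();
    }

    public static void main(String[] args) {
        long wallStart = System.nanoTime();
        long cpuStart = currentThreadCpuNanos();

        // Stand-in workload; a real run would execute the simulation here.
        double sum = 0;
        for (int i = 1; i < 5000000; i++) {
            sum += Math.sqrt(i);
        }

        long cpuEnd = currentThreadCpuNanos();
        long wallNanos = System.nanoTime() - wallStart;
        if (cpuStart < 0 || cpuEnd < 0) {
            System.out.println("CPU time not available on this JVM");
        } else {
            System.out.println("cpu ns = " + (cpuEnd - cpuStart)
                    + ", wall ns = " + wallNanos + " (sum = " + sum + ")");
        }
    }
}
```

A large gap between wall time and CPU time would suggest the thread spent its time waiting (for example on paging) rather than computing. This measures only the calling thread; a multi-threaded job would need to sum getThreadCpuTime(id) over its worker threads.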
If it is just for one-time debugging (i.e. not necessarily part of a
product; I am not 100% sure from what you wrote) you could use OS
specific tools. On Windows that should be fairly easy with PerfMon
and on Linux you can use iostat, vmstat and relatives.
The lack of clarity about debug vs product is inherent in the nature of
the application. It is part of a CS research project. Understanding the
behavior of the program is part of the product.
That sounds interesting! Are you allowed to disclose more detail?
I'm doing research in ubiquitous computing. My adviser is Bill Griswold
- you can see some of the sort of work by looking at his home page,
http://www.cs.ucsd.edu/~wgg/
My particular line is applying machine learning to ubiquitous computing.
Machine learning algorithms can easily get into time and/or space
trouble.
My immediate problem is whether runs that take 24 hours do so
because they are thrashing, or because of sheer CPU time. It is harder
than it sounds, because the largest jobs only run on a grid computer
where I have limited access to the compute elements. However, the
performance data is something I should collect so that I can put some
statistics in papers.
Each job reads a small XML parameter file describing a simulation, sets
up and runs it, and outputs a very slightly larger XML file containing
the results. Because of the use of XML, I can add e.g. performance data
to the output file without disturbing my output analysis programs.
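As an illustration of adding performance data to an XML result without disturbing existing readers, here is a sketch using the standard DOM API. The element and attribute names (results, performance, cpuNanos, wallNanos) are invented for the example; the actual output format is not shown in this thread.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper that appends a <performance> element to the root of a result document.
class PerfXml {
    static Element addPerformance(Document doc, long cpuNanos, long wallNanos) {
        Element perf = doc.createElement("performance");
        perf.setAttribute("cpuNanos", Long.toString(cpuNanos));
        perf.setAttribute("wallNanos", Long.toString(wallNanos));
        // Appending a new child leaves the existing result elements untouched.
        doc.getDocumentElement().appendChild(perf);
        return perf;
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("results")); // stand-in for the real output
        addPerformance(doc, 1200L, 3400L);
        System.out.println(
                doc.getDocumentElement().getElementsByTagName("performance").getLength());
    }
}
```

Analysis programs that look up elements by name would simply ignore the extra element, which is the property being relied on here.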
Yes, there are basically two alternatives. Ideally, I would like the
data to appear in the output file, so that it is packaged with the rest
of the information about the run. However, I may go outside.
Ah, I see.
Either way, I'm afraid it is going to be less convenient than my current
lifestyle - one makefile to control the runs, one Jar file to contain my
program, and it all works on my home system, works on my university
desktop, and runs dozens of jobs in parallel on a large grid computer.
Oh, you are using "make"? I am so glad that I did not have to touch
"make" for years now. "ant" is a really great alternative when in the
Java world.
I believe there is a generic way to launch a shared lib via the java
command line. As long as you create that for all platforms involved and
make sure it's installed you could still get away with the single make.
I never closely looked at Java WebStart but I figure it might contain
features to also install binary extensions - might be worth a look.
I have no ability to install software on the grid I use for bulk
runs. It is shared and has its own administration team. I install Cygwin
on the Windows machines I do control (my home desktop, desktop at UCSD,
laptop, and tablet). The grid has a grid-aware make, qmake, installed.
I have an EXTREMELY simple makefile, so the usual issues don't arise. It
is just a portable way of managing runs. Depending on which command and
command line parameters I use, it can do one job at a time, or 50.
Patricia