Re: suggestions for optimizing loading of an int array from disk
jonbbbb wrote:
> Hello,
>
> I was just wondering if there would be any suggestions for getting
> the following scenario to run faster.
>
> I have a program that loads some data from disk as a byte array.
> This byte data is actually a quite large list of ints that I want
> to use. So I first use read(byte[] b) to fill the byte array, then
> I fill the int array by walking the byte array and using some byte
> shifting to turn each group of 4 bytes into an int.
>
> If this were a C program I could just read it as a byte array and
> cast it to an int array without going through the painful loop of
> actually converting each int, right?
Only if you're lucky in matters of alignment, endianness,
and so on.
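
In Java the loop at least makes the byte order explicit. For
concreteness, a minimal sketch of the conversion the post describes,
assuming the file holds 32-bit ints in big-endian order (rearrange
the shifts if it doesn't):

    static int[] toInts(byte[] b) {
        int[] result = new int[b.length / 4];
        for (int i = 0; i < result.length; i++) {
            int off = 4 * i;
            // assemble one big-endian int from four bytes
            result[i] = ((b[off]     & 0xFF) << 24)
                      | ((b[off + 1] & 0xFF) << 16)
                      | ((b[off + 2] & 0xFF) <<  8)
                      |  (b[off + 3] & 0xFF);
        }
        return result;
    }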
> I suppose there is no way around it in Java. Would it make sense to
> write this in C and use JNI to get it back into Java? Any other
> ideas?
I/O time probably dominates, but making two copies (one
byte[] and one int[]) is irksome if they're large. Although I
haven't made any measurements (and I'm always berating others
for making performance claims without measuring), my *guess*
is that the expense of crossing the Java/JNI frontier would be
significant, and might well negate any gains from reducing the
memory footprint and GC workload.
Personally, I'd try to read a moderate-sized batch of bytes
and int-ize them before moving on to the next batch. Instead
of storing two huge arrays, you store one huge int[] and a much
smaller byte[], approximately halving the memory pressure.
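
Something along these lines, again assuming big-endian data; the
buffer size and the hypothetical 'count' parameter (number of ints,
e.g. file length / 4) are placeholders:

    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStream;

    static int[] readInts(InputStream in, int count) throws IOException {
        int[] ints = new int[count];
        byte[] buf = new byte[8192];    // modest, reusable batch buffer
        int filled = 0;                 // bytes currently sitting in buf
        int next = 0;                   // next free slot in ints[]
        while (next < count) {
            int n = in.read(buf, filled, buf.length - filled);
            if (n < 0)
                throw new EOFException("ran out of data");
            filled += n;
            int whole = (filled / 4) * 4;   // convert only complete ints
            for (int off = 0; off < whole && next < count; off += 4) {
                ints[next++] = ((buf[off]     & 0xFF) << 24)
                             | ((buf[off + 1] & 0xFF) << 16)
                             | ((buf[off + 2] & 0xFF) <<  8)
                             |  (buf[off + 3] & 0xFF);
            }
            // carry any leftover 0-3 bytes to the front of the buffer
            System.arraycopy(buf, whole, buf, 0, filled - whole);
            filled -= whole;
        }
        return ints;
    }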
"Philipp" suggests using DataInputStream, which is suitable
if the file stores its data in Java's chosen format. If the
bytes were produced by some other means, you'll need to do
the rearranging yourself.
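
If the ints really were written by DataOutputStream (or are big-endian
anyway), a sketch of that approach, deriving the count from the file
length:

    import java.io.BufferedInputStream;
    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;

    static int[] readInts(File file) throws IOException {
        int count = (int) (file.length() / 4);
        int[] ints = new int[count];
        DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file)));
        try {
            for (int i = 0; i < count; i++)
                ints[i] = in.readInt();    // reads 4 bytes, big-endian
        } finally {
            in.close();
        }
        return ints;
    }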
--
Eric.Sosman@sun.com