Re: suggestions for optimization loading of int array from disk
On Thu, 23 Apr 2009, jonbbbb wrote:
> I have a program that loads some data from disk as a byte array.
> This byte data is actually quite a large list of ints that I want to
> use.
Is it just ints, or is it mixed in with other stuff?
> So I first use read(byte[] b) to fill the byte array, then I fill the
> int array by going through the byte array and using some bit shifting
> to turn each group of 4 bytes into an int.
> If this were a C program I could just read it as a byte array and cast
> it to an int array without going through the painful loop of actually
> converting each int, right?
> I suppose there is no way around it in Java. Would it make sense to
> write this in C and use JNI to get it back into Java?
Hell no.
> Any other ideas?
Have a look in the java.nio package. There you will find a class called
ByteBuffer, which is a thing which holds a big load of bytes, and one
called IntBuffer, which does the same for ints. You will also find, in
java.nio.channels, some classes which can be used to read buffers from
disk; primarily FileChannel, but also Channels, which has a
newChannel(InputStream) method that you can use if you need to
interoperate with java.io streams.
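For the java.io case, a minimal sketch of wrapping a stream in a channel and filling a buffer from it might look like the following. The class name StreamToBuffer, the helper readInto and the sample bytes are made up for illustration; only Channels.newChannel, ReadableByteChannel.read and the ByteBuffer calls are the real API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

class StreamToBuffer {
    // Hypothetical helper: fills a buffer of the given capacity from a stream,
    // stopping at end of stream or when the buffer is full.
    static ByteBuffer readInto(InputStream in, int capacity) throws IOException {
        ReadableByteChannel chan = Channels.newChannel(in);
        ByteBuffer buf = ByteBuffer.allocate(capacity);
        while (buf.hasRemaining() && chan.read(buf) != -1) {
            // read() may deliver fewer bytes than asked for, so keep going
        }
        buf.flip(); // switch the buffer from being filled to being drained
        return buf;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real stream: eight bytes encoding the ints 7 and 9.
        InputStream in = new ByteArrayInputStream(new byte[] {0, 0, 0, 7, 0, 0, 0, 9});
        ByteBuffer buf = readInto(in, 16);
        System.out.println(buf.remaining() + " bytes read");
        System.out.println(buf.getInt());
        System.out.println(buf.getInt());
    }
}
```

The loop matters because one call to read is allowed to return before the buffer is full.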
If you now look again at ByteBuffer, you will see that it has a method
asIntBuffer, which makes an IntBuffer which is really a view on the
ByteBuffer - exactly like your evil cast in C.
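To see that the view really is just another way of looking at the same bytes, here is a small sketch that compares it with the hand-rolled shifting approach. The class IntViewDemo and its two helpers are invented for this illustration, and it assumes the data is big-endian, which is ByteBuffer's default byte order (call order() on the ByteBuffer first if your file is little-endian).

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

class IntViewDemo {
    // Hand-rolled conversion: assemble one big-endian int from 4 bytes.
    static int shiftDecode(byte[] b, int off) {
        return ((b[off] & 0xFF) << 24)
             | ((b[off + 1] & 0xFF) << 16)
             | ((b[off + 2] & 0xFF) << 8)
             |  (b[off + 3] & 0xFF);
    }

    // The same bytes seen through an IntBuffer view; nothing is copied.
    static IntBuffer asInts(byte[] b) {
        return ByteBuffer.wrap(b).asIntBuffer();
    }

    public static void main(String[] args) {
        byte[] raw = {0, 0, 1, 2, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFE};
        IntBuffer ints = asInts(raw);
        for (int i = 0; i < ints.limit(); i++) {
            System.out.println(ints.get(i) + " == " + shiftDecode(raw, i * 4));
        }
        // The view is live: changing a byte in the array changes the int.
        raw[3] = 3;
        System.out.println(ints.get(0)); // 259, was 258
    }
}
```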
Put all these bits together, and you have a clean, easy and safe way of
reading your file and getting access to it as ints.
Here's a little demo:
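import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.channels.ReadableByteChannel;

public class IntFile {
    public static void main(String... args) throws IOException {
        // try-with-resources closes the channel even if reading fails
        try (ReadableByteChannel chan = new FileInputStream(args[0]).getChannel()) {
            // Holds at most 1 MiB; anything in the file beyond that is not read
            ByteBuffer buf = ByteBuffer.allocate(1024 * 1024);
            // A single read() may return fewer bytes than requested, so loop
            // until the buffer is full or we reach end of file
            while (buf.hasRemaining() && chan.read(buf) != -1) {
                // keep reading
            }
            buf.flip();
            // View the bytes as ints (big-endian by default; see ByteBuffer.order)
            IntBuffer ibuf = buf.asIntBuffer();
            while (ibuf.hasRemaining()) {
                System.out.println(ibuf.get());
            }
        }
    }
}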
You can actually make it potentially even better than this, by using
FileChannel's map method, which memory-maps the file in as a buffer. That
avoids having to explicitly read it at all.
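A rough sketch of the memory-mapped variant, assuming Java 7 or later for FileChannel.open and a file smaller than 2 GB (a single mapping cannot exceed Integer.MAX_VALUE bytes). The class MappedIntFile, the helper mapInts and the throwaway sample file are illustrative only.

```java
import java.io.IOException;
import java.nio.IntBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedIntFile {
    // Hypothetical helper: maps the whole file read-only and exposes it as
    // ints (big-endian unless order() is changed on the mapped buffer).
    static IntBuffer mapInts(Path file) throws IOException {
        try (FileChannel chan = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return chan.map(FileChannel.MapMode.READ_ONLY, 0, chan.size()).asIntBuffer();
        }
    }

    public static void main(String[] args) throws IOException {
        // Throwaway sample file holding the ints 1 and -1.
        Path file = Files.createTempFile("ints", ".bin");
        Files.write(file, new byte[] {0, 0, 0, 1, -1, -1, -1, -1});
        IntBuffer ints = mapInts(file);
        while (ints.hasRemaining()) {
            System.out.println(ints.get());
        }
    }
}
```

Whether this beats an explicit read depends on the file size and access pattern, so measure before committing to it.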
Mind you, I doubt any of this is faster than just using
DataInputStream.readInt if you only need sequential access.
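For comparison, the plain sequential version could be sketched like this. The class SequentialInts, the helper readInts and the sample file are invented for the example, and the BufferedInputStream wrapper is my own assumption (without it every readInt goes to the underlying file stream). readInt reads big-endian ints, the same order ByteBuffer uses by default.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class SequentialInts {
    // Hypothetical helper: reads exactly `count` ints from the start of the file.
    // readInt throws EOFException if the file is shorter than that.
    static int[] readInts(Path file, int count) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(file.toFile())))) {
            int[] out = new int[count];
            for (int i = 0; i < count; i++) {
                out[i] = in.readInt();
            }
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        // Throwaway sample file holding the ints 258 and -2.
        Path file = Files.createTempFile("ints", ".bin");
        Files.write(file, new byte[] {0, 0, 1, 2, -1, -1, -1, -2});
        for (int value : readInts(file, 2)) {
            System.out.println(value);
        }
    }
}
```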
tom
--
unconstrained by any considerations of humanity or decency