Re: Does object pooling *ever* make sense?
On 10 Feb, 10:24, "pascal.lecoi...@euriware.fr"
<pascal.lecoi...@euriware.fr> wrote:
On 10 Feb, 09:32, "pascal.lecoi...@euriware.fr"
<pascal.lecoi...@euriware.fr> wrote:
On 10 Feb, 02:26, Chris <spam_me_...@goaway.com> wrote:
looks like object pooling is a necessity when speed is important and
objects get larger than a few dozen KB in size.
That's far too simplistic a statement. It may be true where your usage
pattern is that you allocate a huge buffer, don't care what's initially in it,
only write to the first byte, never read from it, don't ever need different
ones at the same time, and are running on a specific platform.
Here's a new benchmark:
I created an array of 1000 128K buffers. This is enough to ensure that
most of them are in main memory, not the processor cache.
I ran Arrays.fill(buffer, (byte)0) on the list 100 times, for a total of
100,000 calls.
The elapsed time? 40 seconds, almost exactly the amount of time required
to allocate 100,000 128K buffers in the previous benchmark. This
indicates that zeroing the arrays probably does take up most of the
time when allocating new objects.
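For reference, a minimal sketch of the fill benchmark described above. The original code was not posted, so the class and method names here are made up and this is only a guess at its shape; it uses the figures from the post (1000 buffers of 128K, 100 passes) and needs roughly 128 MB of heap.

```java
import java.util.Arrays;

/** Hypothetical reconstruction of the fill benchmark; not the poster's code. */
final class FillBenchmark {

    /** Allocates `count` buffers of `size` bytes each. */
    static byte[][] allocate(int count, int size) {
        return new byte[count][size];
    }

    /** Zeroes every buffer `passes` times and returns the elapsed time in ms. */
    static long timeFill(byte[][] buffers, int passes) {
        long start = System.nanoTime();
        for (int p = 0; p < passes; p++) {
            for (byte[] buffer : buffers) {
                Arrays.fill(buffer, (byte) 0);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // 1000 buffers x 100 passes = 100,000 fill calls, as in the post.
        byte[][] buffers = allocate(1000, 128 * 1024);
        long elapsedMs = timeFill(buffers, 100);
        System.out.printf("FILL : %d calls, elapsed = %d ms%n",
                buffers.length * 100, elapsedMs);
    }
}
```

The absolute numbers will depend heavily on the JVM and the machine, so only the comparison with the allocation benchmark on the same setup means anything.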
This does support my claim that object pooling is a necessity when using
large objects, *except* in cases where zeroing is necessary.
Which raises a question: wouldn't it be a good idea to have a way to
tell Java, in special cases, not to zero out allocated objects?
And that makes pooling useful only for primitive arrays. For
reference arrays, you *must* null out the array when you put it back in
the pool, otherwise the GC cannot collect the objects referenced by
the array (and bingo, a memory leak :)).
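To make the point about reference arrays concrete, here is a minimal sketch of a one-slot pool that clears the array on release. RefArrayPool and its methods are hypothetical names invented for this example, not something from the thread.

```java
import java.util.Arrays;

/** Hypothetical single-slot pool for an Object[]; shows clearing on release. */
final class RefArrayPool {
    private final Object[] pooled;
    private boolean inUse = false;

    RefArrayPool(int size) {
        pooled = new Object[size];
    }

    /** Hands out the pooled array, or fails if it is already taken. */
    synchronized Object[] acquire() {
        if (inUse) {
            throw new IllegalStateException("pool already in use");
        }
        inUse = true;
        return pooled;
    }

    /** Takes the array back and drops every reference it still holds. */
    synchronized void release(Object[] array) {
        if (array != pooled) {
            throw new IllegalArgumentException("array does not belong to this pool");
        }
        // Without this, the idle pool would keep the old referents reachable
        // and the GC could never collect them.
        Arrays.fill(array, null);
        inUse = false;
    }
}
```

So for reference arrays the pool pays for a full pass over the array on every release, which is exactly the cost pooling was supposed to avoid.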
And the time to zero may not be the same for different primitive types
(alignment to 32 or 64 bits): perhaps zeroing an array of int takes
less time than zeroing the equivalent array of byte.
So, pooling is useful:
- when it's an array of primitives
- and when you don't have to zero the array when you take it from the
pool
And there is the problem of synchronizing the pool... to compare,
you must run both options.
I'll be back with a complete test :)
<scce>
/**
* Test of the utility of pool for primitive arrays
*/
public class TestPool {
public static final int ITERATIONS = 100000;
public static final int NB_INCREMENTS = 19;
public static void main(String[] args) {
testByteWithPool();
testByteWithoutPool();
testIntWithPool();
testIntWithoutPool();
}
/** An object to lock the pool when we use it */
private static Object poolLock = new Object();
private static boolean poolUsed = false;
private static byte[] bytePool;
private static int[] intPool;
private static Object[] refPool;
private static void testByteWithPool() {
int currentSize = 1;
for (int j = 0; j < NB_INCREMENTS; j++) {
long t1 = System.nanoTime();
// Creation of the pool
synchronized (poolLock) {
bytePool = new byte[currentSize];
poolUsed = false;
}
for (int i = 0; i < ITERATIONS; i++) {
byte[] array = null;
// Get the array from pool
synchronized (poolLock) {
if (poolUsed) {
throw new RuntimeException("pool already in use");
}
array = bytePool;
poolUsed = true;
}
// use the array
array[0] = 1;
// put back the array in the pool
synchronized (poolLock) {
poolUsed = false;
}
}
long t2 = System.nanoTime();
System.out.printf("BYTE POOL : bufsize %d elapsed = %d\n",
currentSize, (t2-t1)/1000000);
currentSize *= 2;
}
}
private static void testByteWithoutPool() {
int currentSize = 1;
for (int j = 0; j < NB_INCREMENTS; j++) {
long t1 = System.nanoTime();
for (int i = 0; i < ITERATIONS; i++) {
// Get the array
byte[] array = new byte[currentSize];
// use the array
array[0] = 1;
}
long t2 = System.nanoTime();
System.out.printf("BYTE NOPOOL : bufsize %d elapsed = %d\n",
currentSize, (t2-t1)/1000000);
currentSize *= 2;
}
}
private static void testIntWithPool() {
int currentSize = 1;
for (int j = 0; j < NB_INCREMENTS; j++) {
long t1 = System.nanoTime();
// Creation of the pool
synchronized (poolLock) {
intPool = new int[currentSize];
poolUsed = false;
}
for (int i = 0; i < ITERATIONS; i++) {
int[] array = null;
// Get the array from pool
synchronized (poolLock) {
if (poolUsed) {
throw new RuntimeException("pool already in use");
}
array = intPool;
poolUsed = true;
}
// use the array
array[0] = 1;
// put back the array in the pool
synchronized (poolLock) {
poolUsed = false;
}
}
long t2 = System.nanoTime();
System.out.printf("INT POOL : bufsize %d elapsed = %d\n",
currentSize, (t2-t1)/1000000);
currentSize *= 2;
}
}
private static void testIntWithoutPool() {
int currentSize = 1;
for (int j = 0; j < NB_INCREMENTS; j++) {
long t1 = System.nanoTime();
for (int i = 0; i < ITERATIONS; i++) {
// Get the array
int[] array = new int[currentSize];
// use the array
array[0] = 1;
}
long t2 = System.nanoTime();
System.out.printf("INT NOPOOL : bufsize %d elapsed = %d\n",
currentSize, (t2-t1)/1000000);
currentSize *= 2;
}
}
}
</scce>
And the results are (elapsed time in ms for 100,000 iterations):
BYTE POOL : bufsize 1 elapsed = 8
BYTE POOL : bufsize 2 elapsed = 4
BYTE POOL : bufsize 4 elapsed = 4
BYTE POOL : bufsize 8 elapsed = 4
BYTE POOL : bufsize 16 elapsed = 3
BYTE POOL : bufsize 32 elapsed = 4
BYTE POOL : bufsize 64 elapsed = 4
BYTE POOL : bufsize 128 elapsed = 4
BYTE POOL : bufsize 256 elapsed = 1
BYTE POOL : bufsize 512 elapsed = 6
BYTE POOL : bufsize 1024 elapsed = 4
BYTE POOL : bufsize 2048 elapsed = 3
BYTE POOL : bufsize 4096 elapsed = 2
BYTE POOL : bufsize 8192 elapsed = 4
BYTE POOL : bufsize 16384 elapsed = 6
BYTE POOL : bufsize 32768 elapsed = 2
BYTE POOL : bufsize 65536 elapsed = 6
BYTE POOL : bufsize 131072 elapsed = 4
BYTE POOL : bufsize 262144 elapsed = 4
BYTE NOPOOL : bufsize 1 elapsed = 3
BYTE NOPOOL : bufsize 2 elapsed = 1
BYTE NOPOOL : bufsize 4 elapsed = 1
BYTE NOPOOL : bufsize 8 elapsed = 4
BYTE NOPOOL : bufsize 16 elapsed = 3
BYTE NOPOOL : bufsize 32 elapsed = 4
BYTE NOPOOL : bufsize 64 elapsed = 8
BYTE NOPOOL : bufsize 128 elapsed = 19
BYTE NOPOOL : bufsize 256 elapsed = 19
BYTE NOPOOL : bufsize 512 elapsed = 49
BYTE NOPOOL : bufsize 1024 elapsed = 80
BYTE NOPOOL : bufsize 2048 elapsed = 162
BYTE NOPOOL : bufsize 4096 elapsed = 308
BYTE NOPOOL : bufsize 8192 elapsed = 651
BYTE NOPOOL : bufsize 16384 elapsed = 1210
BYTE NOPOOL : bufsize 32768 elapsed = 2389
BYTE NOPOOL : bufsize 65536 elapsed = 4750
BYTE NOPOOL : bufsize 131072 elapsed = 9634
BYTE NOPOOL : bufsize 262144 elapsed = 19496
INT POOL : bufsize 1 elapsed = 6
INT POOL : bufsize 2 elapsed = 4
INT POOL : bufsize 4 elapsed = 4
INT POOL : bufsize 8 elapsed = 4
INT POOL : bufsize 16 elapsed = 6
INT POOL : bufsize 32 elapsed = 4
INT POOL : bufsize 64 elapsed = 4
INT POOL : bufsize 128 elapsed = 2
INT POOL : bufsize 256 elapsed = 4
INT POOL : bufsize 512 elapsed = 4
INT POOL : bufsize 1024 elapsed = 4
INT POOL : bufsize 2048 elapsed = 4
INT POOL : bufsize 4096 elapsed = 4
INT POOL : bufsize 8192 elapsed = 7
INT POOL : bufsize 16384 elapsed = 4
INT POOL : bufsize 32768 elapsed = 4
INT POOL : bufsize 65536 elapsed = 2
INT POOL : bufsize 131072 elapsed = 4
INT POOL : bufsize 262144 elapsed = 4
INT NOPOOL : bufsize 1 elapsed = 5
INT NOPOOL : bufsize 2 elapsed = 5
INT NOPOOL : bufsize 4 elapsed = 0
INT NOPOOL : bufsize 8 elapsed = 4
INT NOPOOL : bufsize 16 elapsed = 5
INT NOPOOL : bufsize 32 elapsed = 11
INT NOPOOL : bufsize 64 elapsed = 25
INT NOPOOL : bufsize 128 elapsed = 42
INT NOPOOL : bufsize 256 elapsed = 80
INT NOPOOL : bufsize 512 elapsed = 161
INT NOPOOL : bufsize 1024 elapsed = 306
INT NOPOOL : bufsize 2048 elapsed = 604
INT NOPOOL : bufsize 4096 elapsed = 1180
INT NOPOOL : bufsize 8192 elapsed = 2394
INT NOPOOL : bufsize 16384 elapsed = 4763
INT NOPOOL : bufsize 32768 elapsed = 9541
INT NOPOOL : bufsize 65536 elapsed = 18997
INT NOPOOL : bufsize 131072 elapsed = 38034
INT NOPOOL : bufsize 262144 elapsed = 76232
So we can conclude:
- the cost of using the pool for a primitive array is constant, whatever
the array size (but my pool is very light, a single object; the overhead
of a real pool will be much higher)
- zeroing an array of int takes about four times as long as zeroing an
array of byte with the same number of elements (an int is four bytes)
- the time is very short: for a byte array of 256K, creating and
initializing one array takes about 0.19 ms (19496 ms / 100,000)
So it really depends on whether array creation is really that intense
in your program.
As always, profile first, optimize after :)
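To give an idea of what a slightly more realistic pool might look like, here is a rough sketch that holds several buffers and allocates a fresh one when it runs dry. ByteBufferPool, maxPooled and pooledCount are names invented for this example; nothing here was measured in the test above.

```java
import java.util.ArrayDeque;

/** Hypothetical multi-buffer pool for byte[]; a sketch, not a tuned design. */
final class ByteBufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    ByteBufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    /** Returns a pooled buffer if one is free, otherwise allocates a new one. */
    byte[] acquire() {
        byte[] buffer;
        synchronized (this) {
            buffer = free.pollFirst();
        }
        // A reused buffer still holds whatever the previous user wrote into it.
        return (buffer != null) ? buffer : new byte[bufferSize];
    }

    /** Takes a buffer back; extra buffers are simply left to the GC. */
    synchronized void release(byte[] buffer) {
        if (buffer.length == bufferSize && free.size() < maxPooled) {
            free.addFirst(buffer);
        }
    }

    /** Number of buffers currently sitting idle in the pool. */
    synchronized int pooledCount() {
        return free.size();
    }
}
```

Every acquire and release goes through a lock and a queue operation, so under contention this will cost more than the single-object pool in the test.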
I did the same test, this time zeroing the array when I take it from the
pool.
The results are (elapsed time in ms):
BYTE POOL : bufsize 1 elapsed = 10
BYTE POOL : bufsize 2 elapsed = 9
BYTE POOL : bufsize 4 elapsed = 7
BYTE POOL : bufsize 8 elapsed = 8
BYTE POOL : bufsize 16 elapsed = 11
BYTE POOL : bufsize 32 elapsed = 13
BYTE POOL : bufsize 64 elapsed = 32
BYTE POOL : bufsize 128 elapsed = 45
BYTE POOL : bufsize 256 elapsed = 83
BYTE POOL : bufsize 512 elapsed = 158
BYTE POOL : bufsize 1024 elapsed = 314
BYTE POOL : bufsize 2048 elapsed = 620
BYTE POOL : bufsize 4096 elapsed = 1230
BYTE POOL : bufsize 8192 elapsed = 2470
BYTE POOL : bufsize 16384 elapsed = 4924
BYTE NOPOOL : bufsize 1 elapsed = 7
BYTE NOPOOL : bufsize 2 elapsed = 2
BYTE NOPOOL : bufsize 4 elapsed = 0
BYTE NOPOOL : bufsize 8 elapsed = 2
BYTE NOPOOL : bufsize 16 elapsed = 4
BYTE NOPOOL : bufsize 32 elapsed = 4
BYTE NOPOOL : bufsize 64 elapsed = 7
BYTE NOPOOL : bufsize 128 elapsed = 20
BYTE NOPOOL : bufsize 256 elapsed = 25
BYTE NOPOOL : bufsize 512 elapsed = 39
BYTE NOPOOL : bufsize 1024 elapsed = 82
BYTE NOPOOL : bufsize 2048 elapsed = 161
BYTE NOPOOL : bufsize 4096 elapsed = 308
BYTE NOPOOL : bufsize 8192 elapsed = 602
BYTE NOPOOL : bufsize 16384 elapsed = 1192
INT POOL : bufsize 1 elapsed = 9
INT POOL : bufsize 2 elapsed = 6
INT POOL : bufsize 4 elapsed = 6
INT POOL : bufsize 8 elapsed = 10
INT POOL : bufsize 16 elapsed = 13
INT POOL : bufsize 32 elapsed = 13
INT POOL : bufsize 64 elapsed = 21
INT POOL : bufsize 128 elapsed = 45
INT POOL : bufsize 256 elapsed = 84
INT POOL : bufsize 512 elapsed = 161
INT POOL : bufsize 1024 elapsed = 320
INT POOL : bufsize 2048 elapsed = 618
INT POOL : bufsize 4096 elapsed = 1252
INT POOL : bufsize 8192 elapsed = 2478
INT POOL : bufsize 16384 elapsed = 4953
INT NOPOOL : bufsize 1 elapsed = 5
INT NOPOOL : bufsize 2 elapsed = 1
INT NOPOOL : bufsize 4 elapsed = 2
INT NOPOOL : bufsize 8 elapsed = 6
INT NOPOOL : bufsize 16 elapsed = 6
INT NOPOOL : bufsize 32 elapsed = 7
INT NOPOOL : bufsize 64 elapsed = 24
INT NOPOOL : bufsize 128 elapsed = 39
INT NOPOOL : bufsize 256 elapsed = 82
INT NOPOOL : bufsize 512 elapsed = 152
INT NOPOOL : bufsize 1024 elapsed = 302
INT NOPOOL : bufsize 2048 elapsed = 617
INT NOPOOL : bufsize 4096 elapsed = 1201
INT NOPOOL : bufsize 8192 elapsed = 2399
INT NOPOOL : bufsize 16384 elapsed = 4790
The "manual" zeroing of the byte array takes about four times as long as
the implicit zeroing done by the new operator; for the int array, the two
take about the same time.
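For anyone who wants to repeat the comparison, a minimal sketch of the zero-on-take variant next to plain allocation. How the array was zeroed in the run above is not stated; Arrays.fill is an assumption here, and ZeroOnAcquire and its method names are made up. The synchronization from the original test is left out to keep it short.

```java
import java.util.Arrays;

/** Hypothetical harness: pooled array zeroed on take vs. a fresh allocation. */
final class ZeroOnAcquire {

    /** Zeroes the pooled array so it looks like a freshly allocated one. */
    static byte[] acquireZeroed(byte[] pooled) {
        Arrays.fill(pooled, (byte) 0);
        return pooled;
    }

    /** Times `iterations` takes from a one-array pool, zeroing each time. */
    static long timePooled(int size, int iterations) {
        byte[] pooled = new byte[size];
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            byte[] array = acquireZeroed(pooled);
            array[0] = 1; // use the array
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Times `iterations` fresh allocations of the same size. */
    static long timeFresh(int size, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            byte[] array = new byte[size];
            array[0] = 1; // use the array
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int iterations = 100_000;
        for (int size = 1; size <= 16384; size *= 2) {
            System.out.printf("bufsize %d : pool+zero = %d ms, new = %d ms%n",
                    size, timePooled(size, iterations), timeFresh(size, iterations));
        }
    }
}
```

A simple loop like this is at the mercy of the JIT (warm-up, dead-code elimination), so take the output as a rough indication only.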