Re: Hash table performance
On Sat, 21 Nov 2009, markspace wrote:
Jon Harrop wrote:
My guess is that this is because the JVM is boxing every floating point
number individually in the hash table due to type erasure whereas .NET
creates a specialized data structure specifically for a float->float hash
table with the floats unboxed. Consequently, the JVM is doing enormously
numbers of allocations whereas .NET is not.
3. Autoboxing appears to have an impact but it's less than either #1 or #2.
Lastly, removing the autoboxing by writing a specialized class saved
about 0.7 second from just using HashMap. The specialized version time
was 5.3 seconds. Note that I allocated a new object myself -- the
"Entry" for the HashMap -- each time a new double is added. I suspect
that any reasonable C# hash object must do the same, so you're not
losing as much time as you think by Java's auto object creation.
Hang on, if i've read your code right, then what you've done is replaced
automatic boxing with manual boxing. The point is that in .NET, there is
*no* boxing in this case - doubles are stored unwrapped in the map.
Although in your code, you put both doubles in a single box, so if boxing
is a slowdown, your code should roughly halve it.
I took the liberty of converting java's HashMap to work directly with
primitive doubles:
http://urchin.earth.li/~twic/Code/DoubleMap.java
I haven't tested that this implementation is correct, but on the benchmark
i posted a little earlier, it comes out 17% faster than a HashMap with
boxing.
So, 7.5% for synchronization, 17% for boxing - we're still a good way off
this reported 32x!
tom
<code>
package cljp;
import java.util.HashMap;
import java.util.Hashtable;
import static java.lang.System.out;
public class HashTest
{
public static void main( String[] args )
{
Hashtbl.test();
HashM.test();
HashTest.test();
}
HashMap<Entry,Entry> hash = new HashMap<Entry,Entry>();
private static class Entry {
final double key;
final double value;
public Entry( double key, double value )
{
this.key = key;
this.value = value;
}
@Override
public int hashCode()
{
long bits = Double.doubleToLongBits( key );
bits ^= bits >>> 32;
return (int)bits;
}
@Override
public boolean equals( Object obj )
{
if( !(obj instanceof Entry ) ){
return false;
}
return key == ((Entry)obj).key;
}
}
public void put( double key, double value ) {
Entry e = new Entry(key, value );
hash.put( e, e );
}
public double get( double key ) {
Entry e = new Entry( key, 0.0 );
Entry valueEntry = hash.get( e );
if( valueEntry != null ) {
return valueEntry.value;
} else {
throw new IllegalArgumentException("Not found: "+key);
}
}
public static void test()
{
long start = System.nanoTime();
HashTest hashTest = new HashTest();
for( int i = 1; i <= 10000000; ++i ) {
double x = i;
hashTest.put( x, 1.0 / x );
}
long end = System.nanoTime();
out.println( "HashTest time: "+ (end-start)/1000000 );
out.println( "hashtable(100.0) = " +
hashTest.get( 100.0 ) );
}
}
class HashM
{
public static void test()
{
long start = System.nanoTime();
HashMap<Double,Double> hashM = new HashMap<Double, Double>();
for( int i = 1; i <= 10000000; ++i ) {
double x = i;
hashM.put( x, 1.0 / x );
}
long end = System.nanoTime();
out.println( "HashMap time: "+ (end-start)/1000000 );
out.println( "hashtable(100.0) = " +
hashM.get( 100.0 ) );
}
}
class Hashtbl
{
public static void test()
{
long start = System.nanoTime();
Hashtable hashtable = new Hashtable();
for( int i = 1; i <= 10000000; ++i ) {
double x = i;
hashtable.put( x, 1.0 / x );
}
long end = System.nanoTime();
out.println( "HashTable time: "+ (end-start)/1000000 );
out.println( "hashtable(100.0) = " +
hashtable.get( 100.0 ) );
}
}
</code>
--
NO REAL THAN YOU ARE -- Ego Leonard, The Zandvoort Man