Re: Hash table performance
Jon Harrop wrote:
Martin Gregorie wrote:
Did you try setting the initial capacity to 1000?
I just tried setting it to the final size of the hash table (10M) and the
time actually got slightly worse.
I've had terrible performance from C programs that malloced for every one
of a huge number of little items it wanted to put on the heap. I'd hope
any JVM was better than that but you never know.... Besides, its quite
possible NET allocates bigger chunks since Windows memory allocation was/
is quite slow.
I believe the JVM is doing orders of magnitude more allocations than .NET
here due to its type erasure approach to generics.
I don't think type erasure or generics have anything to do with it.
On a non-rehashing call, the JVM would do three allocations, one of
which, for the Entry, would have been required regardless of the types.
The other two are due the lack of primitive-based implementations of the
java.util collections, causing two autobox conversions from double to
Double.
The .NET implementation may have a different way of storing the hash
chains that avoids allocating an object for each put call.
In addition, there is the issue of the number of rehashes. Each Java
rehash call only involves one object allocation, but a lot of memory
access and computation.
That is, the issues I see as being candidates for the problem are the
non-support of primitive collections, and other issues in the design of
the F# and Java collections.
Patricia