Re: Pseudo-random numbers and hash code (Was: Using a lot of Maps)
markspace wrote:
@Override
public int hashCode() {
if( hash == 0 ) {
int x = 0;
for( Object o : hierarchy ) {
x = x * 53 + (o==null? 0 : o.hashCode());
}
hash = x;
}
return hash;
}
Lew wrote:
I learned '31' as the coefficient to use for 32-bit hashes, though,
back in my Knuth-reading days. Is '53' better?
markspace wrote:
When NetBeans auto-generates hashCode() methods, I've noticed it
randomizes the constants it uses, around a small range of values. I
don't think it matters much, but I'm just doing the same thing. Any
constant with some low order bits set as well as some bits set above bit
position 3 will do fine I think.
31, 33, 35, 51, 53, 55, etc. all work pretty much the same, and generate
slightly different hash codes, meaning the chance of collision is
reduced if you use different constants once in a while.
I'm too lazy to do the research right now, but I'm going to delve into it
later. Here's what I recall:
- There's a difference between coefficient choices for the pseudorandom
generator (PRG) wrt cycle length - how long the pseudorandom sequence goes
before repeating.
- 31 gives a good cycle length.
- I think I recall 53 also being among the small selection of pretty decent
coefficients long-cycle-wise for 32-bit PRGs.
- I don't recall the choice being as wide as your post conveys.
- Cycle length is not the only measure of PRGs for hash goodness.
- There are ways to reduce the likelihood of hash collision for a given set of
values regardless of the goodness of the pseudorandom sequence, and the Java
API uses them.
Thus I conclude that your advice is sound, though by preference I will use a
prime coefficient, and that will be 31 for 32-bit hashes at least until I've
done the research again.
--
Lew
I'm thinking of launching a MMOPRG.