Re: hashCode() for Custom classes
Patricia Shanahan <pats@acm.org> writes:
> More generally, it seems unlikely to me that hash codes are
> uniformly distributed over all the int values.
There are (at least) two points of view:
The conceptual point of view merely defines the requirements
for a hash function. It does not refer to a specific programming
language or to specific classes, nor does it make special
assertions about any specific hash value.
Another point of view can study how a hash function is
implemented by the operation
http://download.java.net/jdk7/docs/api/java/lang/Object.html#hashCode()
in java.lang.Object and in classes extending java.lang.Object
of the Java SE standard library.
The second point of view might find values close to zero
to be overrepresented when the frequency data are obtained
from a certain set of programs.
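As a rough illustration of how such frequency data could be collected,
here is a minimal sketch. The class name HashCodeSample, the method
countNearZero and the sample objects are made up for this example; the
only library behavior relied on is the documented hashCode() of
Integer, String, Boolean and Character.

```java
import java.util.Arrays;
import java.util.List;

final class HashCodeSample {

    // Counts the objects whose hash code has an absolute value below limit.
    // (Hypothetical helper, not part of any library.)
    static long countNearZero(List<?> objects, long limit) {
        long count = 0;
        for (Object o : objects) {
            long h = o.hashCode();            // widen to long so Math.abs cannot overflow
            if (Math.abs(h) < limit) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A small, assumed sample of "everyday" objects.
        List<Object> sample =
            Arrays.<Object>asList(0, 1, 42, 1000, "a", "ab", Boolean.TRUE, 'x');
        long near = countNearZero(sample, 1L << 16);
        System.out.println(near + " of " + sample.size()
            + " hash codes lie within 2^16 of zero");
        // prints "8 of 8 hash codes lie within 2^16 of zero"
    }
}
```

If hash codes were uniformly distributed over all int values, only
about 1 in 32768 of them would fall into that range. A sample this
small proves nothing about programs in general, of course.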
A similar phenomenon in the real world is
http://en.wikipedia.org/wiki/Benford's_law
Usually, one will prefer a hash function with an even
distribution of hash values for a typical distribution
of data values or a typical set of objects.
There should not be a bias regarding special values.
So if one already has a good distribution, suppressing
certain values that are deemed "magical values" will
make the distribution worse, just as a good random number
generator would be made worse if someone added code to
suppress 0 as a result because he does not deem 0 "to
be random".
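The effect of suppressing a value can be sketched with a toy hash
function. Everything here (the class SuppressZero, its methods, the
choice of 8 buckets and 800 keys) is an assumption made for the
example, not code from any library.

```java
import java.util.Arrays;

final class SuppressZero {

    // A toy hash that spreads consecutive keys evenly over 0..buckets-1.
    static int plainHash(int key, int buckets) {
        return Math.floorMod(key, buckets);
    }

    // The same hash, but the "magical" result 0 is replaced by 1.
    static int suppressedHash(int key, int buckets) {
        int h = plainHash(key, buckets);
        return h == 0 ? 1 : h;
    }

    // Counts how often each hash value occurs for the keys 0..keys-1.
    static int[] histogram(boolean suppress, int keys, int buckets) {
        int[] counts = new int[buckets];
        for (int key = 0; key < keys; key++) {
            int h = suppress ? suppressedHash(key, buckets)
                             : plainHash(key, buckets);
            counts[h]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(histogram(false, 800, 8)));
        // prints [100, 100, 100, 100, 100, 100, 100, 100]
        System.out.println(Arrays.toString(histogram(true, 800, 8)));
        // prints [0, 200, 100, 100, 100, 100, 100, 100]
    }
}
```

The suppressed variant leaves one value unused and makes another one
twice as frequent, so an even distribution has become an uneven one.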
Funnily enough, some people consider 17 to be the most
random number.
http://consc.net/notes/pick-a-number.html
http://google.to/search?q=cache:web.media.mit.edu/~guy/blog/entry.php%3F24110401
Of course, "randomness" is not a property of a single number,
but of a distribution, and there are no "good hash values"
or "bad hash values" - only "good hash functions" and
"bad hash functions".