Re: hashCode

From: Daniel Pitts <newsgroup.nospam@virtualinfinity.net>
Newsgroups: comp.lang.java.programmer
Date: Wed, 29 Aug 2012 13:40:47 -0700
Message-ID: <k7v%r.11010$6q.8748@newsfe13.iad>

On 8/29/12 11:49 AM, Eric Sosman wrote:

On 8/29/2012 2:06 PM, Daniel Pitts wrote:

On 8/28/12 5:02 PM, markspace wrote:

On 8/28/2012 4:33 PM, Daniel Pitts wrote:

interface Hasher<Type> {
    int hash(Type t);
}

Not really seeing how this is a good idea. How would you implement
this?

So, that would change HashMap to take a Hasher<? super K> instance in
its constructor.

This is the problem; Map (and HashMap) were designed to be spec'd as
taking Object, not a subclass.

Actually, they are Generic, so they are not spec'd to take Object, but
to take a specific subtype defined at compile time. At least, now that
they have the addition of Generics. Pre-generics, they still had
Comparators which had the same behavior that I'm describing, but instead
of defining buckets, they define an ordering. See below.
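
To make the Comparator analogy concrete, here is a minimal sketch using the existing TreeMap API, where the ordering (and therefore key equality) comes from outside the key class. The class name ComparatorDemo and the sample keys are made up for illustration.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class ComparatorDemo {
    // Builds a TreeMap whose key ordering is defined by an external
    // Comparator rather than by String.compareTo().
    static Map<String, Integer> caseInsensitiveMap() {
        Comparator<String> ignoringCase = String.CASE_INSENSITIVE_ORDER;
        return new TreeMap<String, Integer>(ignoringCase);
    }

    public static void main(String[] args) {
        Map<String, Integer> map = caseInsensitiveMap();
        map.put("Key", 1);
        map.put("KEY", 2); // the Comparator treats this as the same key
        System.out.println(map.size()); // prints 1
    }
}
```

The map never consults String.equals() here; the Comparator alone decides which keys coincide. A Hasher would play the same role for bucket-based maps.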

Actually, "take" is insufficiently specific. A Map<K,V>
has a put() method with K,V parameters, and a putAll() method
with a Map<? extends K, ? extends V> parameter. To that extent,
Map<K,V> "takes" K.

But Map<K,V> *also* has get() and remove() and containsKey()
methods with Object parameters, not K parameters. (It also has a
containsValue() method taking Object, not V.) So insofar as
these methods are concerned, Map<K,V> "takes" Object.
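
A small sketch of the point about Object parameters, using only the standard java.util API. The class name GetTakesObjectDemo is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class GetTakesObjectDemo {
    // Looks up a Map<String, Integer> with a key of the wrong type.
    static Integer lookupWithWrongType() {
        Map<String, Integer> map = new HashMap<String, Integer>();
        map.put("42", 1);
        Object notAString = Integer.valueOf(42);
        // Compiles because get() is declared as get(Object), not get(K).
        // The lookup simply finds nothing.
        return map.get(notAString);
    }

    public static void main(String[] args) {
        System.out.println(lookupWithWrongType()); // prints null
    }
}
```

If get() took K instead, the call above would be rejected at compile time.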

Those are very good points. I had forgotten that get() takes Object.
The solution that comes to mind is to change get(), remove(), etc. to
take the appropriate type (K or V). Yes, this removes some
"functionality" from the class, but I doubt 99% of programmers would
care more than 1% of the time. I have used this "feature" myself, but
only because an existing legacy project had not been typed properly.

Keep in mind, I'm talking about what Map *could* have been, not what it
could become.

A default Hasher<Object> could be implemented to use
System.identityHashCode and == for the common use-case.

Again not seeing how you'd actually use that to put an object in a Map.

Example usage:
++++
// MyKeyHasher implements Hasher<MyKey>.
// Assumes the proposed HashMap(Hasher) constructor, which does not exist today.
Map<MyKey, MyValue> map = new HashMap<MyKey, MyValue>(new MyKeyHasher());

map.put(myFirstKey, myFirstValue);
map.put(mySecondKey, mySecondValue);
++++
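
Since java.util.HashMap has no such constructor, the idea can only be sketched with a stand-in class. Below is one possible shape, not a definitive design: HasherMap, Wrapped, and IdentityHasher are hypothetical names, and the areEqual() method is an assumption (the interface quoted in the thread shows only hash(), but the map also needs some way to compare keys).

```java
import java.util.HashMap;
import java.util.Map;

// Assumed shape: the thread shows only hash(); areEqual() is an
// assumption added so the map can also compare keys.
interface Hasher<T> {
    int hash(T t);
    boolean areEqual(T a, T b);
}

// The default mentioned in the thread: identity hash code plus ==.
class IdentityHasher implements Hasher<Object> {
    @Override public int hash(Object o) { return System.identityHashCode(o); }
    @Override public boolean areEqual(Object a, Object b) { return a == b; }
}

// Hypothetical map that asks a Hasher, not the key, for hash and equality.
class HasherMap<K, V> {
    // Wraps each key so the backing HashMap ends up calling the Hasher.
    private static final class Wrapped<K> {
        private final K key;
        private final Hasher<? super K> hasher;

        Wrapped(K key, Hasher<? super K> hasher) {
            this.key = key;
            this.hasher = hasher;
        }

        @Override public int hashCode() { return hasher.hash(key); }

        @Override public boolean equals(Object other) {
            if (!(other instanceof Wrapped)) {
                return false;
            }
            @SuppressWarnings("unchecked")
            Wrapped<K> that = (Wrapped<K>) other;
            return hasher.areEqual(key, that.key);
        }
    }

    private final Hasher<? super K> hasher;
    private final Map<Wrapped<K>, V> entries = new HashMap<Wrapped<K>, V>();

    HasherMap(Hasher<? super K> hasher) { this.hasher = hasher; }

    V put(K key, V value) { return entries.put(new Wrapped<K>(key, hasher), value); }
    V get(K key) { return entries.get(new Wrapped<K>(key, hasher)); }
    int size() { return entries.size(); }
}
```

With new HasherMap<String, Integer>(new IdentityHasher()), two distinct String instances holding the same characters are separate keys, because the Hasher compares them with == and never calls String.equals().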

     This could sort of work, but not very well. As I wrote some
<hunt, hunt, hunt, ah!> seventeen days ago in this thread:

     I don't think a HashCalculator interface along
     the lines of Comparable would save the day.

The difficulty is that an external Hasher would have no access to
private fields of MyKey. That may seem a small drawback, since it
is rare to have a contributor to an object's "value" that is not
at the very least accessible through a getter. The Hasher might
need to make method calls where today's built-in hashCode() just
makes field references, but -- hey, how bad could that be?

     IMHO, it could be pretty bad. Take java.lang.String, for
example: As things stand today hashCode() inspects the "value"
of the String, and everything it uses would be accessible to a
Hasher<String>. But hashCode() then caches the computed value
in a private field within String to avoid recomputing it on every
subsequent call! Could Hasher<String> do the same? How?

Nothing would prevent String from having a hashCode method which behaves
exactly as it does today. The StringHasher class would simply delegate
to the String.hashCode() method, which allows String to cache as expected.
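
A minimal sketch of that delegation, assuming the Hasher interface as quoted at the top of the thread (only hash() is needed here). StringHasher is a hypothetical class, not part of any JDK.

```java
// Interface as quoted in the thread.
interface Hasher<T> {
    int hash(T t);
}

// Hypothetical StringHasher: delegates to String.hashCode(), so the
// caching inside String keeps working as it does today.
class StringHasher implements Hasher<String> {
    @Override
    public int hash(String s) {
        return s.hashCode();
    }
}
```

The Hasher itself holds no state; any caching remains String's own business.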

Okay, so the implementor of String perceives the problem and
decides that String itself will provide a default Hasher<String>
implementation. (This might be a static nested class, but it'd
probably be more efficient to have String implement Hasher<String>
directly, so every String is its own Hasher.) And the implementor
of BigInteger does the same, and so does the implementor of URL,
and of File, and of -- Hey, wait a minute! We're right back where
we began, except with more overhead and more verbiage!

Oh no, dreaded letters in my source code. I must reduce it down to as
little as possible, even if it means having one class be responsible for
15 things.

Java is verbose, but verbosity shouldn't determine what is good design
and what isn't. Also, as I've suggested elsethread, having a Hashable
interface (similar to Comparable) would allow objects to define a
sensible default comparison/hashing algorithm *specific to that class*.

A class with subclasses has no single sensible default algorithm (unless
it takes the actual runtime type into account before comparing). This is
the use-case where Hasher makes the most sense. The user of the objects
can specify which aspects matter for the equality of two objects, even
if those objects are of different specific types.
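
As a sketch of that use-case: the Point/ColoredPoint hierarchy and PositionHasher below are hypothetical, and areEqual() is an assumption (the interface quoted in the thread shows only hash()).

```java
// Assumed shape: hash() is from the thread; areEqual() is an assumption.
interface Hasher<T> {
    int hash(T t);
    boolean areEqual(T a, T b);
}

// Hypothetical class hierarchy, for illustration only.
class Point {
    final int x;
    final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

class ColoredPoint extends Point {
    final String color;
    ColoredPoint(int x, int y, String color) { super(x, y); this.color = color; }
}

// A user-supplied policy: only the position matters, whatever the subtype.
class PositionHasher implements Hasher<Point> {
    @Override public int hash(Point p) { return 31 * p.x + p.y; }

    @Override public boolean areEqual(Point a, Point b) {
        return a.x == b.x && a.y == b.y;
    }
}
```

Under this policy a Point and a ColoredPoint at the same coordinates count as the same key. A different user could supply a Hasher that also looks at color, without touching either class.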