Re: Can I compare references (in a sense of compareTo method)?
Roedy Green wrote:
On Fri, 21 Sep 2007 07:26:42 -0000, chucky <tomas.mikula@gmail.com>
wrote, quoted or indirectly quoted someone who said :
I don't care about the actual ordering, but I wanted to use Tree*
since I thought it would have smaller memory overhead than Hash*
I think you need to do an experiment to verify that. I
suspect the opposite since you would need a node per element in both,
with the TreeSet having in addition the tree structure, where the
HashSet has a simple array of pointers.
It seems obvious to me that a HashSet should be FASTER than a TreeSet.
You go right to the element with a single division or mask, rather
than chasing down a multi-level index structure.
People are concerned with the corner case, wherein the hash distributes the
(presumably random) input so imperfectly that linear search time at the
appropriate hash bucket exceeds the log(n) performance of the tree search.
The argument seems to be that TreeFoo's guarantee of O(log(n)) search
complexity pays off since HashFoo would degrade to O(n) in that case.
Either the input is sufficiently random that the odds of this happening are
acceptable, given a good general-purpose hash, or there are expectations that
would cause a general hash to distribute values poorly, in which case one
should override hashCode() with a better choice of algorithm for the expected
input shape.
TreeFoo suffers insertion performance degradation compared to HashFoo, too,
doesn't it? So not only would the effort of providing and maintaining a
Comparator have to exceed that of writing a good hashCode(), and the odds that
the input would have the shape to trigger the scenario exceed the
worthwhileness threshold, but the Foo would have to be frequently read, rarely
written, to make the Tree version perform better than the Hash version.
Given the fragility of semantically superfluous code introduced for putative
performance gains absent hard evidence of such gain, the simplicity of an
alternative solution that is likely to handle the issue, and the so-far high
uncertainty that the nature of the data will even remotely justify the
attention, I recommend avoiding TreeFoo for this particular matter.
--
Lew