Re: Hash Code
On Tue, 20 May 2008, j1mb0jay wrote:
Would it make sense to only use the secure hash when the table size
exceeds the max value for an integer (2^32) and then at that point
rather than using an array use a specially implemented doubly linked
list.
LOL. If you're dealing with >2^32 items, a linked list would be a really,
really, *really* bad way to do it.
And a doubly linked list would be truly pessimal.
Although after a point you would have to start using B-Trees as it would
not all fit in memory at anyone time.
Would this allow fast search access for huge amounts of data, or would
it be slow and nasty because of the expensiveness of Hash
With that many elements, and the possible need to do IO to fetch things,
the time taken by the hash becomes insignificant. So you're alright there.
and the greater time complexity required access time for each element in
the list (compared to the array).
Yes, that pretty much ruins you. Finding a specific index in the linked
list would involve *billions* of cycles, and reading every location in
memory!
Unless it was very, very specially implemented.
I currently have a the hash table implemented and will add the logging
as requested. Although i do already know that after inserting 6 million
unique words into a hash table with a size of 10 million the longest
link list contained 13 items. Which means that the time complexity for a
search was O(13) from 6 million items.
Which is not so bad!
I understand that for the hash tables with a size of less that 2^32 that
the normal 32bit hash value is more than enough, i was just wondering
for hash tables that are larger.
When you have a hash table with >2^32 objects, choosing a hash function is
probably not your major problem.
If you do need a 64-bit hash, it's trivial to implement Bernstein's hash
with long instead of int, and it'll work just as well. If you need a
hashtable that holds >2^64 items, well, get back to us when you do.
tom
--
see im down wid yo sci fi crew