Re: Patricia trie vs binary search.
 
On 5/27/12 6:44 PM, Gene Wirchenko wrote:
> On Sat, 26 May 2012 17:30:17 -0700, Daniel Pitts
> <newsgroup.nospam@virtualinfinity.net>  wrote:
> [snip]
>> I tend to use a Deterministic Finite State Automaton for this.  You can
>> load the entire English dictionary fairly easily with that scheme. Yes,
>        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> you use a bit of memory, but unless you're doing this on an embedded
>> device, it's probably not enough memory to be concerned about.
>      Including all affixes?
I suppose it depends on the particular dictionary, but we're only 
talking about a few hundred thousand entries, at least with the Moby 
word-lists as a base:
cat compound-words.txt often-mispelled.txt english-most-frequent.txt 
male-names.txt female-names.txt common-names.txt common-dictionary.txt 
official-scrabble-* | sort | uniq | wc -lc
   388997 4801599
That's just over 4MB of strings (roughly 12 bytes per word, newline 
included) if stored as naively as possible. Certainly no problem for a 
modern-day computer. It compresses considerably when converted to a 
DFSA, since common prefixes and suffixes end up sharing states, 
assuming a reasonably optimized implementation.
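
Roughly the kind of structure I have in mind, sketched in Java below: a 
plain, unminimized trie, so it shows the state-by-state lookup but none 
of the suffix sharing that a real DFSA/DAWG minimization buys you. The 
file arguments and the ISO-8859-1 charset are just assumptions about 
the Moby lists.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

/** Trie-based word set: each node is a DFA state, each char a transition. */
public class WordSet {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean terminal; // true if some word ends at this state
    }

    private final Node root = new Node();

    public void add(String word) {
        Node state = root;
        for (int i = 0; i < word.length(); i++) {
            char ch = word.charAt(i);
            Node child = state.next.get(ch);
            if (child == null) {          // create the transition on demand
                child = new Node();
                state.next.put(ch, child);
            }
            state = child;
        }
        state.terminal = true;
    }

    public boolean contains(String word) {
        Node state = root;
        for (int i = 0; i < word.length() && state != null; i++) {
            state = state.next.get(word.charAt(i));
        }
        return state != null && state.terminal;
    }

    public static void main(String[] args) throws IOException {
        WordSet words = new WordSet();
        for (String file : args) {
            // ISO-8859-1 never rejects a byte; the Moby lists aren't all ASCII.
            for (String line : Files.readAllLines(Paths.get(file),
                                                  StandardCharsets.ISO_8859_1)) {
                String w = line.trim();
                if (!w.isEmpty()) {
                    words.add(w.toLowerCase());
                }
            }
        }
        System.out.println(words.contains("automaton"));
    }
}

Run it with the word-list files as arguments; merging equivalent suffix 
states (the DAWG minimization step) is what gets the memory down from 
the naive 4MB figure.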
[snip]
> Sincerely,
> Gene Wirchenko