Re: Patricia trie vs binary search.
Gene Wirchenko wrote:
Daniel Pitts wrote:
[snip]
BTW, if I check the memory usage before loading words and after, the
difference is ~ 42MB
So, loading 481k words takes up about 42 MB. This is in Java, which has a
fairly high overhead per String. And the implementation of my data
structure is fairly naive as well.
We've been talking apples and oranges. You refer here to the memory
footprint if the entire word list were in memory as naive Strings. I've been
talking about the footprint in storage (e.g., the SD card of a smartphone).
So, Gene, you have been misinterpreting my point.
There is little reason to hold the dictionary in memory at all times if you
are in a memory-constrained environment. If one did need to hold it in
memory and were worried about the footprint, one would not hold it all at
once in a memory-intensive, naive structure.
Compression exceeding 90% is achievable with text. Assuming only 75%
compression, a conservative estimate, the 42 MB would shrink to a quarter of
its size, about 10 MB, just as I said. That's if you keep it in memory in
compressed form.
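A rough illustration of in-memory compression with the JDK's own java.util.zip (the word list here is synthetic, so the exact ratio will differ from the real 481k-word dictionary):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class CompressDemo {
    /** Gzip-compresses the given bytes entirely in memory. */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Synthetic word list; a real dictionary compresses well too,
        // since sorted entries share long common prefixes.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append("dictionaryword").append(i).append('\n');
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);

        System.out.printf("raw: %d bytes, compressed: %d bytes (%.0f%% saved)%n",
                raw.length, packed.length,
                100.0 * (raw.length - packed.length) / raw.length);
    }
}
```

You would decompress a block only when you actually need to search it.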
A disk-based solution would likely occupy only a few KB of RAM at a time.
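One way to sketch such a disk-based lookup: store the sorted word list as fixed-width records and binary-search it with RandomAccessFile, so only one record is ever resident in RAM. The file layout and record width here are illustrative assumptions, not anything from the thread:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

public class DiskDictionary {
    private final RandomAccessFile file;
    private final int recordWidth; // each word space-padded, newline-terminated

    DiskDictionary(String path, int recordWidth) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.recordWidth = recordWidth;
    }

    /** Binary search over the sorted, fixed-width records on disk;
     *  at most one record is held in memory at any moment. */
    boolean contains(String word) throws IOException {
        long lo = 0, hi = file.length() / recordWidth - 1;
        byte[] rec = new byte[recordWidth];
        while (lo <= hi) {
            long mid = (lo + hi) >>> 1;
            file.seek(mid * recordWidth);
            file.readFully(rec);
            String entry = new String(rec, StandardCharsets.UTF_8).trim();
            int cmp = entry.compareTo(word);
            if (cmp == 0) return true;
            if (cmp < 0) lo = mid + 1; else hi = mid - 1;
        }
        return false;
    }
}
```

With 481k entries, a lookup touches about 19 records (log2 of 481k), each a handful of bytes, which is where the "few KB at a time" figure comes from.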
Nice try to distort the issue into a scary monster, though.
--
Lew