Re: Slightly off-topic: Determining the strength of "Hangman" word.
On 5/31/12 10:43 AM, Roedy Green wrote:
On Thu, 31 May 2012 09:26:11 -0700, Daniel Pitts
<newsgroup.nospam@virtualinfinity.net> wrote, quoted or indirectly
quoted someone who said :
Oh, and to tie this into a previous thread, the whole thing fits in
memory with room to spare ;-)
Here are three ideas:
What you need to do is collect a giant hunk of random text from
various websites and compute a word frequency for each word in your
bag. The lower the frequency, the harder it is to guess.
Perhaps, though low usage-frequency doesn't mean it isn't easy to guess.
For example "exterminate" isn't a very common word, but a person would
probably get to "e?ter?in?te" with very few problems, and from there,
exterminate is the only possibility. "mixt" is probably a more difficult
word to guess. The progression is likely to be "?i??", (guess s,e,l,n,t)
"?i?t", (guess r,f). At this point, the game would be over. There would
be two possibilities left, "mixt" and "dipt". People don't seem to
think of "xt", so they might be more likely to guess "dipt".
This is, of course, assuming they guess in the "highest letter frequency"
order.
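
To make the "highest letter frequency" assumption concrete, here is a rough sketch of a greedy player that always guesses the letter appearing in the most remaining candidate words, then narrows the candidates by the revealed pattern. The names pattern and wrong_guesses are made up for illustration, the word list is whatever bag you supply, and the alphabetical tie-breaking is an arbitrary choice; real people will not play this consistently.

```python
from collections import Counter


def pattern(word, guessed):
    """Show the word as the player sees it, with '?' for unguessed letters."""
    return "".join(c if c in guessed else "?" for c in word)


def wrong_guesses(target, words):
    """Count the misses a greedy letter-frequency player makes on target."""
    # Candidates are same-length words; the target is always included.
    candidates = sorted({w for w in words if len(w) == len(target)} | {target})
    guessed = set()
    misses = 0
    while pattern(target, guessed) != target:
        # For each unguessed letter, count how many candidates contain it.
        counts = Counter(
            c for w in candidates for c in set(w) if c not in guessed
        )
        # Pick the most common letter; ties are broken alphabetically.
        letter = min(counts, key=lambda c: (-counts[c], c))
        guessed.add(letter)
        if letter not in target:
            misses += 1
        # Keep only candidates consistent with what is now revealed.
        shown = pattern(target, guessed)
        candidates = [w for w in candidates if pattern(w, guessed) == shown]
    return misses
```

A word could then be scored by comparing wrong_guesses(word, bag) against the number of misses the game allows. That only measures difficulty for this one strategy, not for a human player.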
Another measure would be to compare your word with every other word in
the bag. The more letters it has in common in the corresponding slot,
the harder it would be to guess.
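
One possible reading of that comparison, as a sketch: for each other word of the same length, count the positions where the letters match, and add the counts up. The function names slot_overlap and confusability are invented here, and ignoring words of a different length is an assumption, since the player can see the word length.

```python
def slot_overlap(a, b):
    """Number of positions where two same-length words share a letter."""
    if len(a) != len(b):
        return 0  # assumption: different lengths are never confused
    return sum(x == y for x, y in zip(a, b))


def confusability(word, words):
    """Sum of slot overlaps between word and every other word in the bag."""
    return sum(slot_overlap(word, w) for w in words if w != word)
```

With the bag ["mixt", "dipt", "mist", "exterminate"], "mixt" shares two slots with "dipt" and three with "mist", for a total of 5.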
Look for rare letter pairs. These are EASY to guess.
The problem is that some words are false positives. They seem ambiguous
for the most part, but one letter absolutely gives it away.
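
For the letter-pair idea, a minimal sketch of how the rare pairs might be found: tally every adjacent pair across the bag, then report the least common pair in a given word. bigram_counts and rarest_pair are hypothetical names, and whether a rare pair makes a word easier or harder is exactly the open question here; the code only locates the pair.

```python
from collections import Counter


def bigram_counts(words):
    """Tally every adjacent letter pair across all words in the bag."""
    counts = Counter()
    for w in words:
        counts.update(w[i:i + 2] for i in range(len(w) - 1))
    return counts


def rarest_pair(word, counts):
    """Least common adjacent pair in word (needs at least two letters)."""
    pairs = [word[i:i + 2] for i in range(len(word) - 1)]
    # Ties are broken alphabetically so the result is deterministic.
    return min(pairs, key=lambda p: (counts[p], p))
```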