Re: How to check variables for uniqueness ?
Lew wrote:
(Please do not embed TAB characters in newsgroup postings.)
You could use a HashMap if you wanted to know how many times each word occurred:
snip
- Lew
Indeed.
And in case anyone's interested, here are the times for HashMap. Looks
like Map is in the same league as Set, and not the slow-moving List.
(These times are longer than the previous ones because of the current
CPU load; only the relative times matter.)
522393 duplicated words. Using java.util.HashSet, time = 789ms.
522393 duplicated words. Using java.util.TreeSet, time = 2168ms.
522393 duplicated words. Using Map , time = 1180ms.
522393 duplicated words. Using java.util.ArrayList, time = 183795ms.
522393 duplicated words. Using java.util.LinkedList, time = 274781ms.
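For anyone wondering why the two Lists trail so badly: contains() on a List has to walk the elements one by one, whereas a HashSet jumps to a bucket by hash code. Here is a rough sketch that makes this visible by counting equals() calls for a single failed lookup. ContainsCost and Probe are names made up for this illustration and are not part of the test program.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

public class ContainsCost {

    /** Hypothetical key type that counts how often equals() is called. */
    static final class Probe {
        static long equalsCalls = 0;
        final int id;

        Probe(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Probe && ((Probe) o).id == id;
        }

        @Override
        public int hashCode() { return id; }
    }

    /** Adds n distinct probes, then counts equals() calls for one lookup that misses. */
    static long equalsCallsForMiss(Collection<Probe> c, int n) {
        for (int i = 0; i < n; i++) {
            c.add(new Probe(i));
        }
        Probe.equalsCalls = 0;          // ignore any calls made while filling
        c.contains(new Probe(-1));      // -1 was never added
        return Probe.equalsCalls;
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println("ArrayList: "
            + equalsCallsForMiss(new ArrayList<Probe>(), n) + " equals() calls");
        System.out.println("HashSet:   "
            + equalsCallsForMiss(new HashSet<Probe>(), n) + " equals() calls");
    }
}
```

The ArrayList should report one equals() call per stored element, while the HashSet needs at most a handful. Do that once per word of a long novel and the gap in the timings is no surprise.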
Apologies to Patricia: I see I mis-attributed her post, yet again. And
Lew, I've now become fast friends with Linux's expand command. Let's
see whether I purged those nasty TABs:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;

class Test {

    private static final String TEXT_BOOK_NAME = "war-and-peace.txt";

    public static void main(String[] args) {
        try {
            String text = readText(); // Read text into RAM
            countDuplicateWords(text, new HashSet<String>());
            countDuplicateWords(text, new TreeSet<String>());
            countDuplicateWordsMap(text);
            countDuplicateWords(text, new ArrayList<String>());
            countDuplicateWords(text, new LinkedList<String>());
        } catch (IOException e) {
            System.out.println(e.toString());
        }
    }

    // Reads the whole book into one string, with a space between lines.
    private static String readText() throws IOException {
        BufferedReader reader =
            new BufferedReader(new FileReader(TEXT_BOOK_NAME));
        try {
            String line = null;
            StringBuilder text = new StringBuilder();
            while ((line = reader.readLine()) != null) {
                text.append(line).append(' ');
            }
            return text.toString();
        } finally {
            reader.close(); // Release the file even if a read fails
        }
    }

    // Counts every word that the given collection has already seen,
    // and times how long the collection takes to do it.
    private static void countDuplicateWords(String text,
                                            Collection<String> listOfWords) {
        int numDuplicatedWords = 0;
        long startTime = System.currentTimeMillis();
        for (StringTokenizer i = new StringTokenizer(text);
             i.hasMoreTokens();) {
            String word = i.nextToken();
            if (listOfWords.contains(word)) {
                numDuplicatedWords++;
            } else {
                listOfWords.add(word);
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println(numDuplicatedWords + " duplicated words. " +
                           "Using " + listOfWords.getClass().getName() +
                           ", time = " + (endTime - startTime) + "ms.");
    }

    // Same count, but via a HashMap from each word to its frequency.
    private static void countDuplicateWordsMap(String text) {
        int numDuplicatedWords = 0;
        Map<String, Integer> wordsToFrequency =
            new HashMap<String, Integer>();
        long startTime = System.currentTimeMillis();
        for (StringTokenizer i = new StringTokenizer(text);
             i.hasMoreTokens();) {
            String word = i.nextToken();
            Integer frequency = wordsToFrequency.get(word);
            if (frequency == null) {
                // First sighting: the word has now occurred once
                wordsToFrequency.put(word, Integer.valueOf(1));
            } else {
                int value = frequency.intValue();
                wordsToFrequency.put(word, Integer.valueOf(value + 1));
                numDuplicatedWords++;
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println(numDuplicatedWords + " duplicated words. " +
                           "Using Map " +
                           ", time = " + (endTime - startTime) + "ms.");
    }
}
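
If Java 8 or later is available (an assumption on my part), the map variant can probably be shortened with Map.merge(). This is only a sketch: WordFrequency and its two method names are made up for illustration, and it is not timed against the figures above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

public class WordFrequency {

    /** Maps each whitespace-separated word to the number of times it occurs. */
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> wordsToFrequency = new HashMap<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the old count
            wordsToFrequency.merge(tokens.nextToken(), 1, Integer::sum);
        }
        return wordsToFrequency;
    }

    /** Every occurrence after a word's first one counts as a duplicate. */
    static int countDuplicates(Map<String, Integer> wordsToFrequency) {
        int duplicates = 0;
        for (int frequency : wordsToFrequency.values()) {
            duplicates += frequency - 1;
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            countWords("the cat and the hat and the bat");
        // "the" occurs 3 times; the duplicates are 2 x "the" + 1 x "and"
        System.out.println(counts.get("the") + " " + countDuplicates(counts)); // prints "3 3"
    }
}
```

Deriving the duplicate count from the finished map keeps the loop down to a single merge() call per word, at the cost of one extra pass over the distinct words.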
..ed
--
www.EdmundKirwan.com - Home of The Fractal Class Composition