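As I mentioned in my earlier comment, you can use a Map (HashMap) to store the matched words and how often each one occurs.
I suggest splitting the program's functionality into smaller methods/classes so that each one performs only one small task. That makes the code easier to read.
I assume your file contains the string "auto bush trumped her tomato in the petunia auto".
Here is the code: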
package how_to_calculate_the_frequency;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Project {
    // maps each matched word to its number of occurrences
    HashMap<String, Integer> map = new HashMap<>();

    public static void main(String[] args) {
        Project project = new Project();
        Scanner inputText = project.readFile();
        project.analyse(inputText);
        project.showResults();
    }

    /**
     * Counts the occurrences of words matched by the regex in a scanner that
     * has loaded some text.
     *
     * @param scanner
     *            the scanner holding the text
     */
    public void analyse(Scanner scanner) {
        String pattern = "[a-zA-Z'-]+";
        Pattern r = Pattern.compile(pattern);
        while (scanner.hasNext()) {
            // read the next token
            String candidate = scanner.next();
            // find() looks for the first part of the token that matches the pattern
            Matcher matcher = r.matcher(candidate);
            if (matcher.find()) {
                String matchedWord = matcher.group();
                // System.out.println(matchedWord); // check what is matched
                this.addWord(matchedWord);
            }
        }
        scanner.close(); // close the scanner once all tokens have been read
    }

    /**
     * Adds a word to the <word,count> Map. If the word is new, a new entry is
     * created; otherwise the count of this word is incremented.
     */
    public void addWord(String matchedWord) {
        if (map.containsKey(matchedWord)) {
            // increment occurrence
            int occurrence = map.get(matchedWord);
            occurrence++;
            map.put(matchedWord, occurrence);
        } else {
            // add word and set occurrence to 1
            map.put(matchedWord, 1);
        }
    }

    /**
     * Reads a file from disk and returns a scanner to analyse it.
     *
     * @return the file from disk as scanner
     */
    public Scanner readFile() {
        Scanner scanner = null;
        /*
         * Use this for reading a file from disk:
         *
         * try {
         *     scanner = new Scanner(new File("moviereview.txt")).useDelimiter(" ");
         * } catch (Exception e) {
         *     e.printStackTrace();
         * }
         */
        scanner = new Scanner("auto bush trumped her tomato in the petunia auto");
        return scanner;
    }

    /**
     * Prints the matched words and their occurrences in a readable way.
     */
    public void showResults() {
        for (HashMap.Entry<String, Integer> matchedWord : map.entrySet()) {
            int occurrence = matchedWord.getValue();
            System.out.print("\"" + matchedWord.getKey() + "\" appears " + occurrence);
            if (occurrence > 1) {
                System.out.print(" times\n");
            } else {
                System.out.print(" time\n");
            }
        }
        // or as a Java 8 lambda expression:
        // map.forEach((word, occurrence) ->
        //         System.out.println("\"" + word + "\" appears " + occurrence + " times"));
    }
}
// DONE separate reading the file, analysing the file and the
// word-frequency-counting logic into different methods
// DONE implement the <word,count> Map and the logic to add new and known (to
// the map) words
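This produces:
"the" appears 1 time
"auto" appears 2 times
"her" appears 1 time
"in" appears 1 time
"bush" appears 1 time
"trumped" appears 1 time
"tomato" appears 1 time
"petunia" appears 1 time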
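
If you are on Java 8 or later, the counting step can be written more compactly. The following is only a minimal sketch, not part of the program above: the class name `WordFrequencySketch` and the method `count` are made up for illustration. It uses `Map.merge` instead of the `containsKey`/`put` branches, and it runs `find()` in a loop over the whole text, so a token that contains several words separated by punctuation would contribute all of them rather than only the first. A `TreeMap` is used here purely to get alphabetical output.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
public class WordFrequencySketch {
    // Same character class as in the code above: letters, apostrophes, hyphens.
    private static final Pattern WORD = Pattern.compile("[a-zA-Z'-]+");

    /** Counts every match of WORD in the given text. */
    public static Map<String, Integer> count(String text) {
        // TreeMap keeps the keys sorted, so the printed order is predictable.
        Map<String, Integer> counts = new TreeMap<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the old count.
            counts.merge(matcher.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("auto bush trumped her tomato in the petunia auto"));
        // prints {auto=2, bush=1, her=1, in=1, petunia=1, the=1, tomato=1, trumped=1}
    }
}
```

Whether to keep the explicit `if`/`else` in `addWord` or switch to `merge` is mostly a matter of taste; both give the same counts for the sample string.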
Regards
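Can you be more specific? What is happening now? We are not here to run your code for you, and we don't have your text file.
I can't help you. When you can't even format (indent) the code properly to show its structure, I refuse to look at it. – Andreas
Welcome to StackOverflow. You are most likely to get useful answers if you follow the guidelines in the Help Center. For example, this one: "Questions seeking debugging help ("why isn't this code working?") must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers."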