Java spelling correction: how to stop a Java spell checker from correcting repeated words

Latest recommended article published 2022-03-05 09:32:15

zbzblr

Tags: java spelling correction

Article link: https://blog.csdn.net/weixin_36063465/article/details/114224274

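Summary (generated automatically by CSDN): This post describes a Java program that extracts the text of a web page and spell-checks it. When a word is misspelled, the program compares it against the dictionary.txt word list and offers suggested corrections. The problem is that when the same misspelled word appears more than once in the input, the program prints the message once per occurrence. The author wants to change the program so that each misspelled word is reported only once, or so that the number of occurrences is reported.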

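I have implemented a program that does the following:

Scans all the words on a web page into a string (using jsoup)

Filters out all HTML tags and code

Feeds these words into a spell checker, which offers suggestions

The spell checker loads a dictionary.txt file into an array and compares the input string against the words in the dictionary.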
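The load-and-compare step described above can be sketched as follows. This is a minimal illustration under assumptions, not the program's actual code: the class name `DictionaryLookup` and the methods `loadDictionary` and `isKnown` are hypothetical, the dictionary is held in a `HashSet` rather than an array, and a short in-memory list stands in for the contents of dictionary.txt.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DictionaryLookup {

    /** Collects every whitespace-separated word from the lines into a lower-cased set. */
    public static Set<String> loadDictionary(List<String> lines) {
        Set<String> words = new HashSet<>();
        for (String line : lines) {
            for (String w : line.trim().split("\\s+")) {
                if (!w.isEmpty()) {
                    words.add(w.toLowerCase());
                }
            }
        }
        return words;
    }

    /** A word counts as known when its lower-cased form is in the dictionary. */
    public static boolean isKnown(Set<String> dictionary, String word) {
        return dictionary.contains(word.toLowerCase());
    }

    public static void main(String[] args) {
        // Stand-in for the contents of dictionary.txt (an assumption). With a real
        // file, the lines could come from Files.readAllLines(Paths.get("dictionary.txt")).
        List<String> lines = Arrays.asList("the program", "is worst");
        Set<String> dictionary = loadDictionary(lines);
        System.out.println(isKnown(dictionary, "The")); // true
        System.out.println(isKnown(dictionary, "teh")); // false
    }
}
```

A set keeps each lookup at roughly constant cost, which matters when every word of a web page is checked against a large word list.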

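My current problem is that when the input contains the same word more than once, for example "teh program is teh worst", the code prints:

You entered 'teh', did you mean 'the'?

You entered 'teh', did you mean 'the'?

Sometimes a website repeats the same words over and over, and the output becomes cluttered.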

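Printing each word together with the number of times it was misspelled would be ideal, but limiting the output to one message per word would be enough.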
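One possible way to get this behaviour is to count the misspellings first and print afterwards. The sketch below is only an illustration under assumptions: `MisspellingCounter` and `countMisspellings` are hypothetical names, the dictionary is a plain `Set<String>`, and the message format is made up. A `LinkedHashMap` is used so that words are reported in the order they first appear.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class MisspellingCounter {

    /** Counts how often each word missing from the dictionary occurs, in first-seen order. */
    public static Map<String, Integer> countMisspellings(String[] words, Set<String> dictionary) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String raw : words) {
            String word = raw.toLowerCase();
            if (word.isEmpty() || dictionary.contains(word)) {
                continue; // skip empty tokens and correctly spelled words
            }
            counts.merge(word, 1, Integer::sum); // first sighting stores 1, later ones add 1
        }
        return counts;
    }

    public static void main(String[] args) {
        // Tiny stand-in dictionary (an assumption, not the real dictionary.txt).
        Set<String> dictionary = new HashSet<>(Arrays.asList("the", "program", "is", "worst"));
        String[] words = "teh program is teh worst".split("\\s+");
        for (Map.Entry<String, Integer> e : countMisspellings(words, dictionary).entrySet()) {
            // Prints: You entered 'teh' 2 time(s)
            System.out.println("You entered '" + e.getKey() + "' " + e.getValue() + " time(s)");
        }
    }
}
```

With this shape, the suggestion lookup (the `suggest.correct(...)` call in the full code) would run once per distinct misspelling inside the printing loop, instead of once per occurrence.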

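My program has several methods and two classes, but the spell-checking method is as follows:

Note: the original code contains some 'if' statements that strip punctuation, but I have removed them for clarity.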
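The omitted punctuation handling is not shown in the post, so the following is only a guess at an equivalent step, not the author's code. It assumes a hypothetical `Tokenizer.tokenize` helper that splits on whitespace and trims non-letter characters from both ends of each token with a regular expression instead of a chain of 'if' statements.

```java
import java.util.ArrayList;
import java.util.List;

public class Tokenizer {

    /** Splits text on whitespace and trims non-letter characters from both ends of each token. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.split("\\s+")) {
            // \P{L} matches any character that is not a letter.
            String cleaned = raw.replaceAll("^\\P{L}+|\\P{L}+$", "").toLowerCase();
            if (!cleaned.isEmpty()) {
                tokens.add(cleaned);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints: [teh, program, really, is, teh, worst]
        System.out.println(tokenize("Teh program, (really) is teh worst!"));
    }
}
```

Only the ends of a token are trimmed, so an apostrophe inside a word such as "don't" is kept, while tokens made up entirely of digits or punctuation are dropped.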

static boolean suggestWord;

public static String checkWord(String wordToCheck) {
    String wordCheck;
    String word = wordToCheck.toLowerCase();
    if ((wordCheck = (String) dictionary.get(word)) != null) {
        // No need to ask for a suggestion for a correctly spelled word.
        suggestWord = false;
        return wordCheck;
    }
    // If after all of these checks a word could not be corrected, return it
    // as a misspelled word.
    return word;
}

Temporary edit: as requested, the full code:

Class 1:

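import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Hashtable;
import java.util.Scanner;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Whitelist;

public class ParseCleanCheck {

    // To store all the words of the dictionary
    static Hashtable dictionary;

    // To indicate whether the word is spelled correctly or not.
    static boolean suggestWord;

    static Scanner urlInput = new Scanner(System.in);
    public static String cleanString;
    public static String url = "";
    public static boolean correct = true;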

    /**
     * PARSER METHOD
     */
    public static void PageScanner() throws IOException {
        System.out.println("Pick an english website to scan.");
        // This do-while loop allows the user to try again after a mistake
        do {
            try {
                System.out.println("Enter a URL, starting with http://");
                url = urlInput.nextLine();
                // This creates a document out of the HTML on the web page
                Document doc = Jsoup.connect(url).get();
                // This converts the document into a string to be cleaned
                String htmlToClean = doc.toString();
                cleanString = Jsoup.clean(htmlToClean, Whitelist.none());
                // A successful fetch clears the flag and ends the loop
                correct = false;
            } catch (Exception e) {
                System.out.println("Incorrect format for a URL. Please try again.");
            }
        } while (correct);
    }

    /**
     * SPELL CHECKER METHOD
     */
    public static void SpellChecker() throws IOException {
        dictionary = new Hashtable();
        System.out.println("Searching for spelling errors ... ");
        try {
            // Read and store the words of the dictionary
            BufferedReader dictReader = new BufferedReader(new FileReader("dictionary.txt"));
            while (dictReader.ready()) {
                String dictInput = dictReader.readLine();
                // create an array of dictionary words
                String[] dict = dictInput.split("\\s");
                for (int i = 0; i < dict.length; i++) {
                    // key and value are identical
                    dictionary.put(dict[i], dict[i]);
                }
            }
            dictReader.close();

            String user_text = "";
            // Initializing a spelling suggestion object based on probability
            SuggestSpelling suggest = new SuggestSpelling("wordprobabilityDatabase.txt");

            // get user input for correction
            user_text = cleanString;
            String[] words = user_text.split(" ");
            int error = 0;
            for (String word : words) {
                if (!dictionary.contains(word)) {
                    checkWord(word);
                    dictionary.put(word, word);
                }
                suggestWord = true;
                String outputWord = checkWord(word);
                if (suggestWord) {
                    System.out.println("Suggestions for " + word + " are: " + suggest.correct(outputWord) + "\n");
                    error++;
                }
            }
            if (error == 0) {
                System.out.println("No mistakes found");
            }
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(-1);
        }
    }

/**

* METHOD TO SPELL CHECK THE WORDS IN A STRING. IS USED IN SPELL CHECKER

* METHOD THROUGH THE "WORD" STRING