Java spelling correction: how to stop a Java spell checker from correcting repeated words

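This post describes a Java program that extracts the text of a web page and spell-checks it. When it finds a misspelling, the program compares the word against the dictionary.txt dictionary and suggests a correction. The problem is that when the input contains the same misspelled word several times, the program prints the suggestion several times. The author wants to change the program so that each misspelled word is reported only once, or is reported together with the number of times it occurs.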

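I have implemented a program that does the following:

Scans all the words of a web page into a single string (using jsoup)

Filters out all the HTML markup and code

Passes the words to a spell checker, which offers suggestions

The spell checker loads the dictionary.txt file into an array and compares the input string against the words in the dictionary.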
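As an illustration of the dictionary step described above, here is a minimal sketch, not the code from this post. It assumes the dictionary is whitespace-separated words and that lookups are case-insensitive. The class name `DictionaryLookup` and the methods `loadWords` and `isKnown` are hypothetical, and a `StringReader` stands in for `dictionary.txt` so the example is self-contained.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Locale;
import java.util.Scanner;
import java.util.Set;

final class DictionaryLookup {

    // Reads whitespace-separated words from the source into a set, lower-cased.
    static Set<String> loadWords(Readable source) {
        Set<String> words = new HashSet<>();
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNext()) {
                words.add(scanner.next().toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }

    // A word counts as known if its lower-cased form is in the dictionary.
    static boolean isKnown(Set<String> dictionary, String word) {
        return dictionary.contains(word.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Stand-in for the real file; a FileReader("dictionary.txt") could be passed instead.
        Set<String> dictionary = loadWords(new StringReader("the program\nis worst"));
        System.out.println(isKnown(dictionary, "The")); // true
        System.out.println(isKnown(dictionary, "teh")); // false
    }
}
```

A set gives constant-time membership checks, which matters when every word of a web page is looked up.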

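My current problem is that when the input contains the same word more than once, for example "teh program is teh worst", the code prints:

You entered 'teh', did you mean 'the'?

You entered 'teh', did you mean 'the'?

Sometimes a website repeats several words over and over, and the output gets messy.

Printing each word together with the number of times it was misspelled would be ideal, but limiting the output to one line per word would be enough.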
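One possible way to get that behaviour, sketched under assumptions rather than taken from the program in this post: collect the misspelled words in a `LinkedHashMap` that maps each word to its count, and print only after the whole text has been scanned. The class name `MisspellingCounter` and the method `countMisspellings` are hypothetical, the dictionary is assumed to be a `Set` of lower-case words, and the suggestion lookup is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class MisspellingCounter {

    // Returns each unknown word with the number of times it occurs,
    // in the order in which the words first appear in the text.
    static Map<String, Integer> countMisspellings(String text, Set<String> dictionary) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.split("\\s+")) {
            String word = token.toLowerCase(Locale.ROOT);
            if (word.isEmpty() || dictionary.contains(word)) {
                continue; // skip empty tokens and correctly spelled words
            }
            counts.merge(word, 1, Integer::sum); // first sighting -> 1, later ones -> +1
        }
        return counts;
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("the", "program", "is", "worst"));
        Map<String, Integer> counts = countMisspellings("teh program is teh worst", dictionary);
        // Prints one line per distinct misspelling: You entered 'teh' 2 time(s)
        counts.forEach((word, n) ->
                System.out.println("You entered '" + word + "' " + n + " time(s)"));
    }
}
```

If only the once-per-word limit is needed, a `HashSet` of words already reported would do; the map just carries the count as well.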

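My program has several methods and two classes, but the spell-checking method is as follows:

Note: the original code contains some 'if' statements that strip punctuation, but I have removed them for clarity.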

    static boolean suggestWord;

    public static String checkWord(String wordToCheck) {
        String wordCheck;
        String word = wordToCheck.toLowerCase();

        if ((wordCheck = (String) dictionary.get(word)) != null) {
            // No need to ask for a suggestion for a correct word.
            suggestWord = false;
            return wordCheck;
        }

        // If after all of these checks a word could not be corrected,
        // return it as a misspelled word.
        return word;
    }

Temporary edit: as requested, here is the complete code:

Class 1:

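    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Hashtable;
    import java.util.Scanner;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.safety.Whitelist;

    public class ParseCleanCheck {

        // To store all the words of the dictionary
        static Hashtable dictionary;

        // To indicate whether the word is spelled correctly or not.
        static boolean suggestWord;

        static Scanner urlInput = new Scanner(System.in);
        public static String cleanString;
        public static String url = "";
        public static boolean correct = true;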

        /**
         * PARSER METHOD
         */
        public static void PageScanner() throws IOException {
            System.out.println("Pick an english website to scan.");

            // This do-while loop allows the user to try again after a mistake
            do {
                try {
                    System.out.println("Enter a URL, starting with http://");
                    url = urlInput.nextLine();

                    // This creates a document out of the HTML on the web page
                    Document doc = Jsoup.connect(url).get();

                    // This converts the document into a string to be cleaned
                    String htmlToClean = doc.toString();
                    cleanString = Jsoup.clean(htmlToClean, Whitelist.none());
                    correct = false;
                } catch (Exception e) {
                    System.out.println("Incorrect format for a URL. Please try again.");
                }
            } while (correct);
        }

        /**
         * SPELL CHECKER METHOD
         */
        public static void SpellChecker() throws IOException {
            dictionary = new Hashtable();
            System.out.println("Searching for spelling errors ... ");

            try {
                // Read and store the words of the dictionary
                BufferedReader dictReader = new BufferedReader(new FileReader("dictionary.txt"));

                while (dictReader.ready()) {
                    String dictInput = dictReader.readLine();
                    // create an array of dictionary words
                    String[] dict = dictInput.split("\\s");

                    for (int i = 0; i < dict.length; i++) {
                        // key and value are identical
                        dictionary.put(dict[i], dict[i]);
                    }
                }
                dictReader.close();

                String user_text = "";

                // Initializing a spelling suggestion object based on probability
                SuggestSpelling suggest = new SuggestSpelling("wordprobabilityDatabase.txt");

                // get user input for correction
                {
                    user_text = cleanString;
                    String[] words = user_text.split(" ");
                    int error = 0;

                    for (String word : words) {
                        if (!dictionary.contains(word)) {
                            checkWord(word);
                            dictionary.put(word, word);
                        }

                        suggestWord = true;
                        String outputWord = checkWord(word);

                        if (suggestWord) {
                            System.out.println("Suggestions for " + word + " are: " + suggest.correct(outputWord) + "\n");
                            error++;
                        }
                    }

                    if (error == 0) {
                        System.out.println("No mistakes found");
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
                System.exit(-1);
            }
        }

        /**
         * METHOD TO SPELL CHECK THE WORDS IN A STRING. IS USED IN SPELL CHECKER
         * METHOD THROUGH THE "WORD" STRING
         */
        public static String checkWord(String wordToCheck) {
            String wordCheck;
            String word = wordToCheck.toLowerCase();

            if ((wordCheck = (String) dictionary.get(word)) != null) {
                // No need to ask for a suggestion for a correct word.
                suggestWord = false;
                return wordCheck;
            }

            // If after all of these checks a word could not be corrected,
            // return it as a misspelled word.
            return word;
        }
    }

There is a second class (SuggestSpelling.java) that contains a probability calculator, but it is not relevant here unless you plan to run the code yourself.
