Word check: determine whether the English words in a file are correct and, if not, write them in sorted order to another file

走过的绿柳荫

Published 2022-10-13 00:12:34

Category column: 摸鱼算法题
Tags: java, algorithm, programming language

Article link: https://blog.csdn.net/qq_55880219/article/details/127293175

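Problem description

Given an index of correctly spelled words (stored in the file index.txt in the current directory; all lowercase, sorted in ascending dictionary order, one word per line), write a program that uses this index to spell-check an English article (stored in another file, in.txt, in the current directory). If a word in the article (a word consists only of consecutive letters) does not appear in the index file (the check is case-insensitive), write that misspelled word, converted to all lowercase, to the file error.txt in the current directory, one word per line, in ascending dictionary order.
Assumptions:
1. The article in in.txt may be unformatted, its layout may be messy, and it may be incomplete.
2. index.txt contains at most 1000 words, and each word is at most 50 letters long.
3. If a misspelled word occurs several times, it is output several times.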
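
The rule that a word is "only consecutive letters" decides how the article is tokenized, so here is a minimal sketch of it. The class `WordExtractor` and the method `extractWords` are hypothetical names used only for illustration; they are not part of the solution further down, which splits on non-letters instead of matching letter runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class WordExtractor {
    // A word is a maximal run of ASCII letters; digits, punctuation and spaces separate words.
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            // The check is case-insensitive, so every word is stored in lowercase.
            words.add(m.group().toLowerCase());
        }
        return words;
    }

    public static void main(String[] args) {
        // Prints [ans, c, standard, committee, it, is, c, s]
        System.out.println(extractWords("(ANS1) C standard committee.It is C's"));
    }
}
```

Under this rule, "ANS1" yields "ans", "committee.It" yields two words, and "C's" yields "c" and "s".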

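[Input format]

The word index file index.txt and the article file in.txt are both located in the current directory.

[Output format]

Write the misspelled words to the file error.txt in the current directory in ascending dictionary order, one word per line; a word that is misspelled several times is output several times. If there are no misspelled words, output nothing.

[Sample input 1]

Suppose the file in.txt contains: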

There are two verrsions of the international standards for C. 
Thee first version was ratified in 1989 by the American National
Standards Institue (ANS1) C standard committee.It is often 
referred as ANS1 C or C89. The secand C standard was completed 
in 1999. This standard is comonly referred to as C99. C99 is a 
milestone in C's evolution into a viable programing languga 
for numerical and scientific computing.

The word index in the file index.txt contains:

a
american
and
ansi
are
as
by
c
committee
commonly
completed
computing
evolution
first
for
in
institue
international
into
is
it
language
milestone
national
numerical
of
often
or
programming
ratified
referred
s
scientific
secand
standard
standards
the
there
this
to
two
version
versions
viable
was

[Sample output 1]

The misspelled words in the file error.txt should be:

ans
ans
comonly
languga
programing
thee
verrsions

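[Explanation of sample 1]

Every word that appears in in.txt is checked against the index in index.txt. The check is case-insensitive, so the first word, There, is found in the index and is not an error. The word verrsions is not in the index; it is misspelled, so it is output as an error. The word ANSI was misspelled as ANS1; because the digit 1 is not a letter, the word extracted is ANS, which is converted to lowercase and output as ans, and since it occurs twice it is output twice. The other errors are handled the same way. The misspelled words are written to error.txt in ascending dictionary order.

[Sample input 2]

Suppose the file in.txt contains: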

There are two versions of the international standard fo

The word index in the file index.txt contains:

are
for
international
of
standards
the
there
two
versions

[Sample output 2]

The misspelled words in the file error.txt should be:

fo
standard

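[Explanation of sample 2]

The word standard in in.txt does not appear in the index file index.txt, so it is output as an error.

Note: in sample 2 the content of in.txt is incomplete. There is no character at all after the last word, fo, and fo does not appear in the index either, so it is also output as an error.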
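
The incomplete last line needs no special handling when reading line by line: `BufferedReader.readLine()` returns the final line even if it has no line terminator. The sketch below illustrates this with an in-memory string instead of a file; `LastLineDemo` and `readLines` are hypothetical names used only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration only.
class LastLineDemo {
    static List<String> readLines(String content) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(content))) {
            String line;
            // readLine() returns null only at end of input, after the last (possibly unterminated) line.
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The second line has no trailing newline, like in.txt in sample 2.
        System.out.println(readLines("There are two versions\nof the international standard fo"));
    }
}
```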

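Approach

1. Read the words from the two files and store them in ArrayLists.

2. Use listIndex to check listIn for misspelled words, i.e. words that do not appear in the index (loop over the words to find them), then sort the words that were found.

3. Finally, write the misspelled words to error.txt.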
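
The first two steps can be sketched in memory on the data from sample 2. This is a simplified illustration, not the full solution: the class `ApproachSketch` and the method `findErrors` are hypothetical names, the article and index are passed in directly instead of being read from files, and the file-writing step is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch for illustration only.
class ApproachSketch {
    static List<String> findErrors(String article, List<String> index) {
        List<String> errors = new ArrayList<>();
        // Step 1: split the article on runs of non-letters to get the words.
        for (String token : article.split("[^a-zA-Z]+")) {
            if (token.isEmpty()) {
                continue; // split() yields a leading empty token if the text starts with a separator
            }
            String word = token.toLowerCase();
            // Step 2: a word that is missing from the index is an error; duplicates are kept.
            if (!index.contains(word)) {
                errors.add(word);
            }
        }
        // Sort in ascending dictionary order before output.
        Collections.sort(errors);
        return errors;
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("are", "for", "international", "of",
                "standards", "the", "there", "two", "versions");
        // Prints [fo, standard]
        System.out.println(findErrors("There are two versions of the international standard fo", index));
    }
}
```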

Code

import java.io.*;
import java.util.ArrayList;
import java.util.Comparator;

public class Match {
    public static void main(String[] args) {
        File fileIn = new File("in.txt");
        File fileIndex = new File("index.txt");
        ArrayList<String> listIn = Input(fileIn);
        ArrayList<String> listIndex = Input(fileIndex);
        ArrayList<String> listError = matchError(listIn, listIndex);
        writerError(listError);
    }

    // Read the words from a file
    public static ArrayList<String> Input(File file) {
        ArrayList<String> listIn = new ArrayList<String>();
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {
            String str;
            while ((str = br.readLine()) != null) {
                String[] wordsArr1 = str.split("[^a-zA-Z]");  // split on every non-letter, leaving only runs of letters
                for (String word : wordsArr1) {
                    if (word.length() != 0) {  // skip the empty strings split() produces between adjacent separators
                        listIn.add(word.toLowerCase());   // toLowerCase() converts everything to lowercase
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return listIn;
    }

    // Find the misspelled words and sort them
    public static ArrayList<String> matchError(ArrayList<String> listIn, ArrayList<String> listIndex) {
        ArrayList<String> listError = new ArrayList<String>();
        for (String word : listIn) {
            if (!listIndex.contains(word)) {     // contains() checks whether the element is in the list
                listError.add(word);
            }
        }
        listError.sort(Comparator.naturalOrder());    // naturalOrder() sorts the elements in natural (ascending) order
        return listError;
    }

    // Debug helper: print the size and contents of a list
    public static void show(ArrayList<String> list) {
        System.out.println("size:" + list.size());
        for (String item : list) {
            System.out.println(item);
        }
    }

    // Write the misspelled words to error.txt
    public static void writerError(ArrayList<String> listError) {
        File file = new File("error.txt");
        // FileOutputStream creates the file if it does not exist, so createNewFile() is not needed
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file)))) {
            for (String word : listError) {
                out.write(word + '\n');
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
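
As a possible variation, not something the code above does: the problem states that index.txt is already sorted in ascending dictionary order and is all lowercase, so the linear `contains()` scan could be replaced by a binary search. The sketch below assumes the list passed in really is sorted; `SortedIndexLookup` and `inIndex` are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical alternative lookup for illustration only.
class SortedIndexLookup {
    // Assumes sortedIndex is in ascending String order, as the problem guarantees for index.txt.
    static boolean inIndex(List<String> sortedIndex, String word) {
        // binarySearch returns a non-negative position when the word is found.
        return Collections.binarySearch(sortedIndex, word) >= 0;
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("are", "for", "international", "of",
                "standards", "the", "there", "two", "versions");
        System.out.println(inIndex(index, "there"));    // true
        System.out.println(inIndex(index, "standard")); // false
    }
}
```

With at most 1000 index words the linear scan is fast enough, so this only matters for much larger word lists.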
