WordCount: Development Journey

djl05120512

Original link: http://www.cnblogs.com/BluesJiang/p/8605470.html


Project code:

BluesJiang GitHub

PSP(Personal Software Process)

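PSP2.1	PSP stage	Estimated time (min)	Actual time (min)
Planning	Planning	15	23
Estimate	Estimate how long the task will take	10	10
Development	Development	530	892
- Analysis	- Requirements analysis (including learning new technologies)	100	339
- Design Spec	- Write the design document	100	150
- Coding Standard	- Coding standard (set suitable conventions for the current development)	10	8
- Design	- Detailed design	30	23
- Coding	- Coding	200	220
- Code Review	- Code review	30	34
- Test	- Testing (self-test, fix code, commit changes)	60	120
Reporting	Reporting	190	309
- Test Report	- Test report	60	65
- Size Measurement	- Measure the workload	10	12
- Postmortem & Process Improvement Plan	- Postmortem and process improvement plan	120	232
	Total	745	1234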

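Approach

About the assignment

At first glance, the assignment is a tool that counts the contents of a file. It has to be packaged as a console program and finally exported as an EXE so that it runs on Windows. Because my development environment is a Mac, file separators need attention.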
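One way to keep path handling portable between macOS and Windows is to avoid hard-coding either separator. The sketch below is only an illustration of that idea, not code from the project; the class name PathDemo and both helper methods are hypothetical.

```java
import java.io.File;
import java.nio.file.Paths;

class PathDemo {
    // Joins path segments using the separator of the platform the JVM runs on.
    static String join(String first, String... more) {
        return Paths.get(first, more).toString();
    }

    // Rewrites both '/' and '\' to the current platform's separator.
    // Assumption: a backslash in the input is always meant as a separator.
    static String normalize(String path) {
        return path.replace('/', File.separatorChar)
                   .replace('\\', File.separatorChar);
    }
}
```

With helpers like these, a path typed in Windows style (test\test_cases\httpd.c) and the same path typed in Unix style resolve to the same file on either system.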

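Problem analysis

Looking at the program flow, this is a console program, so the first job is handling command-line arguments.
It has to count the words in a file, so file I/O is certainly involved.
Exporting to an EXE can be handled by a tool, so it is not a concern.

Breaking down the problem

Implementing the -c, -l, and -w options poses no real difficulty. After a careful reading of the requirements, -s, -a, and -e take more effort.

-s requires traversing directories
-a is planned around a finite state machine
-e requires some extra handling on top of -w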

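Code overview

The code is split into three classes: Main, ArgParser, and WordCounter.

Main

Manages the overall program flow: it calls the argument parser, uses the WordCounter class to compute the statistics, and writes the output.

ArgParser

This class exposes four main methods: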

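/*
 *  Parses the raw argument list supplied by Main
 *  @param args: the argument list passed to Main's main function
 *  @return int: 0 on success, -1 if parsing fails
 */
public int parse(String[] args);

/*
 *  Gets the target files, i.e. the files to analyze; with -s there can be more than one
 *  @return the paths of all files to analyze
 */
public String[] getTarget();

/*
 *  Checks whether a given option was supplied
 *  @return true if the option is present, false otherwise
 */
public boolean containsKey(String key);

/*
 *  Gets the value of an option, e.g. the file name that follows -o or -e
 *  @return the value given for that option
 */
public String get(String key);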

The key function parse() parses each argument, handles the -o and -e options, and, when -s is given, builds the corresponding target list.

WordCounter

The WordCounter class provides one method per option:

/**  Every method operates on the file being processed  **/

// Implements -c
public long countChar(String filename);

// Implements -e
public void buildEscapeWord(String filename);

// Implements -w
public long countWords(String filename);

// Implements -l
public long countLines(String filename);

// Implements -a; the long array holds code lines / blank lines / comment lines
public long[] countALines(String filename);

Key code

1. Argument parsing

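for (int i = 0; i < args.length; i++) {
    // Require at least two characters so that charAt(1) cannot fail on a lone "-"
    if (args[i].length() > 1 && args[i].charAt(0) == '-') {
        char arg = args[i].charAt(1);
        switch (arg) {
            case 'c':
            case 'w':
            case 'l':
            case 'a':
            case 's':
                // Flags that apply to the shared target files, which are kept in target
                this.args.put(String.valueOf(arg), "");
                break;
            case 'o':
            case 'e':
                // These options are followed by a file name, so consume it here
                if (i+1 < args.length && !args[i+1].startsWith("-")) {
                    this.args.put(String.valueOf(arg), args[i+1]);
                    i++;
                } else {
                    System.out.println(args[i] + " must be followed by a file name");
                    return -1;
                }
                break;
            default:
                // Unknown option: report it and stop parsing
                System.out.println("Unknown option: " + args[i]);
                return -1;
        }
    } else {
        // Target file
        this.target.add(args[i]);
    }
}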

2. Recursive directory traversal

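// Excerpt from buildTarget in WordCount
int buildTarget(String path) {
    File dir = new File(path);
    // listFiles() returns null when path does not exist or is not a directory
    File[] files = dir.listFiles();
    if (files == null) { // error handling
        System.out.println("No such file or directory: "+path);
        return -1;
    }
    for (File file:files) {         // walk the directory entries
        if (file.isDirectory()) {
            // Recurse into the subdirectory. Returning the result directly here
            // would skip every remaining entry of the current directory.
            if (buildTarget(file.getPath()) != 0) {
                return -1;
            }
        } else if (file.isFile()) {
            String name = file.getPath();
            if (name.endsWith(this.exten)) {
                this.target.add(name);
            }
        }
    }
    return 0;
}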
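For comparison, the same traversal can be written with the standard java.nio.file API, which handles the recursion itself. This is a sketch of an alternative, not the project's code; the class name TargetFinder and the method findTargets are hypothetical, and the sorting is added only to make the result order predictable.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TargetFinder {
    // Returns every regular file under root whose path ends with extension.
    // Files.walk throws an IOException if root does not exist.
    static List<String> findTargets(String root, String extension) throws IOException {
        try (Stream<Path> paths = Files.walk(Paths.get(root))) {
            return paths
                    .filter(Files::isRegularFile)
                    .map(Path::toString)
                    .filter(name -> name.endsWith(extension))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

The try-with-resources block matters here: the stream returned by Files.walk holds open directory handles until it is closed.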

3. Handling the -a option

// Requires java.io.BufferedReader, java.io.FileReader and java.io.IOException
public long[] countALines(String filename) {
    long[] lines = {0,0,0}; // code lines / blank lines / comment lines
    // The work is line-oriented, so use a BufferedReader; try-with-resources closes it
    try (BufferedReader file = new BufferedReader(new FileReader(filename))) {
        // Tracks whether we are inside a block comment, where no character counts as code
        boolean intoComment = false;
        String line;
        while ((line = file.readLine()) != null) {
            long charCount = 0;
            // Whether this line contains a comment; a line that starts inside a block comment does
            boolean hasComment = intoComment;
            for (int i = 0; i < line.length(); i++) {
                char ch = line.charAt(i);
                boolean hasNext = i + 1 < line.length();
                if (ch == ' ' || ch == '\t' || ch == '\n') {
                    // Whitespace never counts as a code character
                    continue;
                } else if (ch == '/' && !intoComment) {
                    if (hasNext && line.charAt(i+1) == '*') {
                        // Start of a /* block comment
                        intoComment = true;
                        hasComment = true;
                        i++;
                    } else if (hasNext && line.charAt(i+1) == '/') {
                        // Start of a // comment: the rest of the line holds no code
                        hasComment = true;
                        break;
                    } else {
                        // A plain '/', for example a division, is a code character
                        charCount++;
                    }
                } else if (ch == '*' && intoComment) {
                    // A closing */ ends the block comment; any other '*' is comment text
                    if (hasNext && line.charAt(i+1) == '/') {
                        intoComment = false;
                        i++;
                    }
                } else if (!intoComment) {
                    // Only characters outside comments count as code
                    charCount++;
                }
            }
            // Classify the line; each line is counted exactly once
            if (charCount > 1) {
                lines[0]++;
            } else if (hasComment) {
                lines[2]++; // at most one code character plus a comment
            } else {
                lines[1]++; // at most one code character and no comment
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return lines;
}
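The function above tracks comments with boolean flags. The finite state machine mentioned in the plan can also be written out explicitly, which makes each transition visible. The following is a sketch under the same classification rules (more than one code character is a code line; otherwise a line with a comment is a comment line and anything else is blank). The class CommentStateMachine, its state names, and the constants are hypothetical, and like the original it does not treat string literals specially.

```java
class CommentStateMachine {
    enum State { NORMAL, SLASH, LINE_COMMENT, BLOCK, BLOCK_STAR }

    static final int CODE = 0, BLANK = 1, COMMENT = 2;

    private State state = State.NORMAL;

    // Classifies one line; an open block comment carries over to the next call.
    int classify(String line) {
        int codeChars = 0;
        boolean hasComment = (state == State.BLOCK);
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            switch (state) {
                case NORMAL:
                    if (ch == '/') state = State.SLASH;
                    else if (!Character.isWhitespace(ch)) codeChars++;
                    break;
                case SLASH:
                    if (ch == '/') { state = State.LINE_COMMENT; hasComment = true; }
                    else if (ch == '*') { state = State.BLOCK; hasComment = true; }
                    else {
                        codeChars++; // the pending '/' was ordinary code
                        state = State.NORMAL;
                        if (!Character.isWhitespace(ch)) codeChars++;
                    }
                    break;
                case BLOCK:
                    if (ch == '*') state = State.BLOCK_STAR;
                    break;
                case BLOCK_STAR:
                    if (ch == '/') state = State.NORMAL;
                    else if (ch != '*') state = State.BLOCK;
                    break;
                case LINE_COMMENT:
                    break; // the rest of the line is comment text
            }
        }
        if (state == State.SLASH) codeChars++; // a '/' at the end of the line is code
        // Only an unfinished block comment survives the line break
        state = (state == State.BLOCK || state == State.BLOCK_STAR)
                ? State.BLOCK : State.NORMAL;
        if (codeChars > 1) return CODE;
        return hasComment ? COMMENT : BLANK;
    }
}
```

Calling classify once per line and incrementing a three-element counter with the result reproduces the long[] layout used by countALines. Under these rules a line such as `} // end` is reported as a comment line and a lone `{` as a blank line.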

4. Implementing -e

public void buildEscapeWord(String filename) {
    this.escapeWord = new HashSet<String>();
    // try-with-resources closes the reader even if reading fails
    try (FileReader file = new FileReader(filename)) {
        int ch = file.read();
        while (ch != -1) {
            StringBuilder word = new StringBuilder();
            // Skip separators
            while (ch != -1 && isSep((char)ch)) {
                ch = file.read();
            }
            // Collect one word
            while (ch != -1 && !isSep((char) ch)) {
                word.append((char) ch);
                ch = file.read();
            }
            // Trailing separators leave the word empty; do not store that
            if (word.length() > 0) {
                this.escapeWord.add(word.toString());
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

This function reads all the words to ignore and stores them in a HashSet.

countWords() then performs the following check:

if (this.escapeWord != null && this.escapeWord.contains(word)) {
    count--;
}

That is all it takes.
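To show where that check sits in a complete loop, here is a minimal sketch of word counting with a stop list. It is not the project's countWords(): it takes the text directly instead of a file name, skips stop words instead of decrementing afterwards, and the class name StopListCounter is hypothetical. The separator set in isSep is also an assumption, since the original definition of isSep is not shown.

```java
import java.util.Set;

class StopListCounter {
    // Assumption: words are separated by spaces, commas, tabs, and line breaks.
    static boolean isSep(char ch) {
        return ch == ' ' || ch == ',' || ch == '\t' || ch == '\n' || ch == '\r';
    }

    // Counts the words in text, skipping any word found in escapeWord.
    // A null escapeWord means no stop list was supplied.
    static long countWords(String text, Set<String> escapeWord) {
        long count = 0;
        StringBuilder word = new StringBuilder();
        for (int i = 0; i <= text.length(); i++) {
            boolean atEnd = (i == text.length());
            if (!atEnd && !isSep(text.charAt(i))) {
                word.append(text.charAt(i));
            } else if (word.length() > 0) {
                // A separator or the end of the text closes the current word
                if (escapeWord == null || !escapeWord.contains(word.toString())) {
                    count++;
                }
                word.setLength(0);
            }
        }
        return count;
    }
}
```

Because HashSet lookups take constant time on average, the stop list adds little cost per word regardless of its size. The comparison is case-sensitive in this sketch.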

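Test design

To make testing automated and efficient, the test flow has to be designed first, which calls for a test script.

The test directory is laid out as follows: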

wc.exe
test\
    test_cases\
        ...
    test_output\
        ...
    test.bat

The test script test.bat contains:

..\wc.exe -c test\test_cases\httpd.c -o test\test_output\out1.out
..\wc.exe -w test\test_cases\httpd.c -o test\test_output\out2.out
..\wc.exe -l test\test_cases\httpd.c -o test\test_output\out3.out
..\wc.exe -a test\test_cases\httpd.c -o test\test_output\out4.out
..\wc.exe -c -w -l test\test_cases\httpd.c -o test\test_output\out5.out
..\wc.exe -e test\test_cases\stoptest.txt -c -w test\test_cases\test2.in -o test\test_output\out6.out
..\wc.exe -w -c -l -s test\test_cases\*.in -o test\test_output\out7.out
..\wc.exe -c test\test_cases\empty.c -o test\test_output\out8.out
..\wc.exe -c -l test\test_cases\*.in -a -s -w -o test\test_output\out9.out

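These cases exercise argument parsing and the completeness of the features; none of them uses invalid arguments. During development, however, error messages that terminate the program on invalid input were added and tested.

The test cases for the content statistics are all in the test_cases folder.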

Reposted from: https://www.cnblogs.com/BluesJiang/p/8605470.html
