Compiler Principles Lab 6: Recursive-Descent Analysis of Multiple Executable Statements

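Lab requirements

[Task] Using the given context-free grammar, perform syntax analysis on several kinds of executable statements that are common in high-level programming languages.

[Input] A sequence of executable statements, including assignment, selection, and loop statements.

[Output] The complete syntax tree corresponding to the input, or an error.

[Problem] Design a program that, following the given context-free grammar, builds the syntax tree for a sequence of source statements or reports an error. Requirements:

1. The base grammar, with <Block> as the start symbol: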

<Block> → { <Decls> <STMTS> } 
<Decls> → <Decls> <Decl> | empty 
<Decl> → <Type> <NameList> ; 
<NameList> → <NameList> , <Name> | <Name>
<Type> → int 
<Name> → id 
<STMTS> → <STMTS> <STMT> | empty 
<STMT> → <Name> = <Expr> ; 
<STMT> → if ( BOOL ) <STMT> 
<STMT> → if ( BOOL ) <STMT> else <STMT> 
<STMT> → while ( BOOL ) <STMT>

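2. The parsing method is recursive descent (recursive subroutines).

3. Input: a sequence of 3 to 5 executable statements, including assignment, selection, and loop statements.

4. Output: for correct input, print the corresponding syntax tree with the root labeled <Block>; for incorrect input, print error.

5. Assignment statement: the left side is a single simple variable (assume all variables are declared as integers) and the right side is an arithmetic expression; the program from 2.1 can be called to analyze this arithmetic expression.

6. Selection statement: covers both the single-branch if-then and the two-branch if-then-else structures. Only a branch condition consisting of one simple relational expression is considered; logical operators are not handled for now.

7. Loop statement: any one of the three structures while-do, do-while, and for-each.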

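Build environment and language

Programming language: C++

IDE: Visual Studio 2019

Analysis of the approach

The given grammar is left-recursive (for example, <Decls> → <Decls> <Decl> | empty is a textbook case of direct left recursion), and a recursive-descent procedure for such a rule would call itself forever without consuming input, so the left recursion has to be eliminated first. The two if productions also share a common prefix, so they are left-factored into <STMT1>. <BOOL>, <RelOp>, and the expression rules, which the assignment leaves undefined, are added as well. Applying the method from the lectures gives the following grammar: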

<Block> → { <Decls> <STMTS> } 
<Decls> → <Type> <NameList> ; <Decls> | empty 

<NameList> → <Name> <NameList1>
<NameList1> → , <Name> <NameList1> | empty
<Type> → int 
<Name> → id 

<STMTS> → <STMT> <STMTS> | empty 
<STMT> → <Name> = <Expr> ; 
<STMT> → if ( <BOOL> ) <STMT> <STMT1>
<STMT1> → else <STMT> | empty
<STMT> → while ( <BOOL> ) <STMT>

<BOOL> → <Expr> <RelOp> <Expr>
<RelOp> → < | <= | > | >= | == | !=

<Expr> → <Term> <Expr1> 
<Expr1> → <AddOp> <Term> <Expr1> | empty 
<Term> → <Factor> <Term1> 
<Term1> → <MulOp> <Factor> <Term1> | empty 
<Factor> → id | number | ( <Expr> ) 
<AddOp> → + | - 
<MulOp> → * | /
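
For reference, the transformation used above is the standard rule for direct left recursion. The symbols A, A', α and β below are generic placeholders, not names from the assignment:

```latex
A \to A\alpha_1 \mid \dots \mid A\alpha_m \mid \beta_1 \mid \dots \mid \beta_n
\quad\Longrightarrow\quad
\begin{aligned}
A  &\to \beta_1 A' \mid \dots \mid \beta_n A' \\
A' &\to \alpha_1 A' \mid \dots \mid \alpha_m A' \mid \varepsilon
\end{aligned}
```

For <NameList> → <NameList> , <Name> | <Name>, taking α = ", <Name>" and β = "<Name>" yields exactly the <NameList> and <NameList1> rules. For <Decls> → <Decls> <Decl> | empty, β is empty and α = <Decl>, so the result collapses to <Decls> → <Decl> <Decls> | empty; substituting the single production of <Decl> gives the <Decls> rule listed here.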

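This grammar covers only a small part of what a real language needs. The statement rules, for instance, cover only assignment, if, if-else, and while, each in its minimal form without braces. The remaining constructs follow the same pattern, though, so for learning purposes the grammar does not need to be complete; a more complete grammar simply means a more complex program.

Key parts of the program

Definitions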

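char s[100][100] = { "\0" };  // raw input lines
string str;  // all input lines merged into one string
int location = 0;  // index in str of the current token
bool flag = true;  // becomes false once a syntax error has been found
string tree_map[100];  // rows of the syntax-tree drawing
const int width = 3;  // length of the connector between sibling nodes
char token[100] = { "\0" };  // the current token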

bool isKey(char* s);
bool isOP(char* s);
bool isDE(char& s);
void pre_process(char* buff, int& in_comment);
bool scanner(int k);

int draw_line(int row, int num);
void string_out(string s, int row, int column, int loc);
int tree_out(string s, int row, int loc);
void printTree(ofstream& fout);
int readToken();
void bindString(int k);
int Block(int row, int column);
int Decls(int row, int column);
int NameList(int row, int column);
int NameList1(int row, int column);
bool Type(char* words);
bool Name(char* words);
int STMTS(int row, int column);
int STMT(int row, int column);
int STMT1(int row, int column);
int BOOL(int row, int column);
bool RelOp(char* words);
int Expr(int row, int column);
int Expr1(int row, int column);
int Term(int row, int column);
int Term1(int row, int column);
int Factor(int row, int column);
bool AddOp(char* words);
bool MulOp(char* words);

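Walkthrough of the key parts

The lexical analysis was already described in Lab 3. Only scanner() changes slightly: it normalizes the format of the input and then performs a simple error check:

bool scanner(int k) {  // lexical preprocessing
    int in_comment = 0;  // 0: normal, 1: inside a multi-line comment, 2: inside a single-line comment, 3: inside double quotes, 4: inside single quotes
    for(int i = 0; i < k; i++) {  // s[k] is the terminating "#" line and is left untouched
        pre_process(s[i], in_comment);  // strip comments and put one space between tokens and operators
    }
    if (in_comment != 0) return false;  // a non-zero state means a comment or quote was never closed
    else return true;
}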
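
To make the effect of this preprocessing concrete, here is a much-reduced sketch of the token-spacing step. It is not the pre_process() used by the program: the name space_tokens is hypothetical, comments and quotes are ignored, and only identifiers, integers, single-character symbols, and the operators <=, >=, ==, != are recognized.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: rewrites one line so that every token is followed by
// exactly one space, which is the format the parser later relies on.
std::string space_tokens(const std::string& line) {
    std::string out;
    std::size_t k = 0;
    while (k < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[k]);
        if (std::isspace(c)) { ++k; continue; }          // whitespace only separates tokens
        if (std::isalpha(c) || c == '_') {               // identifier or keyword
            while (k < line.size() &&
                   (std::isalnum(static_cast<unsigned char>(line[k])) || line[k] == '_'))
                out += line[k++];
        } else if (std::isdigit(c)) {                    // integer literal
            while (k < line.size() && std::isdigit(static_cast<unsigned char>(line[k])))
                out += line[k++];
        } else {                                         // delimiter or operator
            out += line[k++];
            // take the longer operator when '=' follows <, >, = or !
            if (k < line.size() && line[k] == '=' &&
                std::string("<>=!").find(out.back()) != std::string::npos)
                out += line[k++];
        }
        out += ' ';
    }
    assert(out.empty() || out.back() == ' ');            // every token ends with one space
    return out;
}
```

Calling space_tokens("if(x<=2)x=2;") under these assumptions produces the tokens if ( x <= 2 ) x = 2 ; each followed by a single space, which is why the parser can find token boundaries by looking for spaces alone.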

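Next come the tree-drawing functions already introduced in Lab 5, starting with draw_line(int row, int num):

int draw_line(int row, int num) {  // draws a horizontal line separating sibling nodes and returns the position where the next node starts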
    tree_map[row].append(num, '-');
    return tree_map[row].size();
}

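string_out(string s, int row, int column, int loc = 0) differs slightly from Lab 5: it gains a parameter loc with a default value, which makes it easier to position a string under its parent:

/** Writes a string into the drawing.
* column is the start position in this row; loc is the position of the vertical bar in the row above.
* loc defaults to 0, meaning there is no bar, and the string is placed at column.
* If loc is non-zero, the string is centered under the bar where possible.
*/
void string_out(string s, int row, int column, int loc = 0) {  
    if (loc == 0) {
        if (tree_map[row].size() < column) {  // the row is shorter than column: pad with spaces
            int n = column - tree_map[row].size();
            tree_map[row].append(n, ' ');
        }
        tree_map[row].append(s);
    } else {
        int n1 = s.size() / 2;
        if (loc - n1 <= column) {  // centering would start left of column, so place the string at column instead
            if (tree_map[row].size() < column) {  // the row is shorter than column: pad with spaces
                int n = column - tree_map[row].size();
                tree_map[row].append(n, ' ');
            }
            tree_map[row].append(s);
        } else {  // center the string under the bar
            int n = loc - n1 - tree_map[row].size();
            if (n > 0) tree_map[row].append(n, ' ');  // a negative count would be converted to a huge size_t
            tree_map[row].append(s);
        }
    }
}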

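tree_out(string s, int row, int column) is the same as in Lab 5:

/** Draws the vertical bar between a parent and its child. s is the parent's label and column is the parent's start position.
* The return value is the bar's position, used to place the child (for example an operator).
*/
int tree_out(string s, int row, int column) {
    int n1 = s.size() / 2;
    int n2 = column + n1 - tree_map[row].size();
    if (n2 > 0) tree_map[row].append(n2, ' ');  // skip the padding if the row already reaches past the bar
    tree_map[row] += '|';
    return n1 + column;
}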
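
The way these helpers cooperate is easiest to see on a single parent and child. The following is a minimal sketch of the same idea on a local canvas instead of the global tree_map; Canvas, put, bar_under, child_under, and draw_type_node are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of the drawing scheme: rows only ever grow to the right.
using Canvas = std::vector<std::string>;

// Pad the row with spaces up to `column`, then append `text`.
void put(Canvas& c, std::size_t row, std::size_t column, const std::string& text) {
    assert(row < c.size());
    if (c[row].size() < column) c[row].append(column - c[row].size(), ' ');
    c[row] += text;
}

// Draw '|' under the middle of a parent label starting at `column`; return the bar's column.
std::size_t bar_under(Canvas& c, std::size_t row, std::size_t column, const std::string& parent) {
    std::size_t pos = column + parent.size() / 2;
    put(c, row, pos, "|");
    return pos;
}

// Place a child so its middle sits under the bar, but never left of `column`.
void child_under(Canvas& c, std::size_t row, std::size_t column,
                 std::size_t bar, const std::string& text) {
    std::size_t half = text.size() / 2;
    put(c, row, bar > column + half ? bar - half : column, text);
}

// Draws the three rows for a <Type> node whose only child is the token int.
Canvas draw_type_node() {
    Canvas c(3);
    put(c, 0, 0, "<Type>");
    std::size_t bar = bar_under(c, 1, 0, "<Type>");
    child_under(c, 2, 0, bar, "int");
    return c;
}
```

Here the bar lands at column 3 (half the width of <Type>) and int starts at column 2, so the child is centered under the bar. Siblings would then be appended to the right on the same row, which is the role draw_line plays in the program.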

printTree(ofstream& fout) writes the syntax tree to a file:

void printTree(ofstream& fout) {
	for (int i = 0; i < 100; i++) {
		if (!tree_map[i].empty()) {
			fout << tree_map[i] << endl;
		} else break;
	}
}

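After lexical preprocessing the tokens are separated by spaces, so each token is read up to the next space:

int readToken() {  // copies the token at location into token and returns its length so the caller can advance
    int i = 0;
    // stop at the separating space, at the end of str, or when token is full
    for (; location + i < (int)str.size() && str[location + i] != ' ' && i < 99; i++) {
        token[i] = str[location + i];
    }
    token[i] = '\0';
    return i;
}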
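
Every grammar function below repeats the pattern "call readToken(), compare, then add i + 1 to location". As an illustration of that pattern only, and not as the code the program uses, the cursor logic can be wrapped in a small type. TokenStream, peek, advance, and accept are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper around "space-separated string plus cursor".
struct TokenStream {
    std::string text;      // tokens separated by single spaces
    std::size_t pos = 0;   // index of the first character of the current token

    // Return the current token without consuming it; empty at end of input.
    std::string peek() const {
        assert(pos <= text.size());
        std::size_t end = text.find(' ', pos);
        if (end == std::string::npos) end = text.size();
        return pos < text.size() ? text.substr(pos, end - pos) : std::string();
    }

    // Skip the current token and the single space after it.
    void advance() {
        std::size_t end = text.find(' ', pos);
        pos = (end == std::string::npos) ? text.size() : end + 1;
    }

    // Consume the current token only if it equals `expected`.
    bool accept(const std::string& expected) {
        if (peek() != expected) return false;
        advance();
        return true;
    }
};
```

With such a wrapper a check like "the next token must be {" becomes a single accept("{") call, and running off the end of the input yields an empty token instead of reading past the string.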

bindString(int k) merges the lines read from the keyboard into the single string str:

void bindString(int k) {  // appends s[0..k] to str, including the final "#" line
	for (int i = 0; i <= k; i++) {
		str.append(s[i]);
	}
}

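Next come the functions that correspond to the grammar rules. Each returns the width it occupied, which the caller uses to size the horizontal connector: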

int Block(int row, int column) {
	if (flag) {
        string_out("<Block>", row, column);
        int loc = tree_out("<Block>", ++row, column);
        int i = readToken();
        if (strcmp(token, "{") == 0) {
            location = location + i + 1;
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
			int num1 = Decls(row, column);
            column = draw_line(row, num1 + width);
			int num2 = STMTS(row, column);
            column = draw_line(row, num2 + width);
            i = readToken();
            if (strcmp(token, "}") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                return num1 + num2 + width * 3 + 1 + 7;
			} else {
                flag = false;
                return 0;
			}
		} else {
			flag = false;
			return 0;
		}
	}
	return 0;  // flag is already false: an earlier error was found
}

int Decls(int row, int column) {
	if (flag) {
        string_out("<Decls>", row, column);
        int loc = tree_out("<Decls>", ++row, column);
        int i = readToken();
		if (Type(token)) {
            location = location + i + 1;
            string_out("<Type>", ++row, column, loc);
            loc = tree_out("<Type>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
			int num1 = NameList(row, column);
            column = draw_line(row, num1 + width);
            i = readToken();
            if (strcmp(token, ";") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
				int num2 = Decls(row, column);
                return num1 + num2 + width * 3 + 1 + 7;
			} else {
				flag = false;
				return 0;
			}
		} else {  // otherwise the empty alternative applies
            string_out("empty", ++row, column, loc);
			return 7;
		}
	}
	return 0;  // flag is already false: an earlier error was found
}

int NameList(int row, int column) {
    if (flag) {
        string_out("<NameList>", row, column);
        int loc = tree_out("<NameList>", ++row, column);
        int i = readToken();
        if (Name(token)) {
            location = location + i + 1;
            string_out("<Name>", ++row, column, loc);
            loc = tree_out("<Name>", row + 1, column + 2);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            int num1 = NameList1(row, column);
            return num1 + width + 10;
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

int NameList1(int row, int column) {
    if (flag) {
        string_out("<NameList1>", row, column);
        int loc = tree_out("<NameList1>", ++row, column);
        int i = readToken();
        if (strcmp(token, ",") == 0) {
            location = location + i + 1;
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (Name(token)) {
                location = location + i + 1;
                string_out("<Name>", ++row, column);
                tree_out("<Name>", row + 1, column);
                string_out(token, row + 2, column);
                column = draw_line(row, width);
                int num1 = NameList1(row, column);
                return num1 + 6 + width * 2 + 11;
            } else {
                flag = false;
                return 0;
            }
        } else {  // otherwise the empty alternative applies
            string_out("empty", ++row, column, loc);
            return 11;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

bool Type(char* words) {
    if (strcmp(words, "int") == 0) return true;
	else return false;
}

bool Name(char* words) {
    if (!isOP(words) && !isKey(words) && !isDE(words[0]) && !isdigit(words[0]) && words[0] != '\'' && words[0] != '\"') {
        if (words[0] == '_' || isalpha(words[0])) return true;
    }
    return false;
}

int STMTS(int row, int column) {
    if (flag) {
        string_out("<STMTS>", row, column);
        int loc = tree_out("<STMTS>", ++row, column);
        int i = readToken();
        if (Name(token) || strcmp(token, "if") == 0 || strcmp(token, "while") == 0) {
            int num1 = STMT(++row, column);
            column = draw_line(row, num1 + width);
            int num2 = STMTS(row, column);
            return num1 + num2 + 7;
        } else {  // otherwise the empty alternative applies
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

int STMT(int row, int column) {
    if (flag) {
        string_out("<STMT>", row, column);
        int loc = tree_out("<STMT>", ++row, column);
        int i = readToken();
        location = location + i + 1;
        if (Name(token)) {  // an identifier: assignment statement
            string_out("<Name>", ++row, column, loc);
            tree_out("<Name>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (strcmp(token, "=") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
                int num1 = Expr(row, column);
                column = draw_line(row, num1 + width);
                i = readToken();
                if (strcmp(token, ";") == 0) {
                    location = location + i + 1;
                    string_out(token, row, column);
                    return num1 + width * 3 + 2 + 6;
                } else {
                    flag = false;
                    return 0;
                }
            } else {
                flag = false;
                return 0;
            }
        } else if (strcmp(token, "if") == 0) {
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (strcmp(token, "(") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
                int num1 = BOOL(row, column);
                column = draw_line(row, num1 + width);
                i = readToken();
                if (strcmp(token, ")") == 0) {
                    location = location + i + 1;
                    string_out(token, row, column);
                    column = draw_line(row, width);
                    int num2 = STMT(row, column);
                    column = draw_line(row, num2 + width);
                    int num3 = STMT1(row, column);
                    return num1 + num2 + num3 + width * 5 + 2 + 6;
                } else {
                    flag = false;
                    return 0;
                }
            } else {
                flag = false;
                return 0;
            }
        } else if (strcmp(token, "while") == 0) {
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (strcmp(token, "(") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
                int num1 = BOOL(row, column);
                column = draw_line(row, num1 + width);
                i = readToken();
                if (strcmp(token, ")") == 0) {
                    location = location + i + 1;
                    string_out(token, row, column);
                    column = draw_line(row, width);
                    int num2 = STMT(row, column);
                    return num1 + num2 + width * 4 + 2 + 6;
                } else {
                    flag = false;
                    return 0;
                }
            } else {
                flag = false;
                return 0;
            }
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

int STMT1(int row, int column) {
    if (flag) {
        string_out("<STMT1>", row, column);
        int loc = tree_out("<STMT1>", ++row, column);
        int i = readToken();
        if (strcmp(token, "else") == 0) {
            location = location + i + 1;
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            int num1 = STMT(row, column);
            return num1 + width + 7;
        } else {  // otherwise the empty alternative applies
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}
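
Because STMT1() looks for else immediately after the inner STMT() returns, an else always attaches to the nearest unmatched if. A toy parser makes this visible. It is a sketch under simplifying assumptions: the condition is the single token c, the only simple statement is s ;, and MiniParser and parse_stmt are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy grammar, much smaller than the lab's:
//   STMT  -> s ; | if ( c ) STMT STMT1
//   STMT1 -> else STMT | empty
struct MiniParser {
    std::vector<std::string> toks;
    std::size_t pos = 0;
    bool ok = true;

    bool accept(const std::string& t) {
        if (pos < toks.size() && toks[pos] == t) { ++pos; return true; }
        return false;
    }
    void expect(const std::string& t) { if (!accept(t)) ok = false; }

    // Returns a bracketed rendering so that the nesting is visible.
    std::string stmt() {
        if (!ok) return "";
        if (accept("if")) {
            expect("("); expect("c"); expect(")");
            std::string then_part = stmt();
            std::string else_part = stmt1();  // tried right away, so it belongs to this if
            return "[if " + then_part + else_part + "]";
        }
        expect("s"); expect(";");
        return "s";
    }
    std::string stmt1() {
        if (!ok) return "";
        if (accept("else")) return " else " + stmt();
        return "";                            // empty alternative
    }
};

// Parses a whole token list; returns "error" if parsing fails or tokens are left over.
std::string parse_stmt(const std::vector<std::string>& toks) {
    MiniParser p;
    p.toks = toks;
    std::string shape = p.stmt();
    return (p.ok && p.pos == toks.size()) ? shape : "error";
}
```

For the tokens if ( c ) if ( c ) s ; else s ; the result is [if [if s else s]]: the else sits inside the inner brackets, so the outer if has no else branch.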

int BOOL(int row, int column) {
    if (flag) {
        string_out("<BOOL>", row, column);
        tree_out("<BOOL>", ++row, column);
        int num1 = Expr(++row, column);
        column = draw_line(row, num1 + width);
        int i = readToken();
        if (RelOp(token)) {  // a relational operator
            location = location + i + 1;
            string_out("<RelOp>", row, column);
            int loc = tree_out("<RelOp>", row + 1, column);
            string_out(token, row + 2, loc);
            column = draw_line(row, width);
            int num2 = Expr(row, column);
            return num1 + num2 + width * 2 + 6;
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

bool RelOp(char* words) {
    if (strcmp(words, "<") == 0 || strcmp(words, "<=") == 0 || strcmp(words, ">") == 0 || strcmp(words, ">=") == 0 || strcmp(words, "==") == 0 || strcmp(words, "!=") == 0) {
        return true;
    }return false;
}

int Expr(int row, int column) {
    if (flag) {
        string_out("<Expr>", row, column);
        tree_out("<Expr>", ++row, column);
        int num1 = Term(++row, column);
        column = draw_line(row, num1 + width);
        int num2 = Expr1(row, column);
        return num1 + num2 + width + 6;
    }
    return 0;  // flag is already false: an earlier error was found
}

int Expr1(int row, int column) {
    if (flag) {
        string_out("<Expr1>", row, column);
        int loc = tree_out("<Expr1>", ++row, column);
        int i = readToken();
        if (AddOp(token)) {  // the token is + or -
            location = location + i + 1;
            string_out("<AddOp>", ++row, column, loc);
            loc = tree_out("<AddOp>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            int num1 = Term(row, column);
            column = draw_line(row, num1 + width);
            int num2 = Expr1(row, column);
            return num1 + num2 + width * 2 + 7;
        } else {  // otherwise the empty alternative applies
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

int Term(int row, int column) {
    if (flag) {
        string_out("<Term>", row, column);
        tree_out("<Term>", ++row, column);
        int num1 = Factor(++row, column);
        column = draw_line(row, num1 + width);
        int num2 = Term1(row, column);
        return num1 + num2 + width + 6;
    }
    return 0;  // flag is already false: an earlier error was found
}

int Term1(int row, int column) {
    if (flag) {
        string_out("<Term1>", row, column);
        int loc = tree_out("<Term1>", ++row, column);
        int i = readToken();
        if (MulOp(token)) {  // the token is * or /
            location = location + i + 1;
            string_out("<MulOp>", ++row, column, loc);
            loc = tree_out("<MulOp>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            int num1 = Factor(row, column);
            column = draw_line(row, num1 + width);
            int num2 = Term1(row, column);
            return num1 + num2 + width * 2 + 7;
        } else {  // otherwise the empty alternative applies
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

int Factor(int row, int column) {
    if (flag) {
        string_out("<Factor>", row, column);
        int loc = tree_out("<Factor>", ++row, column);
        int i = readToken();
        location = location + i + 1;
        if (Name(token)) {
            string_out(token, ++row, column, loc);
            return 8;
        } else if (isdigit(token[0])) {
            string_out(token, ++row, column, loc);
            return 8;
        } else if (strcmp(token, "(") == 0) {
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            int num1 = Expr(row, column);
            i = readToken();
            if (strcmp(token, ")") == 0) {
                location = location + i + 1;
                column = draw_line(row, num1 + width);
                string_out(token, row, column);
                return num1 + width * 2 + 8;
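            } else {  // the closing ) is missing, so the expression is malformed
                flag = false;
                return 0;
            }
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}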

bool AddOp(char* words) {
    if (strcmp(words, "+") == 0 || strcmp(words, "-") == 0)return true;
    return false;
}

bool MulOp(char* words) {
    if (strcmp(words, "*") == 0 || strcmp(words, "/") == 0)return true;
    return false;
}
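
The functions above interleave parsing with tree drawing, which hides how simple the control flow is. Stripped of all drawing, the expression part reduces to roughly the following recognizer. This is a sketch, not the program's code: ExprRecognizer and accepts_expr are hypothetical names, tokens arrive as a vector, and any token starting with a letter, digit, or underscore is treated as an operand.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Recognizer for:
//   Expr -> Term Expr1        Expr1 -> (+|-) Term Expr1 | empty
//   Term -> Factor Term1      Term1 -> (*|/) Factor Term1 | empty
//   Factor -> id | number | ( Expr )
struct ExprRecognizer {
    std::vector<std::string> toks;
    std::size_t pos = 0;
    bool ok = true;   // plays the role of the global flag

    bool accept(const std::string& t) {
        if (pos < toks.size() && toks[pos] == t) { ++pos; return true; }
        return false;
    }

    void expr()  { term(); expr1(); }
    void expr1() { if (ok && (accept("+") || accept("-"))) { term(); expr1(); } }
    void term()  { factor(); term1(); }
    void term1() { if (ok && (accept("*") || accept("/"))) { factor(); term1(); } }
    void factor() {
        if (!ok) return;
        if (accept("(")) {
            expr();
            if (ok && !accept(")")) ok = false;   // missing closing parenthesis
        } else if (pos < toks.size() && !toks[pos].empty() &&
                   (std::isalnum(static_cast<unsigned char>(toks[pos][0])) ||
                    toks[pos][0] == '_')) {
            ++pos;                                // id or number
        } else {
            ok = false;                           // no alternative of Factor matches
        }
    }
};

// True if the whole token list is one well-formed expression.
bool accepts_expr(const std::vector<std::string>& toks) {
    ExprRecognizer r;
    r.toks = toks;
    r.expr();
    assert(r.pos <= toks.size());
    return r.ok && r.pos == toks.size();
}
```

Each nonterminal is one function, each alternative is one branch, and the empty alternatives of Expr1 and Term1 are simply the case where nothing is consumed. The final check that every token was used corresponds to the test for # in main().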

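Everything is started from main(). The input first goes through lexical preprocessing; if that fails, the program reports that preprocessing did not pass. Otherwise the lines are merged and syntax analysis runs. If an error was flagged, or the parser did not stop exactly at #, the program reports that syntax analysis did not pass; if the analysis succeeds, the syntax tree is written to output.txt: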

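int main() {
	int k = 0;
	cout << "Enter a code block (a line containing only # ends the input):" << endl;
	cin.getline(s[0], 100);
	while (k < 99 && cin && strcmp(s[k], "#") != 0) {  // k < 99 keeps s[++k] inside the array
		cin.getline(s[++k], 100);
	}
    if (scanner(k)) {  // lexical preprocessing first
        bindString(k);  // merge the input lines into str
        cout << str << endl;  
        Block(0, 0);  // start syntax analysis
        // success requires no flagged error and the cursor resting on the final #
        if (flag && location < (int)str.size() && str[location] == '#') {
            cout << "Correct!" << endl;
            cout << "Writing the syntax tree!" << endl;
            ofstream fout("output.txt");
            printTree(fout);
            fout.close();
            cout << "Done! See output.txt!" << endl;
        } else {
            cout << "Error! Syntax analysis failed!" << endl;
        }
    } else {
        cout << "Error! Preprocessing failed!" << endl;
    }
	return 0;
}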

Testing

1. Input:

{
	int x;
	x = 2;
}
#

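The result is as follows:

[Figure: screenshots of the run result]

Windows Notepad displays characters and spaces at different widths, which distorts the tree, so Sublime is used to view the output here.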

2. Input:

{
	int x;
	x = 1;
	if(x<2)x=2;
	while(x>1)x=x-1;
}
#

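The result is as follows:

[Figure: screenshots of the run result]

Because this program fragment is longer, every row of the tree is wider than what Sublime shows on one line (about 160 columns in my window), so the rows wrap. Row 3 in the figure above, for example, wraps onto 4 screen lines, which means the tree is roughly 160*4=640 columns wide. Judging from the shorter test case, the parse succeeds and the tree is most likely drawn correctly; it is only the excessive width that makes the display hard to read.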

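Summary

The given grammar is clearly left-recursive and we are using recursive descent, so the left recursion must be removed before any code is written. I therefore started by sorting out the whole grammar and eliminating all left recursion with the method from the lectures. With that settled, the implementation follows the same approach as Lab 5, and the rest is largely repetitive work.

The issue with the test output is the same one noted in Lab 5: because of the way I lay out the tree, it becomes extremely wide, so at first glance the output does not look like a tree.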
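
One way around the width problem, offered only as a sketch and not as what the program above does, is to print the tree as an indented outline, one node per line, so that depth becomes indentation instead of horizontal spread. The program never builds an explicit tree object, so Node, print_indented, and render are hypothetical names, and the code assumes C++17 (a vector of the enclosing struct type as a member).

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical explicit tree node.
struct Node {
    std::string label;
    std::vector<Node> kids;
};

// Print one node per line, indenting two spaces per tree level.
void print_indented(const Node& n, std::ostream& out, int depth = 0) {
    assert(depth >= 0);
    out << std::string(static_cast<std::size_t>(depth) * 2, ' ') << n.label << '\n';
    for (const Node& k : n.kids) print_indented(k, out, depth + 1);
}

// Convenience wrapper that returns the outline as a string.
std::string render(const Node& root) {
    std::ostringstream out;
    print_indented(root, out);
    return out.str();
}
```

With this layout the output width grows with the depth of the tree rather than with the number of leaves, at the cost of no longer looking like a drawn tree.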

Complete code

#include<iostream>
#include<fstream>
#include<string>
#include<cstring>  // strcmp, strlen, strchr
#include<cctype>
using namespace std;

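char s[100][100] = { "\0" };  // raw input lines
string str;  // all input lines merged into one string
int location = 0;  // index in str of the current token
bool flag = true;  // becomes false once a syntax error has been found
string tree_map[100];  // rows of the syntax-tree drawing
const int width = 3;  // length of the connector between sibling nodes
char token[100] = { "\0" };  // the current token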

bool isKey(char* s);
bool isOP(char* s);
bool isDE(char& s);
void pre_process(char* buff, int& in_comment);
bool scanner(int k);

char keywords[34][20] = {  // keywords, 34 in total including main
	"auto", "short", "int", "long", "float", "double", "char", "struct",
	"union", "enum", "typedef", "const", "unsigned", "signed", "extern",
	"register", "static", "volatile", "void", "if", "else", "switch",
	"case", "for", "do", "while", "goto", "continue", "break", "default",
	"sizeof", "return", "main", "include"
};
char operators[38][10] = {  // operators, 38 in total
	"+", "-", "*", "/", "%", "++", "--", "==", "!=", ">", ">=", "<", "<=",
	"&&", "||", "!", "=", "+=", "-=", "*=", "/=", "%=", "<<=", ">>=", "&=",
	"^=", "|=", "&", "|", "^", "~", "<<", ">>", "?", ":", ",", ".", "->"
};
char delimiters[] = "()[]{};";  // 7 delimiters, stored as a string so that strchr finds a terminating '\0'

bool isKey(char* s) {  // returns true if the string is a keyword, false otherwise
	for (int i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++) {
		if (strcmp(s, keywords[i]) == 0) return true;
	}return false;
}

bool isOP(char* s) {  // returns true if the string is an operator, false otherwise
	for (int i = 0; i < sizeof(operators) / sizeof(operators[0]); i++) {
		if (strcmp(s, operators[i]) == 0) return true;
	}return false;
}

bool isDE(char& s) {  // returns true if the character is a delimiter, false otherwise
	if (s != '\0' && strchr(delimiters, s) != NULL) return true;  // strchr would also match the terminator
	return false;
}

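void pre_process(char* buff, int& in_comment) {  // preprocessing
    char data[100] = { '\0' };  // holds the line with comments removed
    char old_c = '\0';  // previous character
    char cur_c;  // current character
    int i = 0;  // index into buff
    int j = 0;  // index into data
    while (i < strlen(buff)) {  // strip comments
        cur_c = buff[i++];  // fetch the next character
        switch (in_comment) {
        case 0:
            if (cur_c == '\"') {  // entering a double-quoted string
                data[j++] = cur_c;
                in_comment = 3;
            } else if (cur_c == '\'') {  // entering a single-quoted literal
                data[j++] = cur_c;
                in_comment = 4;
            } else if (old_c == '/' && cur_c == '*') {  // entering a multi-line comment
                j--;
                in_comment = 1;
            } else if (old_c == '/' && cur_c == '/') {  // entering a single-line comment
                j--;
                in_comment = 2;
            } else {  // anything else is copied to data
                data[j++] = cur_c;
            }
            break;
        case 1:if (old_c == '*' && cur_c == '/') {  // end of the multi-line comment
                in_comment = 0;
                cur_c = '\0';  // so this '/' cannot pair with the next character to open another comment
            }
            break;
        case 2:if (i == strlen(buff)) in_comment = 0;  // a single-line comment ends with the line
            break;
        case 3:
            data[j++] = cur_c;
            if (cur_c == '\"') in_comment = 0;
            break;
        case 4:
            data[j++] = cur_c;
            if (cur_c == '\'') in_comment = 0;
            break;
        }
        old_c = cur_c;  // remember the previous character
    }
    if (in_comment == 2) in_comment = 0;  // a // at the very end of a line must not leak into the next line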

    i = 0;
    int k = 0;
    while (k < j) {  // separate the tokens
        if (isalpha(data[k]) || data[k] == '_') {  // a letter or _ starts an identifier or keyword
            while (k < j && !isDE(data[k]) && strchr("+-*/%=^~&|!><?:,.", data[k]) == NULL && !isspace(data[k])) {
                buff[i++] = data[k++];
            }buff[i++] = ' ';
        } else if (isdigit(data[k])) {  // a digit starts a number
            while (isdigit(data[k])) {
                buff[i++] = data[k++];
            }buff[i++] = ' ';
        } else if (isspace(data[k])) {
            while (isspace(data[k])) {  // skip whitespace
                k++;
            }
        } else if (isDE(data[k])) {  // a delimiter
            buff[i++] = data[k++];
            buff[i++] = ' ';
        } else if (data[k] == '\"') {  // a double-quoted string
            buff[i++] = data[k++];
            while (k < j && data[k] != '\"')  buff[i++] = data[k++];  // k < j guards against an unterminated string
            if (k < j) buff[i++] = data[k++];
            buff[i++] = ' ';
        } else if (data[k] == '\'') {  // a single-quoted literal
            buff[i++] = data[k++];
            while (k < j && data[k] != '\'')  buff[i++] = data[k++];  // k < j guards against an unterminated literal
            if (k < j) buff[i++] = data[k++];
            buff[i++] = ' ';
        } else if (strchr("+-*/%=^~&|!><?:,.", data[k]) != NULL) {  // an operator: look at the next character and take the longest operator
            switch (data[k]) {
            case '+':buff[i++] = data[k++];
                if (data[k] == '+' || data[k] == '=') buff[i++] = data[k++];  // ++ or +=
                break;
            case '-':buff[i++] = data[k++];
                if (data[k] == '-' || data[k] == '=' || data[k] == '>') buff[i++] = data[k++];  // --, -= or ->
                break;
            case '*':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // *=
                break;
            case '/':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // /=
                break;
            case '%':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // %=
                break;
            case '=':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // ==
                break;
            case '^':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // ^=
                break;
            case '&':buff[i++] = data[k++];
                if (data[k] == '&' || data[k] == '=') buff[i++] = data[k++];  // && or &=
                break;
            case '|':buff[i++] = data[k++];
                if (data[k] == '|' || data[k] == '=') buff[i++] = data[k++];  // || or |=
                break;
            case '!':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // !=
                break;
            case '>':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // >=
                else if (data[k] == '>') {
                    buff[i++] = data[k++];  // >>
                    if (data[k] == '=') buff[i++] = data[k++];  // >>=
                }break;
            case '<':buff[i++] = data[k++];
                if (data[k] == '=') buff[i++] = data[k++];  // <=
                else if (data[k] == '<') {
                    buff[i++] = data[k++];  // <<
                    if (data[k] == '=') buff[i++] = data[k++];  // <<= (the third character must be '=', not '<')
                }break;
            default:buff[i++] = data[k++];
            }buff[i++] = ' ';
        } else {  // any other character (for example #): copy it as its own token so the loop always advances
            buff[i++] = data[k++];
            buff[i++] = ' ';
        }
    }
    buff[i] = '\0';  // the processed line ends with one trailing space
}

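bool scanner(int k) {  // lexical preprocessing
    int in_comment = 0;  // 0: normal, 1: inside a multi-line comment, 2: inside a single-line comment, 3: inside double quotes, 4: inside single quotes
    for(int i = 0; i < k; i++) {  // s[k] is the terminating "#" line and is left untouched
        pre_process(s[i], in_comment);  // strip comments and put one space between tokens and operators
    }
    if (in_comment != 0) return false;  // a non-zero state means a comment or quote was never closed
    else return true;
}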

int draw_line(int row, int num);
void string_out(string s, int row, int column, int loc);
int tree_out(string s, int row, int loc);
void printTree(ofstream& fout);
int readToken();
void bindString(int k);
int Block(int row, int column);
int Decls(int row, int column);
int NameList(int row, int column);
int NameList1(int row, int column);
bool Type(char* words);
bool Name(char* words);
int STMTS(int row, int column);
int STMT(int row, int column);
int STMT1(int row, int column);
int BOOL(int row, int column);
bool RelOp(char* words);
int Expr(int row, int column);
int Expr1(int row, int column);
int Term(int row, int column);
int Term1(int row, int column);
int Factor(int row, int column);
bool AddOp(char* words);
bool MulOp(char* words);

int draw_line(int row, int num) {  // draws a horizontal line separating sibling nodes and returns the position where the next node starts
    tree_map[row].append(num, '-');
    return tree_map[row].size();
}

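/** Writes a string into the drawing.
* column is the start position in this row; loc is the position of the vertical bar in the row above.
* loc defaults to 0, meaning there is no bar, and the string is placed at column.
* If loc is non-zero, the string is centered under the bar where possible.
*/
void string_out(string s, int row, int column, int loc = 0) {  
    if (loc == 0) {
        if (tree_map[row].size() < column) {  // the row is shorter than column: pad with spaces
            int n = column - tree_map[row].size();
            tree_map[row].append(n, ' ');
        }
        tree_map[row].append(s);
    } else {
        int n1 = s.size() / 2;
        if (loc - n1 <= column) {  // centering would start left of column, so place the string at column instead
            if (tree_map[row].size() < column) {  // the row is shorter than column: pad with spaces
                int n = column - tree_map[row].size();
                tree_map[row].append(n, ' ');
            }
            tree_map[row].append(s);
        } else {  // center the string under the bar
            int n = loc - n1 - tree_map[row].size();
            if (n > 0) tree_map[row].append(n, ' ');  // a negative count would be converted to a huge size_t
            tree_map[row].append(s);
        }
    }
}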

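/** Draws the vertical bar between a parent and its child. s is the parent's label and column is the parent's start position.
* The return value is the bar's position, used to place the child (for example an operator).
*/
int tree_out(string s, int row, int column) {
    int n1 = s.size() / 2;
    int n2 = column + n1 - tree_map[row].size();
    if (n2 > 0) tree_map[row].append(n2, ' ');  // skip the padding if the row already reaches past the bar
    tree_map[row] += '|';
    return n1 + column;
}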

void printTree(ofstream& fout) {
	for (int i = 0; i < 100; i++) {
		if (!tree_map[i].empty()) {
			fout << tree_map[i] << endl;
		} else break;
	}
}

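int readToken() {  // copies the token at location into token and returns its length so the caller can advance
    int i = 0;
    // stop at the separating space, at the end of str, or when token is full
    for (; location + i < (int)str.size() && str[location + i] != ' ' && i < 99; i++) {
        token[i] = str[location + i];
    }
    token[i] = '\0';
    return i;
}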

void bindString(int k) {  // appends s[0..k] to str, including the final "#" line
	for (int i = 0; i <= k; i++) {
		str.append(s[i]);
	}
}

int Block(int row, int column) {
	if (flag) {
        string_out("<Block>", row, column);
        int loc = tree_out("<Block>", ++row, column);
        int i = readToken();
        if (strcmp(token, "{") == 0) {
            location = location + i + 1;
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
			int num1 = Decls(row, column);
            column = draw_line(row, num1 + width);
			int num2 = STMTS(row, column);
            column = draw_line(row, num2 + width);
            i = readToken();
            if (strcmp(token, "}") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                return num1 + num2 + width * 3 + 1 + 7;
			} else {
                flag = false;
                return 0;
			}
		} else {
			flag = false;
			return 0;
		}
	}
	return 0;  // flag is already false: an earlier error was found
}

int Decls(int row, int column) {
	if (flag) {
        string_out("<Decls>", row, column);
        int loc = tree_out("<Decls>", ++row, column);
        int i = readToken();
		if (Type(token)) {
            location = location + i + 1;
            string_out("<Type>", ++row, column, loc);
            loc = tree_out("<Type>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
			int num1 = NameList(row, column);
            column = draw_line(row, num1 + width);
            i = readToken();
            if (strcmp(token, ";") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
				int num2 = Decls(row, column);
                return num1 + num2 + width * 3 + 1 + 7;
			} else {
				flag = false;
				return 0;
			}
		} else {  // otherwise the empty alternative applies
            string_out("empty", ++row, column, loc);
			return 7;
		}
	}
	return 0;  // flag is already false: an earlier error was found
}

int NameList(int row, int column) {
    if (flag) {
        string_out("<NameList>", row, column);
        int loc = tree_out("<NameList>", ++row, column);
        int i = readToken();
        if (Name(token)) {
            location = location + i + 1;
            string_out("<Name>", ++row, column, loc);
            loc = tree_out("<Name>", row + 1, column + 2);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            int num1 = NameList1(row, column);
            return num1 + width + 10;
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag is already false: an earlier error was found
}

int NameList1(int row, int column) {
    if (flag) {
        string_out("<NameList1>", row, column);
        int loc = tree_out("<NameList1>", ++row, column);
        int i = readToken();
        if (strcmp(token, ",") == 0) {
            location = location + i + 1;
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (Name(token)) {
                location = location + i + 1;
                string_out("<Name>", ++row, column);
                tree_out("<Name>", row + 1, column);
                string_out(token, row + 2, column);
                column = draw_line(row, width);
                int num1 = NameList1(row, column);
                return num1 + 6 + width * 2 + 11;
            } else {
                flag = false;
                return 0;
            }
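        } else {  // otherwise the production derives empty
            string_out("empty", ++row, column, loc);
            return 11;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}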

bool Type(char* words) {
    return strcmp(words, "int") == 0;
}

bool Name(char* words) {
    if (!isOP(words) && !isKey(words) && !isDE(words[0]) && !isdigit(words[0]) && words[0] != '\'' && words[0] != '\"') {
        if (words[0] == '_' || isalpha(words[0])) return true;
    }
    return false;
}

int STMTS(int row, int column) {
    if (flag) {
        string_out("<STMTS>", row, column);
        int loc = tree_out("<STMTS>", ++row, column);
        int i = readToken();
        if (Name(token) || strcmp(token, "if") == 0 || strcmp(token, "while") == 0) {
            int num1 = STMT(++row, column);
            column = draw_line(row, num1 + width);
            int num2 = STMTS(row, column);
            return num1 + num2 + 7;
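        } else {  // otherwise the production derives empty
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}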

int STMT(int row, int column) {
    if (flag) {
        string_out("<STMT>", row, column);
        int loc = tree_out("<STMT>", ++row, column);
        int i = readToken();
        location = location + i + 1;
        if (Name(token)) {  // identifier: this is an assignment statement
            string_out("<Name>", ++row, column, loc);
            tree_out("<Name>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (strcmp(token, "=") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
                int num1 = Expr(row, column);
                column = draw_line(row, num1 + width);
                i = readToken();
                if (strcmp(token, ";") == 0) {
                    location = location + i + 1;
                    string_out(token, row, column);
                    return num1 + width * 3 + 2 + 6;
                } else {
                    flag = false;
                    return 0;
                }
            } else {
                flag = false;
                return 0;
            }
        } else if (strcmp(token, "if") == 0) {
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (strcmp(token, "(") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
                int num1 = BOOL(row, column);
                column = draw_line(row, num1 + width);
                i = readToken();
                if (strcmp(token, ")") == 0) {
                    location = location + i + 1;
                    string_out(token, row, column);
                    column = draw_line(row, width);
                    int num2 = STMT(row, column);
                    column = draw_line(row, num2 + width);
                    int num3 = STMT1(row, column);
                    return num1 + num2 + num3 + width * 5 + 2 + 6;
                } else {
                    flag = false;
                    return 0;
                }
            } else {
                flag = false;
                return 0;
            }
        } else if (strcmp(token, "while") == 0) {
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            i = readToken();
            if (strcmp(token, "(") == 0) {
                location = location + i + 1;
                string_out(token, row, column);
                column = draw_line(row, width);
                int num1 = BOOL(row, column);
                column = draw_line(row, num1 + width);
                i = readToken();
                if (strcmp(token, ")") == 0) {
                    location = location + i + 1;
                    string_out(token, row, column);
                    column = draw_line(row, width);
                    int num2 = STMT(row, column);
                    return num1 + num2 + width * 4 + 2 + 6;
                } else {
                    flag = false;
                    return 0;
                }
            } else {
                flag = false;
                return 0;
            }
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}

int STMT1(int row, int column) {
    if (flag) {
        string_out("<STMT1>", row, column);
        int loc = tree_out("<STMT1>", ++row, column);
        int i = readToken();
        if (strcmp(token, "else") == 0) {
            location = location + i + 1;
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            int num1 = STMT(row, column);
            return num1 + width + 7;
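        } else {  // otherwise the production derives empty
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}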

int BOOL(int row, int column) {
    if (flag) {
        string_out("<BOOL>", row, column);
        tree_out("<BOOL>", ++row, column);
        int num1 = Expr(++row, column);
        column = draw_line(row, num1 + width);
        int i = readToken();
        if (RelOp(token)) {  // token is a relational operator
            location = location + i + 1;
            string_out("<RelOp>", row, column);
            int loc = tree_out("<RelOp>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            int num2 = Expr(row, column);
            return num1 + num2 + width * 2 + 6;
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}

bool RelOp(char* words) {
    return strcmp(words, "<") == 0 || strcmp(words, "<=") == 0 ||
           strcmp(words, ">") == 0 || strcmp(words, ">=") == 0 ||
           strcmp(words, "==") == 0 || strcmp(words, "!=") == 0;
}

int Expr(int row, int column) {
    if (flag) {
        string_out("<Expr>", row, column);
        tree_out("<Expr>", ++row, column);
        int num1 = Term(++row, column);
        column = draw_line(row, num1 + width);
        int num2 = Expr1(row, column);
        return num1 + num2 + width + 6;
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}

int Expr1(int row, int column) {
    if (flag) {
        string_out("<Expr1>", row, column);
        int loc = tree_out("<Expr1>", ++row, column);
        int i = readToken();
        if (AddOp(token)) {  // token is + or -
            location = location + i + 1;
            string_out("<AddOp>", ++row, column, loc);
            loc = tree_out("<AddOp>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            int num1 = Term(row, column);
            column = draw_line(row, num1 + width);
            int num2 = Expr1(row, column);
            return num1 + num2 + width * 2 + 7;
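        } else {  // otherwise the production derives empty
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}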

int Term(int row, int column) {
    if (flag) {
        string_out("<Term>", row, column);
        tree_out("<Term>", ++row, column);
        int num1 = Factor(++row, column);
        column = draw_line(row, num1 + width);
        int num2 = Term1(row, column);
        return num1 + num2 + width + 6;
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}

int Term1(int row, int column) {
    if (flag) {
        string_out("<Term1>", row, column);
        int loc = tree_out("<Term1>", ++row, column);
        int i = readToken();
        if (MulOp(token)) {  // token is * or /
            location = location + i + 1;
            string_out("<MulOp>", ++row, column, loc);
            loc = tree_out("<MulOp>", row + 1, column);
            string_out(token, row + 2, column, loc);
            column = draw_line(row, width);
            int num1 = Factor(row, column);
            column = draw_line(row, num1 + width);
            int num2 = Term1(row, column);
            return num1 + num2 + width * 2 + 7;
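        } else {  // otherwise the production derives empty
            string_out("empty", ++row, column, loc);
            return 7;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}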

int Factor(int row, int column) {
    if (flag) {
        string_out("<Factor>", row, column);
        int loc = tree_out("<Factor>", ++row, column);
        int i = readToken();
        location = location + i + 1;
        if (Name(token)) {
            string_out(token, ++row, column, loc);
            return 8;
        } else if (isdigit(token[0])) {
            string_out(token, ++row, column, loc);
            return 8;
        } else if (strcmp(token, "(") == 0) {
            string_out(token, ++row, column, loc);
            column = draw_line(row, width);
            int num1 = Expr(row, column);
            i = readToken();
            if (strcmp(token, ")") == 0) {
                location = location + i + 1;
                column = draw_line(row, num1 + width);
                string_out(token, row, column);
                return num1 + width * 2 + 8;
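            } else {  // no matching ")" was found, so the arithmetic expression is malformed
                flag = false;
                return 0;
            }
        } else {
            flag = false;
            return 0;
        }
    }
    return 0;  // flag was already false: an earlier error aborts this subtree
}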

bool AddOp(char* words) {
    return strcmp(words, "+") == 0 || strcmp(words, "-") == 0;
}

bool MulOp(char* words) {
    return strcmp(words, "*") == 0 || strcmp(words, "/") == 0;
}

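int main() {
    int k = 0;
    cout << "Enter a code block (end with #):" << endl;
    cin.getline(s[0], 100);
    // Stop at row 99 so that s[++k] never writes past the 100-row buffer.
    while (k < 99 && strcmp(s[k], "#") != 0) {
        cin.getline(s[++k], 100);
    }
    if (scanner(k)) {  // run lexical analysis first
        bindString(k);  // merge the multi-line input into the string str
        cout << str << endl;
        Block(0, 0);  // start syntax analysis
        // Accept only if no routine reported an error and all input up to '#' was consumed.
        if (flag && str[location] == '#') {
            cout << "Correct!" << endl;
            cout << "Writing the syntax tree next." << endl;
            ofstream fout("output.txt");
            printTree(fout);
            fout.close();
            cout << "Done. See output.txt for the syntax tree." << endl;
        } else {
            cout << "Error! Syntax analysis failed." << endl;
        }
    } else {
        cout << "Error! Preprocessing failed." << endl;
    }
    return 0;
}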
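
The expression routines in the listing (Expr, Expr1, Term, Term1, Factor) mix parsing with tree drawing, which can make the underlying recursive-descent pattern hard to see. The following is a minimal sketch of the same expression sub-grammar as a pure recognizer, with one function per nonterminal. It is not the author's code: `ExprRecognizer`, `peek`, and `accepts` are hypothetical names introduced here, and it scans characters directly instead of using `readToken`, so its tokenization may differ from the original scanner.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical recognizer (illustration only) for the sub-grammar
//   Expr   -> Term Expr1
//   Expr1  -> AddOp Term Expr1 | empty
//   Term   -> Factor Term1
//   Term1  -> MulOp Factor Term1 | empty
//   Factor -> id | number | ( Expr )
struct ExprRecognizer {
    std::string text;
    std::size_t pos = 0;

    explicit ExprRecognizer(std::string s) : text(std::move(s)) {}

    static bool isSpace(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
    static bool isAlpha(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0; }
    static bool isDigit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }

    // Skip blanks, then return the next character without consuming it
    // ('\0' once the input is exhausted).
    char peek() {
        while (pos < text.size() && isSpace(text[pos])) ++pos;
        return pos < text.size() ? text[pos] : '\0';
    }

    bool factor() {
        char c = peek();
        if (isAlpha(c) || c == '_') {  // id
            while (pos < text.size() &&
                   (isAlpha(text[pos]) || isDigit(text[pos]) || text[pos] == '_')) ++pos;
            return true;
        }
        if (isDigit(c)) {  // number
            while (pos < text.size() && isDigit(text[pos])) ++pos;
            return true;
        }
        if (c == '(') {  // ( Expr )
            ++pos;
            if (!expr()) return false;
            if (peek() != ')') return false;
            ++pos;
            return true;
        }
        return false;  // no alternative of Factor matches
    }

    bool term1() {
        char c = peek();
        if (c == '*' || c == '/') { ++pos; return factor() && term1(); }
        return true;  // empty alternative
    }

    bool term() { return factor() && term1(); }

    bool expr1() {
        char c = peek();
        if (c == '+' || c == '-') { ++pos; return term() && expr1(); }
        return true;  // empty alternative
    }

    bool expr() { return term() && expr1(); }
};

// True only if the whole string is a single well-formed expression.
bool accepts(const std::string& s) {
    ExprRecognizer r(s);
    return r.expr() && r.peek() == '\0';
}
```

Calling `accepts("a + b * (c - 2)")` succeeds, while `accepts("a + * b")` and `accepts("(a + b")` fail. Returning `bool` from each routine plays the role that the global `flag` plays in the listing, at the cost of not recording any tree layout.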