Writing a lexical analyzer (for the C language):
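The requirements are: 1. Preprocess the raw source by deleting comments. (To keep the output easy to read, newlines and tabs are left in place here, although they would normally be removed as well.)
2. Split out the lexically valid tokens, which fall into five classes: identifiers, keywords, constants, delimiters, and operators. In the output, a space is inserted between adjacent tokens of the original source.
3. Report lexically invalid tokens. Specifically, malformed floating-point numbers such as .11 and 0.23.34 are flagged as errors.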
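Requirement 3 can be sketched as a tiny state machine over digits and dots. This is only an illustration under my own assumptions, not the code from this post: `classifyNumber` is a hypothetical name, and the rule "at least one digit on each side of a single dot" follows the examples above (standard C itself accepts `.11`, so this is stricter than a real compiler).

```javascript
// Hypothetical helper (not part of the original source):
// classifies a lexeme consisting of digits and dots.
function classifyNumber(lexeme) {
  let state = "start"; // start | int | dot | frac | error
  for (const ch of lexeme) {
    const isDigit = ch >= "0" && ch <= "9";
    switch (state) {
      case "start": // a number must begin with a digit
        state = isDigit ? "int" : "error";
        break;
      case "int": // integer part: more digits, or one dot
        state = isDigit ? "int" : ch === "." ? "dot" : "error";
        break;
      case "dot": // a digit must follow the dot
        state = isDigit ? "frac" : "error";
        break;
      case "frac": // fractional part: digits only, a second dot is an error
        state = isDigit ? "frac" : "error";
        break;
    }
    if (state === "error") break;
  }
  if (state === "int") return "integer";
  if (state === "frac") return "float";
  return "invalid";
}
```

With this rule, `classifyNumber("0.23")` is accepted as a float, while `.11` and `0.23.34` both end in the error state.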
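The programming environment is WebStorm and Node.js.
For the overall approach, see my previous blog post; this version makes some improvements on that basis.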
This is the DFA for the preprocessing part:
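In code, a comment-stripping DFA of this kind might look roughly like the sketch below. The state names and the function name `stripComments` are my own assumptions, not taken from the diagram or the original source. It keeps newlines that end `//` comments, leaves string literals untouched, and does not handle character literals.

```javascript
// Hypothetical sketch of a comment-stripping DFA (not the post's original code).
function stripComments(src) {
  let out = "";
  // States: code | slash | line | block | blockStar | str | strEsc
  let state = "code";
  for (const ch of src) {
    switch (state) {
      case "code":
        if (ch === "/") {
          state = "slash"; // might start a comment; hold the slash back
        } else {
          out += ch;
          if (ch === '"') state = "str";
        }
        break;
      case "slash":
        if (ch === "/") state = "line";
        else if (ch === "*") state = "block";
        else {
          out += "/" + ch; // it was an ordinary division sign
          state = ch === '"' ? "str" : "code";
        }
        break;
      case "line": // inside a // comment: skip until end of line
        if (ch === "\n") {
          out += ch;
          state = "code";
        }
        break;
      case "block": // inside a /* comment
        if (ch === "*") state = "blockStar";
        break;
      case "blockStar": // saw '*' inside a block comment
        if (ch === "/") state = "code";
        else if (ch !== "*") state = "block";
        break;
      case "str": // inside a string literal: copy everything
        out += ch;
        if (ch === "\\") state = "strEsc";
        else if (ch === '"') state = "code";
        break;
      case "strEsc": // character after a backslash in a string
        out += ch;
        state = "str";
        break;
    }
  }
  if (state === "slash") out += "/"; // flush a trailing slash
  return out;
}
```

Holding the `/` back in the `slash` state is what lets the machine decide between a division sign and the start of a comment with only one character of lookahead.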
This is the DFA for the processing part:
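To give a feel for what this DFA does, here is a reduced sketch of the token loop. It is an illustration under my own assumptions rather than the source listed in this post: the tables `KEYWORDS`, `OPERATORS`, and `DELIMITERS` are shortened, hypothetical versions of the real lists, the function name `tokenize` is made up, and it uses regular expressions for brevity.

```javascript
// Hypothetical, reduced tables; the full lists in the post are longer.
const KEYWORDS = new Set(["int", "if", "else", "return", "while", "for"]);
// Two-character operators come first so that the longest match wins.
const OPERATORS = ["<=", ">=", "==", "!=", "&&", "||", "<<", ">>",
  "+", "-", "*", "/", "<", ">", "=", "!", "&", "|", "%", "."];
const DELIMITERS = [";", ",", "(", ")", "[", "]", "{", "}"];

const isLetter = (c) => /^[A-Za-z_]$/.test(c);
const isDigit = (c) => /^[0-9]$/.test(c);

function tokenize(src) {
  const tokens = [];
  let i = 0;
  while (i < src.length) {
    const c = src[i];
    if (/\s/.test(c)) { i++; continue; } // skip whitespace
    const dotBeforeDigit =
      c === "." && i + 1 < src.length && isDigit(src[i + 1]);
    if (isLetter(c)) {
      // identifier or keyword: letter followed by letters/digits
      let j = i;
      while (j < src.length && (isLetter(src[j]) || isDigit(src[j]))) j++;
      const text = src.slice(i, j);
      tokens.push({ type: KEYWORDS.has(text) ? "keyword" : "identifier", text });
      i = j;
    } else if (isDigit(c) || dotBeforeDigit) {
      // number: grab the whole run of digits and dots, then validate it
      let j = i;
      while (j < src.length && (isDigit(src[j]) || src[j] === ".")) j++;
      const text = src.slice(i, j);
      const ok = /^[0-9]+(\.[0-9]+)?$/.test(text);
      tokens.push({ type: ok ? "constant" : "error", text });
      i = j;
    } else {
      const op = OPERATORS.find((s) => src.startsWith(s, i));
      if (op) {
        tokens.push({ type: "operator", text: op });
        i += op.length;
      } else {
        const type = DELIMITERS.includes(c) ? "delimiter" : "error";
        tokens.push({ type, text: c });
        i++;
      }
    }
  }
  return tokens;
}
```

Joining the token texts with `tokens.map((t) => t.text).join(" ")` gives the space-separated form asked for in requirement 2, and any token whose `type` is `"error"` can be reported as lexically invalid.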
Finally, here is the source code of the lexical analyzer:
const fs = require("fs");
const redline = require("readline");
//Read the C source file to be analyzed and convert it to a string
var data = fs.readFileSync("test2.txt");
var tem = data.toString();
//Define the keyword set
const keys = ["auto","break","case","switch","char","const","continue","default","do","while",
"double","else","if","enum","extern","float","for","goto","int","long","register","return","short",
"signed","sizeof","static","struct","typedef","union","unsigned","void","volatile"];
//Define the letter set, which is convenient here (no regular expressions are used)
const charArray = "qwertyuiopasdfghjklzxcvbnm"
+"QWERTYUIOPASDFGHJKLZXCVBNM";
//Define the set of delimiters and operators
const symbols =[
"+","-","*","/","<","<=",">",">=","=","==",
"!=",";","(",")","^",",","\"","\'","#","&",
"&&","|","||","%","~","<<",">>","[","]","{",
"}","\\",".","\?",":","!"];
const digitArray = ['0','1','2','3','4','5','6','7','8','9'];
//Preprocessing function: removes comments
fs.writeFileSync("test3.txt",'Output:\r\n'