Compiler Principles: Hand-Building a Lexical Analyzer for C

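This article describes how to write a lexical analyzer for C. Its main tasks are removing comments, splitting the source into tokens of different kinds (identifiers, keywords, constants, delimiters, operators), and reporting lexical errors. The development environment is WebStorm and Node.js. Preprocessing and tokenizing are each driven by a DFA, and the implementation outputs one line of tokens for each line of source code.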

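Writing a lexical analyzer (for C):

The requirements are: 1. Preprocess the raw input and remove comments. (For ease of display, newlines and tabs are kept here, although they would normally be removed as well.)

2. Split out the lexically valid tokens. There are five categories: identifiers, keywords, constants, delimiters, and operators. In the output, the tokens of the original source are separated by spaces.

3. Report tokens that are lexically invalid. Specifically, this covers malformed floating-point numbers such as .11 and 0.23.34.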
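Requirement 3 can be sketched as a small scan over digits and dots. This is only an illustration: the names `scanNumber` and `isDigit` and the returned object shape are hypothetical and do not come from the original source. It follows this article's rule that `.11` is an error (standard C itself accepts that form), and it additionally assumes that a trailing dot such as `1.` is rejected too. A malformed literal is consumed whole so that the error is reported once.

```javascript
// Hypothetical helper, not taken from the original analyzer.
function isDigit(ch) {
    return ch >= "0" && ch <= "9";
}

// Scan a numeric literal that starts at index `start` of `src`.
// Returns { text, end, ok }: the consumed text, the index just past it,
// and whether it is a well-formed integer or floating-point constant
// under the rules assumed in this sketch.
function scanNumber(src, start) {
    let i = start;
    let dots = 0;
    let digitsBeforeDot = 0;
    let digitsAfterDot = 0;
    // Consume the longest run of digits and dots.
    while (i < src.length && (isDigit(src[i]) || src[i] === ".")) {
        if (src[i] === ".") {
            dots++;
        } else if (dots === 0) {
            digitsBeforeDot++;
        } else {
            digitsAfterDot++;
        }
        i++;
    }
    // Valid: digits only, or digits "." digits with exactly one dot.
    const ok = digitsBeforeDot > 0 &&
        (dots === 0 || (dots === 1 && digitsAfterDot > 0));
    return { text: src.slice(start, i), end: i, ok: ok };
}
```

With these rules, `scanNumber("0.23.34", 0)` consumes the whole string and returns `ok: false`, while `scanNumber("3.14;", 0)` stops before the semicolon and returns `ok: true`.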

Development environment: WebStorm and Node.js.

For the overall approach, see my previous blog post; this version makes some improvements on it.

This is the DFA for the preprocessing stage:
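A comment-stripping DFA of this kind can be sketched as follows. The state names and the function `stripComments` are hypothetical and are not the author's implementation. The sketch assumes that newlines inside block comments are kept, so that output lines still line up with source lines, and that comment markers inside string or character literals must be left alone.

```javascript
// Hypothetical sketch of a preprocessing DFA that removes C comments.
// States: CODE, SLASH (saw "/"), LINE (inside // ...), BLOCK (inside /* ...),
// STAR (saw "*" inside a block comment), STRING and CHAR (inside literals).
function stripComments(src) {
    let state = "CODE";
    let out = "";
    for (let i = 0; i < src.length; i++) {
        const ch = src[i];
        switch (state) {
            case "CODE":
                if (ch === "/") {
                    state = "SLASH";          // may start a comment
                } else {
                    if (ch === '"') state = "STRING";
                    else if (ch === "'") state = "CHAR";
                    out += ch;
                }
                break;
            case "SLASH":
                if (ch === "/") state = "LINE";
                else if (ch === "*") state = "BLOCK";
                else {
                    out += "/";               // it was a division operator
                    state = "CODE";
                    i--;                      // re-read ch in the CODE state
                }
                break;
            case "LINE":
                if (ch === "\n") {            // the newline ends the comment
                    out += ch;
                    state = "CODE";
                }
                break;
            case "BLOCK":
                if (ch === "*") state = "STAR";
                else if (ch === "\n") out += ch;   // keep line structure
                break;
            case "STAR":
                if (ch === "/") state = "CODE";
                else if (ch !== "*") {
                    if (ch === "\n") out += ch;
                    state = "BLOCK";
                }
                break;
            case "STRING":
            case "CHAR":
                out += ch;
                if (ch === "\\" && i + 1 < src.length) {
                    out += src[++i];          // copy the escaped character
                } else if ((state === "STRING" && ch === '"') ||
                           (state === "CHAR" && ch === "'")) {
                    state = "CODE";
                }
                break;
        }
    }
    if (state === "SLASH") out += "/";        // input ended right after "/"
    return out;
}
```

For example, `stripComments("int a; // x\nint b;")` returns `"int a; \nint b;"`: the comment text is dropped and the newline is kept.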


This is the DFA for the tokenizing stage:
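The tokenizing stage can be sketched as a loop that picks a branch from the first character of each token and then consumes the longest match. Everything named here (`tokenizeLine`, `KEYWORDS`, `DELIMITERS`, `OPERATORS`, the `type` strings) is hypothetical. In particular, how the symbol table is split into delimiters and operators is my own assumption, and the sketch accepts `_` in identifiers because C allows it. Quotes are treated as plain delimiters, so string literals are not grouped into single tokens.

```javascript
// Hypothetical sketch of the tokenizing stage; not the original implementation.
const KEYWORDS = new Set(["auto", "break", "case", "char", "const", "continue",
    "default", "do", "double", "else", "enum", "extern", "float", "for", "goto",
    "if", "int", "long", "register", "return", "short", "signed", "sizeof",
    "static", "struct", "switch", "typedef", "union", "unsigned", "void",
    "volatile", "while"]);
// Assumed split of the symbol table into delimiters and operators.
const DELIMITERS = new Set([";", ",", "(", ")", "[", "]", "{", "}", "\"", "'", "#"]);
// Two-character operators come first so that the longest match wins.
const OPERATORS = ["<<", ">>", "<=", ">=", "==", "!=", "&&", "||",
    "+", "-", "*", "/", "<", ">", "=", "^", "&", "|", "%", "~", "\\",
    ".", "?", ":", "!"];

const isLetter = (ch) => /[A-Za-z_]/.test(ch);
const isDigit = (ch) => /[0-9]/.test(ch);

// Split one line of comment-free source into { type, text } tokens.
function tokenizeLine(line) {
    const tokens = [];
    let i = 0;
    while (i < line.length) {
        const ch = line[i];
        if (/\s/.test(ch)) { i++; continue; }      // skip whitespace
        let j = i;
        if (isLetter(ch)) {
            // Identifier or keyword: a letter followed by letters and digits.
            while (j < line.length && (isLetter(line[j]) || isDigit(line[j]))) j++;
            const text = line.slice(i, j);
            tokens.push({ type: KEYWORDS.has(text) ? "keyword" : "identifier", text });
        } else if (isDigit(ch) ||
                   (ch === "." && i + 1 < line.length && isDigit(line[i + 1]))) {
            // Number: consume digits and dots, then check the overall shape.
            while (j < line.length && (isDigit(line[j]) || line[j] === ".")) j++;
            const text = line.slice(i, j);
            const ok = /^[0-9]+(\.[0-9]+)?$/.test(text);
            tokens.push({ type: ok ? "constant" : "error", text });
        } else if (DELIMITERS.has(ch)) {
            j = i + 1;
            tokens.push({ type: "delimiter", text: ch });
        } else {
            // Operator, or an unknown character reported as an error.
            const op = OPERATORS.find((o) => line.startsWith(o, i));
            j = i + (op ? op.length : 1);
            tokens.push({ type: op ? "operator" : "error", text: line.slice(i, j) });
        }
        i = j;
    }
    return tokens;
}
```

Joining the token texts with spaces gives the one-line-per-source-line output format: `tokenizeLine("int a = 3.14;").map((t) => t.text).join(" ")` yields `"int a = 3.14 ;"`. A whole file could be handled by splitting the preprocessed text on `"\n"` and applying the same expression to each line.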


Finally, here is the source code for the lexical analyzer:

const fs = require("fs");
const redline = require("readline");
// Read the C source file to be analyzed and convert it to a string
var data = fs.readFileSync("test2.txt");
var tem = data.toString();
// Keyword set
const keys = ["auto","break","case","switch","char","const","continue","default","do","while",
    "double","else","if","enum","extern","float","for","goto","int","long","register","return","short",
    "signed","sizeof","static","struct","typedef","union","unsigned","void","volatile"];
// Letter set, spelled out for convenience (no regular expressions are used)
const charArray = "qwertyuiopasdfghjklzxcvbnm"
    +"QWERTYUIOPASDFGHJKLZXCVBNM";
// Delimiter and operator set
const symbols = [
    "+","-","*","/","<","<=",">",">=","=","==",
    "!=",";","(",")","^",",","\"","\'","#","&",
    "&&","|","||","%","~","<<",">>","[","]","{",
    "}","\\",".","\?",":","!"];
const digitArray = ['0','1','2','3','4','5','6','7','8','9'];
    // Preprocessing function: removes comments
    fs.writeFileSync("test3.txt",'输出结果如下:\r\n'