A pragmatic benefit of using a dedicated upfront lexing phase is that you don't couple the subsequent parser with lexical detail. This is useful during early programming language development, when the lexical and syntactic details are still changing frequently.
References
- Why separate lexing and parsing?
- https://stackoverflow.com/questions/2842809/lexers-vs-parsers
- https://www.zhihu.com/question/31065265
1. Lexer
github:compile/src/lexer
A lexer implemented in Java (see the reference link). It recognizes the tokens that appear in addition and multiplication expressions.
token | type code |
---|---|
EOI | 0 |
SEMI (`;`) | 1 |
PLUS (`+`) | 2 |
TIMES (`*`) | 3 |
LP (`(`) | 4 |
RP (`)`) | 5 |
NUM | 6 |
INT | 7 |
EQ (`=`) | 8 |
ID | 9 |
Run src/lexer
input:
int a = 1 ;
int b = a + 1;
end
output:
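The approach follows the token table above: lexical analysis is implemented with a switch…case skeleton plus the longest-match principle, so a run of digits or letters is always consumed as a single token.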
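As a rough illustration of that approach, here is a minimal sketch of a switch-based lexer with longest match. It is not the code in the repository: the class name `LexerSketch`, the method `tokenize`, and the `code:lexeme` output format are all assumptions made for this example. Only the type codes come from the table. The `end` line in the sample input looks like a terminator read by the driver, so it is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and output format are assumptions, not the repository's code.
class LexerSketch {
    // Type codes copied from the token table.
    static final int EOI = 0, SEMI = 1, PLUS = 2, TIMES = 3, LP = 4,
                     RP = 5, NUM = 6, INT = 7, EQ = 8, ID = 9;

    // Returns tokens as "code:lexeme" strings, ending with the EOI token.
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            switch (c) {
                case ' ': case '\t': case '\n': case '\r':
                    i++;                       // skip whitespace
                    break;
                case ';': out.add(SEMI + ":;");  i++; break;
                case '+': out.add(PLUS + ":+");  i++; break;
                case '*': out.add(TIMES + ":*"); i++; break;
                case '(': out.add(LP + ":(");    i++; break;
                case ')': out.add(RP + ":)");    i++; break;
                case '=': out.add(EQ + ":=");    i++; break;
                default: {
                    if (!Character.isLetterOrDigit(c)) {
                        throw new IllegalArgumentException(
                            "unexpected character '" + c + "' at position " + i);
                    }
                    int start = i;
                    boolean number = Character.isDigit(c);
                    // Longest match: keep consuming while the character still fits the token.
                    while (i < src.length()
                            && (number ? Character.isDigit(src.charAt(i))
                                       : Character.isLetterOrDigit(src.charAt(i)))) {
                        i++;
                    }
                    String lexeme = src.substring(start, i);
                    // The keyword "int" is checked only after the whole word is read.
                    int code = number ? NUM : (lexeme.equals("int") ? INT : ID);
                    out.add(code + ":" + lexeme);
                }
            }
        }
        out.add(EOI + ":");
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int b = a + 1;"));
    }
}
```

Because the keyword check happens only after the full word has been consumed, an identifier such as `integer` is classified as ID rather than as INT followed by ID, which is the point of the longest-match rule.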
2. Parser
Grammar:
statements -> expression ;
            | expression ; statements
expression -> term
            | term + expression
term       -> factor
            | factor * term
factor     -> NUM_OR_ID
            | LP expression RP
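The grammar above maps almost mechanically onto a recursive-descent recognizer, with one method per nonterminal. The following is a minimal sketch under stated assumptions, not the repository's parser: the class name `ParserSketch` and the method `parse` are hypothetical, the sketch reads characters directly instead of consuming tokens from a separate lexer, and NUM_OR_ID is approximated as any run of letters and digits.

```java
// Hypothetical sketch; it only checks syntax and counts statements.
class ParserSketch {
    private final String src;
    private int pos = 0;

    ParserSketch(String src) { this.src = src; }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Consumes the expected character or reports a syntax error.
    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalStateException("expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    // statements -> expression ; | expression ; statements
    // The right recursion is written as a loop; returns the number of statements.
    int statements() {
        int count = 0;
        do {
            expression();
            expect(';');
            count++;
        } while (peek() != '\0');
        return count;
    }

    // expression -> term | term + expression
    void expression() {
        term();
        if (peek() == '+') { pos++; expression(); }
    }

    // term -> factor | factor * term
    void term() {
        factor();
        if (peek() == '*') { pos++; term(); }
    }

    // factor -> NUM_OR_ID | LP expression RP
    void factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            expression();
            expect(')');
        } else if (Character.isLetterOrDigit(c)) {
            // Assumption: NUM_OR_ID is any run of letters and digits.
            while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        } else {
            throw new IllegalStateException(
                "expected number, identifier, or '(' at position " + pos);
        }
    }

    static int parse(String src) { return new ParserSketch(src).statements(); }

    public static void main(String[] args) {
        System.out.println(parse("213+123+23989+4546;\n34545*233+8980;"));
    }
}
```

Each alternative in the grammar starts with a distinct symbol or is a prefix of the other alternative, so a single character of lookahead is enough to choose a branch and no backtracking is needed. Note that this sketch follows the grammar strictly, so an input with no trailing `;` is rejected.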
github:compile/src/lexer
The grammar is simple enough that a recursive-descent parser handles it directly:
input:
1+2+3
end
output:
input:
213+123+23989+4546;
34545*233+8980;
end
output: