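I've recently been studying compiler front ends. Most people build them with existing tools such as the classic Lex/yacc; few write a parser by hand from scratch, let alone a parser generator. Reading theory without practicing it always leaves me uneasy, so I looked through the source of some existing lexer generators. To parse the regular expressions themselves, most of them use the most intuitive approach: a right-recursive recursive-descent parser, that is, an algorithm that backtracks. So I decided to write a script that reads a simple custom grammar file and automatically generates an LL predictive parsing table. The First and Follow sets are already computed; once I emit the table as a C header file in the next couple of days, the job will be done.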
Repo: https://github.com/rasefon/LLTableGenerator
A simple grammar file looks like this:
```
token: tPlus, tMul, tLp, tRp, tEnd
token: tId
# nil is predefined keyword.

$Start: E
E: T,E1
E1: tPlus,T,E1
E1: nil
T: F,T1
T1: tMul,F,T1
T1: nil
F: F1
F1: tLp,E,tRp
F1: tId
```
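
To make the goal concrete, here is a minimal sketch of a First/Follow computation for this grammar. It is not the code from the repo: the grammar is hard-coded in a hypothetical `GRAMMAR` hash instead of being read from the file, and the helper names (`first_of_sequence`, `compute_first`, `compute_follow`) are made up for illustration. It assumes, as in the grammar file, that `nil` denotes the empty production and `$` marks the end of input.

```ruby
require 'set'

# Hypothetical in-memory form of the sample grammar (not the repo's format):
# each nonterminal maps to its productions; 'nil' is the empty production.
GRAMMAR = {
  'E'  => [%w[T E1]],
  'E1' => [%w[tPlus T E1], %w[nil]],
  'T'  => [%w[F T1]],
  'T1' => [%w[tMul F T1], %w[nil]],
  'F'  => [%w[F1]],
  'F1' => [%w[tLp E tRp], %w[tId]]
}.freeze
START = 'E'

def nonterminal?(sym)
  GRAMMAR.key?(sym)
end

# FIRST of a sequence of symbols, using the FIRST sets known so far.
# An empty sequence (or one that can derive empty) yields 'nil'.
def first_of_sequence(symbols, first)
  result = Set.new
  symbols.each do |sym|
    sym_first = nonterminal?(sym) ? first[sym] : Set[sym]
    result.merge(sym_first - ['nil'])
    return result unless sym_first.include?('nil')
  end
  result << 'nil'
end

# Iterate until no FIRST set grows any further (fixed point).
def compute_first
  first = Hash.new { |h, k| h[k] = Set.new }
  loop do
    changed = false
    GRAMMAR.each do |lhs, productions|
      productions.each do |rhs|
        before = first[lhs].size
        first[lhs].merge(first_of_sequence(rhs, first))
        changed ||= first[lhs].size > before
      end
    end
    break unless changed
  end
  first
end

# FOLLOW(A) collects FIRST of whatever comes after A in a production,
# plus FOLLOW of the left-hand side when the remainder can derive empty.
def compute_follow(first)
  follow = Hash.new { |h, k| h[k] = Set.new }
  follow[START] << '$'
  loop do
    changed = false
    GRAMMAR.each do |lhs, productions|
      productions.each do |rhs|
        rhs.each_with_index do |sym, i|
          next unless nonterminal?(sym)
          rest_first = first_of_sequence(rhs.drop(i + 1), first)
          addition = rest_first - ['nil']
          addition |= follow[lhs] if rest_first.include?('nil')
          before = follow[sym].size
          follow[sym].merge(addition)
          changed ||= follow[sym].size > before
        end
      end
    end
    break unless changed
  end
  follow
end

first = compute_first
follow = compute_follow(first)
GRAMMAR.each_key do |nt|
  puts "#{nt}: FIRST=#{first[nt].to_a.inspect} FOLLOW=#{follow[nt].to_a.inspect}"
end
```

For the grammar above, this should give FIRST(E) = {tLp, tId} and FOLLOW(E) = {$, tRp}. A production `A: α` then goes into the table cell for every terminal in FIRST(α), and, when α can derive empty, for every terminal in FOLLOW(A).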
The script is written in Ruby; the code is as follows:
require 'set'
# 'nil' and '$' are predefined terminal tokens: 'nil' means an empty action and '$' is the end-of-parsing flag.
$start_lside_rule = ""
$token_list = Set.new
$gram_list = Hash.new
$first_set = Hash.new
$follow_set = Hash.new
def construct_table_model(rule_file_name)
  lines = IO.readlines(rule_file_name)
  lines = lines.map { |l| l.chomp }
  token_def_phase = false
  rule_def_phase = false
  # temporarily record left side tokens and right tokens as string.
  lines.each do |line|
    # skip comment
    next if "#" == line[0]
    # Start token define phase.