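LR Parsing (Part 1)
This is the third lab of my compiler principles course. LR parsing is a bottom-up parsing method.
The basic idea of LR parsing is this: during canonical reduction, the parser remembers the whole string of symbols that has been shifted and reduced so far (the "history"), and at the same time uses the productions involved to predict which input symbols may come next (the "lookahead").
Storage structures of the classes used in the implementation
Infinite
Despite the name Infinite, this class stores the information of a non-terminal. It is copied verbatim from the previous lab, because the SLR part later needs the FOLLOW and FIRST sets.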
class Infinite:
    def __init__(self, name):
        self.name = name
        self.equalList = []
        self.FIRST = {}
        self.FOLLOW = set()

    def equalList_change(self, express):
        expresses = express.split('|')
        for express in expresses:
            self.equalList.append(express)
State
The class that describes a state: it holds the transitions, the set of items the state contains, and the state number.
class State:
    def __init__(self, mark):
        self.mark = mark
        self.projectSet = set()  # items of this state (always assigned a set later)
        self.trans = {}          # grammar symbol -> target State
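Project
The class that holds the properties of an item. Storing the production body here is not strictly necessary; the class exists because of an earlier, incorrect approach of mine.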
class Project:
    def __init__(self, key, equal):
        self.equal = equal
        self.key = key
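The LR parser
An LR parser is essentially a deterministic finite automaton equipped with a last-in, first-out store (a stack).
LR parser model
All LR parsing variants share the same driver program; they differ only in the parsing table.
The parsing table is the core of an LR parser, so the rest of this post focuses on how each kind of table is built. Everything else is not covered further.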
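Since the driver is not shown anywhere in this post, here is a minimal sketch of what such a shared driver might look like. The names lr_parse, ACTION, GOTO and RULES are my own for illustration, and the hand-written table is for the toy grammar (1) S->aS, (2) S->b; the cell format ('s2', 'r1', 'acc') follows the tables built later in this post.

```python
def lr_parse(tokens, action, goto, rules):
    """Return True if the token sequence is accepted.

    rules maps a 1-based production number to (left-hand side, body length).
    """
    stack = [0]                      # state stack, state 0 is the initial state
    tokens = list(tokens) + ['#']    # '#' marks the end of input
    i = 0
    while True:
        act = action.get(stack[-1], {}).get(tokens[i])
        if act is None:              # empty cell: syntax error
            return False
        if act == 'acc':
            return True
        if act[0] == 's':            # shift: push the target state, consume one token
            stack.append(int(act[1:]))
            i += 1
        else:                        # reduce by production N
            lhs, length = rules[int(act[1:])]
            if length:
                del stack[-length:]  # pop one state per body symbol
            stack.append(goto[stack[-1]][lhs])


# Hand-built LR(0) table for S' -> S, (1) S -> aS, (2) S -> b
ACTION = {
    0: {'a': 's2', 'b': 's3'},
    1: {'#': 'acc'},
    2: {'a': 's2', 'b': 's3'},
    3: {'a': 'r2', 'b': 'r2', '#': 'r2'},
    4: {'a': 'r1', 'b': 'r1', '#': 'r1'},
}
GOTO = {0: {'S': 1}, 2: {'S': 4}}
RULES = {1: ('S', 2), 2: ('S', 1)}

print(lr_parse("aab", ACTION, GOTO, RULES))
print(lr_parse("aa", ACTION, GOTO, RULES))
```

Only the three tables change between LR(0), SLR and the stronger variants; the loop itself stays the same.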
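LR(0) parsing
LR(0) parsing only summarizes the history and carries no lookahead. Note that the earlier LL(1) method is centered on non-terminals and drives the analysis through them, whereas LR grammars are analyzed mostly in terms of productions, so most of the attention should go to the productions. I am short on time, so I skip the basics and go straight to the point.
Constructing the canonical collection of LR(0) item sets
The CLOSURE function
Let I be any item set of the grammar G'. The closure CLOSURE(I) is defined and constructed as follows:
- Every item of I belongs to CLOSURE(I)
- If A->α·Bβ belongs to CLOSURE(I), then for every production B->γ of B, the item B->·γ belongs to CLOSURE(I)
- Repeat the steps above until CLOSURE(I) no longer grows
Implementation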
def CLOSURE_get(ISet, nameList, project_Relation):
    # One expansion pass; flag reports whether any new item was added
    flag = False
    for I in set(ISet):
        pos = I.equal.index('·')
        if pos < len(I.equal) - 1:
            if I.equal[pos + 1] in nameList:  # the dot is in front of a non-terminal
                for value in project_Relation[0]:  # items whose dot is at position 0
                    if value.key == I.equal[pos + 1]:
                        if value not in ISet:
                            flag = True
                            ISet.add(value)
    return ISet, flag


def CLOSURE(ISet, nameList, project_Relation):
    # Work on a copy of the ISet argument and repeat until nothing changes
    coll, flag = CLOSURE_get(set(ISet), nameList, project_Relation)
    while flag:
        coll, flag = CLOSURE_get(coll, nameList, project_Relation)
    return coll
This is actually not that easy to write. The key is to understand the construction steps above, so go over them a few times.
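To make the three closure rules easier to follow, here is a small self-contained sketch that does not use my classes. It assumes an item is a tuple (left-hand side, body, dot position) and that the grammar is a dict from non-terminal to bodies; the names closure and GRAMMAR are made up for this example.

```python
GRAMMAR = {
    "S'": ["E"],
    "E": ["aA", "bB"],
    "A": ["cA", "d"],
    "B": ["cB", "d"],
}


def closure(items, grammar):
    """Return CLOSURE(items) for items of the form (lhs, body, dot)."""
    result = set(items)          # rule 1: every item of I is in CLOSURE(I)
    work = list(items)
    while work:                  # rule 3: repeat until nothing new appears
        lhs, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            # rule 2: the dot is in front of non-terminal B, add every B -> ·γ
            for gamma in grammar[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result


I0 = closure({("S'", "E", 0)}, GRAMMAR)
for item in sorted(I0):
    print(item)
```

Tuples are hashable by value, so "is this item already in the set" is a plain membership test.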
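The GO function
The GO function matters a lot. At first I overlooked it: I thought I should compute all states first and then build the GO mapping from them. After repeated attempts this approach turned out to be flawed. The states it produces may be duplicated or simply wrong, and building the GO mapping after the fact is extremely hard. I have to admit the original design is quite clever.
What GO does:
GO is the state transition function. The first argument I of GO(I,X) is an item set and the second argument X is a grammar symbol. The value of GO(I,X) is defined as:
GO(I,X) = CLOSURE(J), where J = { A->αX·β | A->α·Xβ belongs to I }
Implementation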
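def GO(ISet, X, project_Relation, name_list):
    """
    The GO function
    :param name_list: names of the non-terminals
    :param project_Relation: item table, indexed by dot position
    :param ISet: the current item set
    :param X: the grammar symbol being read
    :return: the item set of the next state after reading X
    """
    new_set = set()
    for I in ISet:
        pos = I.equal.index('·')
        if pos < len(I.equal) - 1:
            if I.equal[pos + 1] == X:
                # the same production with the dot moved one symbol to the right
                moved = I.equal[:pos] + I.equal[pos + 1:pos + 2] + '·' + I.equal[pos + 2:]
                for value in project_Relation[pos + 1]:
                    if value.key == I.key and value.equal == moved:
                        new_set.add(value)
    return CLOSURE(new_set, name_list, project_Relation)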
Very abstract, I know.
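To make it less abstract, here is a tiny sketch of the set J from the definition, that is, the step before the closure is taken. It again assumes items are tuples (left-hand side, body, dot position); go_kernel is a name I made up for this example.

```python
def go_kernel(items, X):
    """Return J: every item of `items` whose dot can move over symbol X."""
    return {
        (lhs, body, dot + 1)
        for (lhs, body, dot) in items
        if dot < len(body) and body[dot] == X
    }


# I0 of the grammar S' -> E, E -> aA | bB
I0 = {("S'", "E", 0), ("E", "aA", 0), ("E", "bB", 0)}

print(go_kernel(I0, "a"))   # the dot moves over 'a' in E -> ·aA
print(go_kernel(I0, "c"))   # no item has the dot in front of 'c'
```

GO(I,X) is then just the closure of this set, and an empty J means there is no transition on X.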
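Constructing the canonical collection of LR(0) item sets
Construction algorithm
The figure describes the construction algorithm. In short: start from CLOSURE({S'->·S}), then for every item set I found so far and every grammar symbol X, add GO(I,X) as a new state whenever it is non-empty and not already present, and stop when no new state appears. You may think I am stating the obvious, but following this algorithm really does get the job done. The process is painful, but as long as every step of the algorithm is spelled out in code, it works. The ability to turn an algorithm into code is something you have to build up slowly on your own; I cannot help you there.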
def LR0_state(states, all_X, state_sets, project_relation, name_List, mark):
    flag = False
    len1 = len(states)  # number of states before this pass
    for state in list(states):  # iterate over a snapshot, states grows inside the loop
        for X in all_X:
            closure_set = GO(state.projectSet, X, project_relation, name_List)
            if closure_set and closure_set not in state_sets:
                # a new item set: create a state for it
                mark += 1
                new_state = State(mark)
                new_state.projectSet = closure_set
                state.trans[X] = new_state
                states.append(new_state)
                state_sets.append(closure_set)
                flag = True
            elif closure_set in state_sets:
                # an existing item set: only record the transition
                for value in list(states):
                    if value.projectSet == closure_set:
                        state.trans[X] = value
                        flag = True
    len2 = len(states)
    if len1 == len2:  # no state was added in this pass, so stop
        flag = False
    return flag, mark
def LR0_state_get(projectSet, name_List, project_relation, all_X):
    states = []
    mark = 0
    state_sets = []
    ISet = CLOSURE({projectSet[0]}, name_List, project_relation)
    new_state = State(mark)
    new_state.projectSet = ISet
    state_sets.append(ISet)
    states.append(new_state)
    flag = True
    while flag:
        flag, mark = LR0_state(states, all_X, state_sets, project_relation, name_List, mark)
    for state in states:
        print("-----")
        print("State number:", end="")
        print(state.mark)
        print("Item set:")
        for project in state.projectSet:
            print(project.key + ":" + project.equal, end=" ")
        print("Transitions:")
        for key in state.trans.keys():
            print(key + ":" + str(state.trans[key].mark), end=",")
        print(" ")
    print("-----")
    print("Number of states:", end="")
    print(len(states))
    return states
Result
I compared the output with my teacher's sample and it matches.
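If you want something to check your own output against, here is a compact, self-contained sketch of the whole construction as a worklist. It is not my class-based code above: items are tuples (left-hand side, body, dot position), states are frozensets, and the names closure, go and canonical_collection are made up for this example. For the textbook grammar S'->E, E->aA|bB, A->cA|d, B->cB|d it should find 12 states.

```python
GRAMMAR = {
    "S'": ["E"],
    "E": ["aA", "bB"],
    "A": ["cA", "d"],
    "B": ["cB", "d"],
}


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        lhs, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for gamma in grammar[body[dot]]:
                item = (body[dot], gamma, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result


def go(items, X, grammar):
    moved = {(l, b, d + 1) for (l, b, d) in items if d < len(b) and b[d] == X}
    return closure(moved, grammar)


def canonical_collection(grammar, start):
    symbols = sorted(set(grammar) | {ch for bodies in grammar.values()
                                     for body in bodies for ch in body})
    first = frozenset(closure({(start, grammar[start][0], 0)}, grammar))
    states = [first]          # the list doubles as the worklist
    index = {first: 0}        # item set -> state number
    trans = {}                # (state number, symbol) -> state number
    for state in states:      # picks up states appended during the loop
        for X in symbols:
            target = frozenset(go(state, X, grammar))
            if not target:
                continue
            if target not in index:
                index[target] = len(states)
                states.append(target)
            trans[(index[state], X)] = index[target]
    return states, trans


states, trans = canonical_collection(GRAMMAR, "S'")
print(len(states), "states,", len(trans), "transitions")
```

Looking up states by their frozenset in a dict is what avoids the duplicated states I ran into with my first approach.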
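Constructing the LR(0) parsing table
How the parsing table is built
After a lot of debugging I finally got through this painful part. The code is below.
Implementation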
def LR0_table_create(states, all_X, nameList, ruleSet):
    ACTION = {}
    GOTO = {}  # both tables are stored as dicts
    print(ruleSet)
    for state in states:
        projects = list(state.projectSet)
        # a state holding a single completed item (dot at the end) accepts or reduces
        if len(projects) == 1 and projects[0].equal.endswith('·'):
            if projects[0].key == nameList[-1]:  # the augmented start symbol
                ACTION.setdefault(state.mark, {})
                ACTION[state.mark]['#'] = "acc"
                continue
            project = projects[0]
            s = project.key + "->" + project.equal[:-1]
            pos = ruleSet.index(s)
            for X in all_X:
                if X not in nameList:  # LR(0) reduces on every terminal
                    ACTION.setdefault(state.mark, {})
                    ACTION[state.mark][X] = 'r' + str(pos + 1)
        else:
            for key in state.trans.keys():
                if key in nameList:  # non-terminal: GOTO entry
                    GOTO.setdefault(state.mark, {})
                    GOTO[state.mark][key] = state.trans[key].mark
                else:  # terminal: shift entry
                    ACTION.setdefault(state.mark, {})
                    ACTION[state.mark][key] = 's' + str(state.trans[key].mark)
    return ACTION, GOTO
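Screenshot of a run
That completes the construction of the LR(0) parsing table. The table is the main thing and the driver program is similar for every variant, so let us press on and build the SLR parsing table.
SLR parsing
Problem description:
Look at this state in the figure. When the input symbol is '*' the parser simply shifts, but when some other symbol arrives, should it shift or reduce? LR(0) cannot answer this question, which is why SLR parsing exists.
This part is fairly simple: reduce only when the input symbol belongs to the FOLLOW set of the production's left-hand side. I will not go into more detail.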
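Here is a small sketch of that FOLLOW check on exactly this kind of state, {E->T·, T->T·*F} from the usual expression grammar. The function slr_actions, the rule number 2 for E->T and the shift target 7 are assumptions I picked for illustration; items are tuples (left-hand side, body, dot position).

```python
def slr_actions(state_items, follow, rule_no, shifts):
    """Build the ACTION row of one state, each cell being a list of actions."""
    row = {}
    for lhs, body, dot in state_items:
        if dot == len(body):                   # completed item: candidate reduce
            for symbol in follow[lhs]:         # but only on FOLLOW(lhs)
                row.setdefault(symbol, []).append('r' + str(rule_no[(lhs, body)]))
    for symbol, target in shifts.items():      # terminals with an outgoing transition
        row.setdefault(symbol, []).append('s' + str(target))
    return row


state = {("E", "T", 1), ("T", "T*F", 1)}
follow = {"E": {"+", ")", "#"}}                # FOLLOW(E) of E->E+T|T, T->T*F|F, F->(E)|i
row = slr_actions(state, follow, {("E", "T"): 2}, {"*": 7})

for symbol in sorted(row):
    print(symbol, row[symbol])
```

Because '*' is not in FOLLOW(E), every cell ends up with exactly one action and the shift/reduce conflict is gone. A cell with more than one entry would mean the grammar is not SLR.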
Implementation
def FIRST_next(infinite, name, get_record):
    """
    Recursive helper used to compute FIRST sets
    :param infinite: dict of non-terminals
    :param name: the non-terminal to solve
    :param get_record: records which non-terminals are already solved
    :return: None
    """
    if get_record[name]:
        return
    for equal in infinite[name].equalList:
        # op is a global collection of operator terminals defined outside this excerpt
        if equal[0].islower() or equal[0] in op:
            infinite[name].FIRST.setdefault(equal[0], [])
            infinite[name].FIRST[equal[0]].append(equal)
        elif equal == 'ε':
            infinite[name].FIRST.setdefault(equal, [])
            infinite[name].FIRST['ε'].append(equal)
        else:
            # Not solved yet, so recurse. The previous lab did not need the
            # equal[0] != name guard because left recursion had been eliminated.
            if not get_record[equal[0]] and equal[0] != name:
                FIRST_next(infinite, equal[0], get_record)
            # check whether the leading non-terminal can derive the empty string
            if 'ε' in infinite[equal[0]].FIRST:  # if so, drop ε and look further right
                new_set = set(infinite[equal[0]].FIRST)
                new_set.discard('ε')
                for key in new_set:
                    infinite[name].FIRST.setdefault(key, [])
                    infinite[name].FIRST[key].append(equal)
                for new_name in equal[1:]:
                    if new_name.islower() or new_name in op:  # a terminal ends the scan
                        infinite[name].FIRST.setdefault(new_name, [])
                        infinite[name].FIRST[new_name].append(equal)
                        break
                    else:
                        if not get_record[new_name] and new_name != name:
                            FIRST_next(infinite, new_name, get_record)
                        if 'ε' not in infinite[new_name].FIRST:
                            for key in list(infinite[new_name].FIRST.keys()):
                                infinite[name].FIRST.setdefault(key, [])
                                infinite[name].FIRST[key].append(equal)
                            break
                        else:
                            new1_set = set(infinite[new_name].FIRST)
                            new1_set.discard('ε')
                            for key in new1_set:
                                infinite[name].FIRST.setdefault(key, [])
                                infinite[name].FIRST[key].append(equal)
                            continue
            else:
                for key in list(infinite[equal[0]].FIRST.keys()):
                    infinite[name].FIRST.setdefault(key, [])
                    infinite[name].FIRST[key].append(equal)
    get_record[name] = True
def FIRST_get(infinite):
    """
    Compute the FIRST sets
    :param infinite: dict of non-terminals
    :return: the updated dict
    """
    get_record = {}  # records which non-terminals already have their FIRST set
    name_list = list(infinite.keys())
    for name in name_list:
        get_record[name] = False
    for name in name_list:
        FIRST_next(infinite, name, get_record)
    print("FIRST set of each non-terminal:")
    for name in name_list:
        print(name, end=":")
        print(infinite[name].FIRST)
    return infinite
def FOLLOW_next(infinite, name, get_record, name_list):
    if get_record[name]:  # already solved, return directly
        return
    for find_name in name_list:
        for equal in infinite[find_name].equalList:
            if name in equal:  # the non-terminal occurs in this production body
                index = equal.index(name)
                if index == len(equal) - 1 and find_name != name:  # at the end of the body
                    FOLLOW_next(infinite, find_name, get_record, name_list)  # recurse for FOLLOW(find_name)
                    infinite[name].FOLLOW = infinite[name].FOLLOW.union(infinite[find_name].FOLLOW)
                elif index < len(equal) - 1 and (equal[index + 1].islower() or equal[index + 1] in op):
                    # directly followed by a terminal
                    infinite[name].FOLLOW.add(equal[index + 1])
                elif index < len(equal) - 1:  # followed by a non-terminal
                    pos = index + 1
                    while pos < len(equal):
                        symbol = equal[pos]
                        if symbol.islower() or symbol in op:  # a terminal ends the scan
                            infinite[name].FOLLOW.add(symbol)
                            break
                        first = set(infinite[symbol].FIRST.keys())
                        infinite[name].FOLLOW |= first - {'ε'}
                        if 'ε' not in first:
                            break
                        pos += 1  # symbol can derive ε, keep looking to the right
                    else:
                        # every symbol after name can derive ε, so FOLLOW(find_name) applies
                        if find_name != name:
                            FOLLOW_next(infinite, find_name, get_record, name_list)
                            infinite[name].FOLLOW = infinite[name].FOLLOW.union(infinite[find_name].FOLLOW)
    get_record[name] = True
def FOLLOW_get(infinite, start=-1):
    get_record = {}
    name_list = list(infinite.keys())
    for name in name_list:
        get_record[name] = False
    if start == -1:  # no custom start symbol: the last non-terminal is treated as the start symbol
        infinite[name_list[-1]].FOLLOW.add('#')
    for name in name_list:
        FOLLOW_next(infinite, name, get_record, name_list)
    print("FOLLOW set of each non-terminal:")
    for name in name_list:
        print(name, end=":")
        print(infinite[name].FOLLOW)
def SLR_table_create(states, all_X, nameList, ruleSet, infinites):
    """
    Build the SLR parsing table
    :param states: list of states
    :param all_X: all grammar symbols
    :param nameList: list of non-terminals
    :param ruleSet: list of productions
    :param infinites: dict of non-terminals
    :return: the SLR parsing table (ACTION, GOTO)
    """
    ACTION = {}
    GOTO = {}  # both tables are stored as dicts
    print(ruleSet)
    for state in states:
        # completed items (dot at the end) give acc or reduce actions,
        # even when the state also contains shift items
        for project in state.projectSet:
            if not project.equal.endswith('·'):
                continue
            if project.key == nameList[-1]:  # the augmented start symbol
                ACTION.setdefault(state.mark, {})
                ACTION[state.mark].setdefault('#', [])
                ACTION[state.mark]['#'].append("acc")
                continue
            s = project.key + "->" + project.equal[:-1]
            pos = ruleSet.index(s)
            for X in all_X:
                # reduce only on terminals in FOLLOW of the left-hand side
                if X not in nameList and X in infinites[project.key].FOLLOW:
                    ACTION.setdefault(state.mark, {})
                    ACTION[state.mark].setdefault(X, [])
                    ACTION[state.mark][X].append('r' + str(pos + 1))
        for key in state.trans.keys():
            if key in nameList:  # non-terminal: GOTO entry
                GOTO.setdefault(state.mark, {})
                GOTO[state.mark].setdefault(key, [])
                GOTO[state.mark][key].append(state.trans[key].mark)
            else:  # terminal: shift entry
                ACTION.setdefault(state.mark, {})
                ACTION[state.mark].setdefault(key, [])
                ACTION[state.mark][key].append('s' + str(state.trans[key].mark))
    return ACTION, GOTO
Note that left recursion has not been eliminated here, so the FIRST computation has to guard against infinite recursion.
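One way to sidestep the recursion problem entirely, which is not what my code above does, is to compute FIRST by fixed-point iteration: keep sweeping over all productions until no set grows. Here is a minimal sketch under the assumption that every symbol is a single character and that anything that is not a key of the grammar dict is a terminal; first_sets and GRAMMAR are names made up for this example.

```python
def first_sets(grammar):
    """Compute FIRST for every non-terminal by iterating to a fixed point."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                before = len(first[nt])
                if body == 'ε':
                    first[nt].add('ε')
                else:
                    for symbol in body:
                        if symbol not in grammar:        # terminal: it starts the body
                            first[nt].add(symbol)
                            break
                        first[nt] |= first[symbol] - {'ε'}
                        if 'ε' not in first[symbol]:     # not nullable: stop here
                            break
                    else:
                        first[nt].add('ε')               # every symbol was nullable
                if len(first[nt]) != before:
                    changed = True
    return first


# Left-recursive expression grammar, no elimination needed
GRAMMAR = {"E": ["E+T", "T"], "T": ["T*F", "F"], "F": ["(E)", "i"]}
FIRST = first_sets(GRAMMAR)
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]))
```

A left-recursive production such as E->E+T just contributes nothing on the first sweep and catches up on later ones, so there is nothing to loop forever on.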