语法分析——预测分析程序

LL(1)文法满足下面的条件:

  1. 文法不含左递归
  2. 对于文法中每一个非终结符A的各个产生式的候选首符集两两不相交。即,若A→α1|α2|...|αn,则FIRST(αi)∩FIRST(αj)=∅ ,其中(i≠j)
  3. 对文法中的每个非终结符A,若它存在某个侯选首符集包含ε,则FIRST(A)∩FOLLOW(A)=∅

文法通过左递归消除、消除回溯、提取左公共因子可以满足LL(1)的条件,成为LL(1)文法。

预测分析程序实现步骤:

  1. 构造文法非终结符的FIRST集
  2. 构造文法非终结符的FOLLOW集
  3. 构造文法的预测分析表
  4. 构造控制程序

1、构造文法非终结符的FIRST集

假设A有n个侯选α1,α2,...,αn,即A→α1|α2|...|αn。侯选式α的FIRST集形式化描述:FIRST(α)={a | α\Rightarrowa...,a∈v_{T}},若α\Rightarrowε,则ε∈FIRST(α).。那么FIRST(A) = FIRST(α1)∪FIRST(α2)∪...∪FIRST(αn)

实现:

# 求FIRST(α) 用'@'表示'ε'
    def formula_first(self,str):
        first_keys = self.firstSET.keys()
        str_len = len(str)
        if str_len == 1:
            if str in self.terSymbol or str == '@':return [str]
            elif str in first_keys:
                return self.followSET.get(str)
            else:
                return []
        i = 0
        first_set = []
        while i < str_len - 1:
            if str[i] in self.terSymbol:
                if str[i] not in first_set:
                    first_set.append(str[i])
                return first_set
            if str[i+1] == "'":
                if str[i:i+2] in first_keys:
                    str_firstSet = self.firstSET.get(str[i:i+2])
                    for terSym in str_firstSet:
                        if terSym not in first_set:
                            first_set.append(terSym)
                    if '@' not in str_firstSet:
                        return first_set
                    else:
                        for index in range(len(first_set)):
                            if first_set[index] == '@':
                                del first_set[index]
                                break
                        i += 2
                else:
                    return first_set
            else:
                if str[i] in first_keys:
                    str_firstSet = self.firstSET.get(str[i])
                    for terSym in str_firstSet:
                        if terSym not in first_set:
                            first_set.append(terSym)
                    if '@' not in str_firstSet:
                        return first_set
                    else:
                        for index in range(len(first_set)):
                            if first_set[index] == '@':
                                del first_set[index]
                                break
                        i += 1
                else:
                    return first_set
        # 边界处理
        if i == str_len: # 侯选式最后一个字符是"p'"
            first_set.append('@')
        else: # 侯选式最后一个字符是'p'
            if str[i] in first_keys:
                str_firstSet = self.firstSET.get(str[i])
                for terSym in str_firstSet:
                    if terSym not in first_set:
                        first_set.append(terSym)
                if '@' not in str_firstSet:
                    return first_set
                else:
                    first_set.append('@')
        return first_set

    def first_set(self):
        #终结符的first集
        for terSym in self.terSymbol:
            self.firstSET.update({terSym:[terSym]})
        #非终结符的first集
        change_flag = True
        while change_flag:
            change_flag = False
            for formula in self.formulaSet:
                left = ""
                right = ""
                if formula[1] == "'":
                    left = formula[0:2]
                    right = formula[4:]
                else:
                    left = formula[0]
                    right = formula[3:]
                firstSet = []
                for candidate in right.split("|"):
                    candidate_firstSet = self.formula_first(candidate)
                    for terminate in candidate_firstSet:
                        if terminate not in firstSet:
                            firstSet.append(terminate)
                keys = self.firstSET.keys()
                if left in keys:
                    oldSet = self.firstSET.get(left)
                    for symbol in firstSet:
                        if symbol not in oldSet:
                            change_flag = True
                            oldSet.append(symbol)
                    self.firstSET.update({left: oldSet})
                else:
                    change_flag = True
                    self.firstSET.update({left: firstSet})

2、构造文法非终结符的FOLLOW集

假设S是文法G的开始符号,对于G的非终结符A,定义FOLLOW(A) = {a | S\Rightarrow...Aa...,a∈v_{T}}。特别的,若S\Rightarrow...A,则规定'#'∈FOLLOW(A)。

构造规则:

  1. 对于文法的开始符号S,置#于FOLLOW(S)中;
  2. 若A→αBβ是一个产生式,则把FIRST(β)\{ε}加入FOLLOW(B)中;
  3. 若A→αB是一个产生式,或若A→αBβ是一个产生式且β\Rightarrowε(即ε∈FIRST(β)),则把FOLLOW(A)加入FOLLOW(B)中。

实现:

    def update_follow(self,A,B,β): # A->αBβ
        update_flag = False
        keys = self.followSET.keys()
        follow_B = []
        first_β = self.formula_first(β)
        for terSym in first_β:
            if terSym != '@' and terSym not in follow_B:
                follow_B.append(terSym)
        if '@' in first_β:
            if A in keys:
                follow_A = self.followSET.get(A)
                for terSym in follow_A:
                    if terSym not in follow_B:
                        follow_B.append(terSym)
        if B in keys:
            old_follow_B = self.followSET.get(B)
            for terSym in follow_B:
                if terSym not in old_follow_B:
                    old_follow_B.append(terSym)
                    update_flag = True
            self.followSET.update({B:old_follow_B})
        else:
            update_flag = True
            self.followSET.update({B: follow_B})
        return update_flag

    def follow_set(self):
        self.followSET.update({self.startSymbol:['#']}) # 置'#'于开始符合follow集
        change_flag = True
        while change_flag:
            change_flag = False
            for formula in self.formulaSet:
                left = ""
                right = ""
                if formula[1] == "'":
                    left = formula[0:2]
                    right = formula[4:]
                else:
                    left = formula[0]
                    right = formula[3:]
                for candidate in right.split("|"):
                    if candidate == '@': continue
                    candidate_len = len(candidate)
                    i = 0
                    while i < candidate_len:
                        if candidate[i] in self.terSymbol:
                            i += 1
                            continue
                        if i < candidate_len-1:
                            if candidate[i+1] == "'":
                                change_flag = self.update_follow(left,candidate[i:i+2],candidate[i+2:])
                                i += 2
                            else:
                                change_flag = self.update_follow(left,candidate[i],candidate[i+1:])
                                i += 1
                        else:
                            change_flag = self.update_follow(left,candidate[i],"")
                            i += 1

3、构造预测分析表M

算法:

  1. 对于文法G的每个产生式A→α,执行第2和3;
  2. 对于每个终结符a∈FIRST(α),把A→α加入M[A,a]中;
  3. 若ε∈FIRST(α),对每个终结符b∈FOLLOW(A)把A→α加入M[A,b]中;
  4. 把所有无定义的M[A,a]标上出错标志。

实现:

    def update_table(self,left,char,candidate):
        if char == '@': return
        Mi = self.M.get(left)  # 获取表中的一行
        if Mi:
            Mij = Mi.get(char)  # 获取表中的一个单元
            if Mij:
                if candidate not in Mij:
                    Mij.append(candidate)
            else:
                Mij = [candidate]
            Mi.update({char: Mij})
        else:
            Mi = {char: [candidate]}
        self.M.update({left: Mi})

    def prediction_analysis_table(self):
        for formula in self.formulaSet:
            left = ""
            right = ""
            if formula[1] == "'":
                left = formula[0:2]
                right = formula[4:]
            else:
                left = formula[0]
                right = formula[3:]
            for candidate in right.split("|"):
                first_α = self.formula_first(candidate)
                for a in first_α:
                    self.update_table(left,a,candidate)
                if '@' in first_α:
                    for b in self.followSET.get(left):
                        self.update_table(left, b, candidate)

4、控制程序

预测分析模型:

总控程序总是按照S栈顶符号X和当前输入符号a的情况执行下面的动作:

  1. 若X = a = '#',则表示分析成功,停止分析过程。
  2. 若X = a ≠ '#',则把X从栈S中弹出,输入符号指向a的下一个输入符号。
  3. 若X是一个非终结符,查看预测分析表M。若M[X,a]存放X的一个产生式,那么把X从栈S中弹出,然后将产生式→的右部符号串逆序入栈S(若右部符号是ε,表示没有元素入栈)。若M[X,a]存放出错标志,提示错误,停止分析过程。

实现:

def control_program(S,str,pa):
    S.push('#')
    S.push(pa.startSymbol)
    str_len = len(str)
    i = 0
    while i < str_len:
        X = S.peek()
        if X in pa.terSymbol:
            if X == str[i]: # X==str[i] 不等于‘#’
                S.pop()
                i += 1
            else:
                print("error") # X不等于str[i]
                break
        elif X == '#':
            if X == str[i]:
                print("success")
                break
            else:
                print("error")
                break
        else:
            MI = pa.M.get(X)
            if MI is None:
                print("error")
                break
            else:
                candidate = MI.get(str[i])
                if candidate is not None:
                    S.pop()
                    formula = candidate[0]
                    if formula != '@':
                        index = len(formula) - 1 # 逆序入栈
                        while index >= 0:
                            if formula[index] == "'":
                                S.push(formula[index - 1:index + 1])
                                index -= 2
                            else:
                                S.push(formula[index])
                                index -= 1
                else:
                    print("error")
                    break

 

测试

测试数据(输入串、文法开始符号、文法终结符、文法非终结符、文法产生式):

i*i+i#
E
i + * ( )
E E' T T' F
E->TE' E'->+TE'|@ T->FT' T'->*FT'|@ F->(E)|i

测试结果:

success

  • 0
    点赞
  • 8
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值