Compiler Principles: Implementing an LL(1) Grammar

爱上代码的虫

Last modified 2024-04-09 15:53:03

Tags: python

First published 2023-04-09 15:55:39

Article link: https://blog.csdn.net/qq_43370683/article/details/130043232

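Implementing a parser for context-free grammars (LL(1))

Recently I wanted to revisit the programs I wrote for my compiler principles course. I worked on this code day and night back then (lexical analysis, syntax analysis, and so on), and it was also how I started learning Python. Sample runs are at the end.

Definition of a grammar

A grammar defines and describes the structure of a language. In other words, the formal rules used to describe and prescribe the structure of a language are called its "grammar" (or "syntax").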

Left recursion and right recursion

Left recursion

A = Aα | β

Right recursion

A = αA | β
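To make the two forms concrete, here is a minimal sketch of the textbook rewrite that turns the left-recursive form A = Aα | β into the right-recursive pair A = βA', A' = αA' | ε. The function name `eliminate_direct_left_recursion` and the list-of-symbols representation are assumptions made for this illustration; the author's own string-based version appears later in the post.

```python
def eliminate_direct_left_recursion(head, alternatives):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε.

    `alternatives` is a list of symbol lists. The result maps each
    left-hand side (A and the new A') to its list of alternatives.
    """
    new_head = head + "'"
    # The α parts: alternatives that start with the head itself
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    # The β parts: everything else
    others = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}
    return {
        head: [([] if beta == ["ε"] else beta) + [new_head] for beta in others],
        new_head: [alpha + [new_head] for alpha in recursive] + [["ε"]],
    }


rules = eliminate_direct_left_recursion(
    "S", [["S", "*", "a", "P"], ["a", "P"], ["*", "a", "P"]]
)
for lhs, alts in rules.items():
    for alt in alts:
        print(lhs, "->", " ".join(alt))
```

The order of the resulting productions depends on how the alternatives are partitioned, so it may differ from the numbering printed by the author's program.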

The LL parsing workflow

LL(1) parsing uses an explicit stack, rather than recursive calls, to carry out the parse.
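As a rough illustration of the explicit stack, the sketch below parses the expression grammar used in the second sample run at the end of this post. The table is written out by hand as a dictionary, and the names `TABLE`, `NONTERMINALS` and `ll1_accepts` are assumptions of this example, not part of the author's program.

```python
# Hand-written LL(1) table: (nonterminal, lookahead) -> right-hand side to push.
# An empty list stands for an ε-production.
TABLE = {
    ("E", "("): ["T", "E'"], ("E", "i"): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "("): ["F", "T'"], ("T", "i"): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "("): ["(", "E", ")"], ("F", "i"): ["i"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_accepts(text, start="E"):
    """Return True if the space-separated token string is derivable from `start`."""
    tokens = text.split() + ["$"]
    stack = ["$", start]  # the top of the stack is the end of the list
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:  # empty table cell: syntax error
                return False
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == look:  # terminal (or end marker) matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)


print(ll1_accepts("i * i + i * i"))
print(ll1_accepts("i +"))
```

No function calls itself here: the pending work that a recursive-descent parser would keep on the call stack lives in the `stack` list instead.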

Left factoring

Common left factors tend to cause backtracking.

Method: A = ab | ac

Rewritten as: A = aA', A' = b | c

# Left factoring (only the prefix shared by ALL alternatives is factored out)
def dealSameFactor(sp):
    newsp = []  # the rewritten grammar
    for s in sp:
        # A production without '|' has a single alternative: nothing to factor
        if '|' not in s or '->' not in s:
            newsp.append(s)
            continue
        head, tail = s.split('->', 1)  # left-hand side and right-hand side
        head = head.strip()
        # Split the right-hand side into alternatives, each a list of symbols
        newgroup = [g.split() for g in tail.split('|')]
        # Length of the prefix shared by all alternatives
        common = min(len(g) for g in newgroup)
        for g in newgroup:
            n = 0
            while n < common and g[n] == newgroup[0][n]:
                n += 1
            common = n
        if common < 1:
            newsp.append(s)
            continue
        # A -> prefix A'
        newsp.append(head + " -> " + ' '.join(newgroup[0][:common]) + ' ' + head + "'")
        # A' -> the remainder of each alternative; an empty remainder becomes ε
        rests = []
        for g in newgroup:
            rest = ' '.join(g[common:]) or 'ε'
            if rest not in rests:
                rests.append(rest)
        newsp.append(head + "' -> " + ' | '.join(rests))
    return newsp

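Handling left recursion

Left recursion easily sends the parser into an infinite loop. The code below eliminates direct left recursion; the attempt at indirect left recursion is left disabled inside a string literal.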

# Split the alternatives and eliminate direct left recursion.
# Symbols are expected to be separated by single spaces, e.g. "S -> S * a P | a P".
def dealSameLeft(sp):
    flag = 0  # set to 1 when the current production is left-recursive
    i = 0
    newsp = []  # the rewritten grammar
    while i < len(sp):
        s = sp[i]
        for j in range(len(s)):
            if s[j] == '>':
                head = s[:j - 1]  # left-hand side, trailing space included
                tail = s[j + 1:]  # right-hand side
                group = tail.split('|')  # the alternatives of the right-hand side
                # Strip whitespace around each alternative
                # for m in range(len(group)):
                # group[m] = group[m].strip()
                # Check for left recursion
                for g in group:
                    if head == g[1:len(head) + 1]:
                        flag = 1
                        break
                # saveg = []  # what head derives, used for substitution when removing left recursion
                # Rebuild the productions
                if flag == 0:
                    for g in group:
                        newsp.append(head + "->" + g)
                else:
                    # Reorder group so that the alternatives without left recursion come first
                    l = 0
                    r = len(group) - 1
                    while r > l:
                        while head != group[l][1:len(head) + 1] and r > l:
                            l += 1
                        while head == group[r][1:len(head) + 1] and r > l:
                            r -= 1
                        group[l], group[r] = group[r], group[l]
                    for g in group:
                        if head != g[1:len(head) + 1]:
                            # A -> β A'  (a lone ε is dropped: A -> A')
                            if g.strip() != "ε":
                                newsp.append(head + "-> " + g.strip() + ' ' + head.strip() + "' ")
                            else:
                                newsp.append(head + "-> " + head.strip() + "' ")
                            # saveg.append(g+head.strip()+"' ")
                        else:
                            # A' -> α A'
                            newsp.append(head.strip() + "' -> " + g[len(head) + 1:].strip() + ' ' + head.strip() + "' ")
                    newsp.append(head.strip() + "' ->" + ' ε')
                    flag = 0
                '''
                # Eliminate indirect left recursion
                for m in range(len(sp)):
                    if m > i:
                        g = sp[m]
                        divG = g.split("|")
                        for divg in divG:
                            divg_useBlank = divg.split(" ")
                            for divg_useblank in divg_useBlank:
                                if divg_useblank + " " == head:  # find the part to replace
                                    tem = []
                                    for q in range(len(saveg)):
                                        tem.append(divg.replace(head, saveg[q]))  # substitute
                                    divG.remove(divg)
                                    for q in tem:
                                        divG.append(q)
                        sp[m] = "|".join(divG)
                '''
        i += 1
    return newsp

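Constructing the FIRST sets

# Compute the FIRST sets. Each entry of table is [head, sym1, sym2, ...].
# Only the first right-hand-side symbol of each production is examined.
# joinF is not shown in this post; it presumably merges two lists without duplicates.
def toFirst(table, ter, non_ter):
    first = []
    # Start with an empty FIRST set for every nonterminal
    for i in range(len(non_ter)):
        first.append([])
    # Productions whose right-hand side starts with a terminal (ε included)
    for i in range(len(table)):
        if table[i][1] != table[i][0]:
            if table[i][1] in ter:
                j = non_ter.index(table[i][0])
                if table[i][1] not in first[j]:
                    first[j].append(table[i][1])
    # Productions whose right-hand side starts with a nonterminal:
    # repeat len(table) times so the sets can propagate along chains
    for k in range(len(table)):
        for i in range(len(table)):
            if table[i][1] != table[i][0]:
                if table[i][1] in non_ter:
                    j = non_ter.index(table[i][0])
                    m = non_ter.index(table[i][1])
                    first[j] = joinF(first[j], first[m])
    return first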
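The helper `joinF` is called by the code in this post but never listed. The sketch below assumes it is an order-preserving union of two lists, and uses it in a worklist-style FIRST computation that also looks past nonterminals that can derive ε. Both `joinF` as written here and `first_sets` are hypothetical illustrations, not the author's code.

```python
EPS = "ε"


def joinF(a, b):
    """Assumed behaviour: a followed by the elements of b not already in a."""
    return a + [x for x in b if x not in a]


def first_sets(productions, non_terminals):
    """productions: lists of the form [head, sym1, sym2, ...]."""
    first = {n: [] for n in non_terminals}
    changed = True
    while changed:  # iterate until no set grows any more
        changed = False
        for head, *body in productions:
            before = len(first[head])
            all_nullable = True
            for sym in body:
                if sym == EPS:
                    break
                if sym not in first:  # terminal: it starts the string
                    first[head] = joinF(first[head], [sym])
                    all_nullable = False
                    break
                first[head] = joinF(first[head], [x for x in first[sym] if x != EPS])
                if EPS not in first[sym]:  # sym cannot vanish: stop here
                    all_nullable = False
                    break
            if all_nullable:
                first[head] = joinF(first[head], [EPS])
            changed = changed or len(first[head]) != before
    return first


productions = [
    ["E", "T", "E'"], ["E'", "+", "T", "E'"], ["E'", "ε"],
    ["T", "F", "T'"], ["T'", "*", "F", "T'"], ["T'", "ε"],
    ["F", "(", "E", ")"], ["F", "i"],
]
non_terminals = ["E", "E'", "T", "T'", "F"]
first = first_sets(productions, non_terminals)
for name in non_terminals:
    print(name, first[name])
```

Looping until nothing changes avoids having to guess how many passes are enough.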

Constructing the FOLLOW sets

# Compute the FOLLOW sets
def toFollow(table, ter, non_ter, first):
    follow = []
    # Start with an empty FOLLOW set for every nonterminal
    for i in range(len(non_ter)):
        follow.append([])
    follow[0].append('$')  # the start symbol is followed by the end marker
    for i in range(2 * len(non_ter)):  # number of passes
        for j in range(len(non_ter)):  # for each nonterminal
            ch = non_ter[j]
            for m in range(len(table)):  # for each production
                for n in range(len(table[m]) - 1):
                    if table[m][n + 1] == ch:
                        if n + 1 == len(table[m]) - 1:
                            # ch is the last symbol: add FOLLOW of the left-hand side
                            pos = non_ter.index(table[m][0])
                            follow[j] = joinF(follow[j], follow[pos])
                        else:
                            if table[m][n + 2] in ter:
                                # ch is followed by a terminal
                                if table[m][n + 2] not in follow[j]:
                                    follow[j].append(table[m][n + 2])
                            else:
                                # ch is followed by a nonterminal: add its FIRST set
                                pos = non_ter.index(table[m][n + 2])
                                follow[j] = joinF(follow[j], first[pos])
                                if 'ε' in first[pos]:
                                    # that nonterminal can vanish: drop ε and add its FOLLOW too
                                    follow[j].remove('ε')
                                    follow[j] = joinF(follow[j], follow[pos])
    return follow
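For comparison, here is a compact set-based sketch of the same idea that scans each right-hand side from right to left and repeats until nothing changes. The FIRST sets are copied from the second sample run at the end of this post; `follow_sets` and the dictionary layout are assumptions of this example.

```python
EPS, END = "ε", "$"


def follow_sets(productions, non_terminals, first):
    """productions: lists [head, sym1, ...]; first: FIRST sets of the nonterminals."""
    follow = {n: set() for n in non_terminals}
    follow[non_terminals[0]].add(END)  # the start symbol is followed by the end marker
    changed = True
    while changed:
        changed = False
        for head, *body in productions:
            trailer = set(follow[head])  # what can come after the rest of the body
            for sym in reversed(body):
                if sym in follow:  # nonterminal
                    if not trailer <= follow[sym]:
                        follow[sym] |= trailer
                        changed = True
                    if EPS in first[sym]:
                        trailer = trailer | (set(first[sym]) - {EPS})
                    else:
                        trailer = set(first[sym])
                elif sym != EPS:  # terminal
                    trailer = {sym}
    return follow


productions = [
    ["E", "T", "E'"], ["E'", "+", "T", "E'"], ["E'", "ε"],
    ["T", "F", "T'"], ["T'", "*", "F", "T'"], ["T'", "ε"],
    ["F", "(", "E", ")"], ["F", "i"],
]
non_terminals = ["E", "E'", "T", "T'", "F"]
first = {"E": ["(", "i"], "E'": ["+", "ε"], "T": ["(", "i"],
         "T'": ["*", "ε"], "F": ["(", "i"]}
follow = follow_sets(productions, non_terminals, first)
for name in non_terminals:
    print(name, sorted(follow[name]))
```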

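Constructing the parsing table

When a nonterminal A is on top of the parsing stack, a decision has to be made, based on the current input token (the lookahead) and the parsing method just described, about which grammar rule for A to use when replacing A on the stack. By contrast, no such decision is needed when a token is on top of the stack: either it equals the current input token (and a match occurs), or it does not (and an error occurs).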

# Build the parsing table: move[nonterminal][terminal] = production number
def toMove(table, first, follow, ter, non_ter):
    # Initialize every cell to ' ' (empty)
    move = []
    for i in range(len(non_ter)):
        tem = []
        for j in range(len(ter)):
            tem.append(' ')
        move.append(tem)
    for i in range(len(table)):
        if table[i][1] in ter:  # the right-hand side starts with a terminal
            if table[i][1] != 'ε':
                x = ter.index(table[i][1])
                y = non_ter.index(table[i][0])
                move[y][x] = i + 1
            else:
                # ε-production: enter it under every symbol in FOLLOW(head)
                pos = non_ter.index(table[i][0])
                for fo in follow[pos]:
                    if fo == '$':
                        fo = 'ε'  # the end marker '$' is recorded in the ε column
                    x = ter.index(fo)
                    y = non_ter.index(table[i][0])
                    if move[y][x] == ' ':
                        move[y][x] = i + 1
        else:  # the right-hand side starts with a nonterminal
            pos = non_ter.index(table[i][1])
            for f in first[pos]:
                if f != 'ε':
                    x = ter.index(f)
                    y = non_ter.index(table[i][0])
                    move[y][x] = i + 1
                else:
                    # the leading nonterminal can vanish: use its FOLLOW set as well
                    for fo in follow[pos]:
                        if fo == '$':
                            fo = 'ε'
                        x = ter.index(fo)
                        y = non_ter.index(table[i][0])
                        move[y][x] = i + 1
    return move
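The same construction can be sketched with dictionaries and an explicit conflict check, which reports a grammar that is not LL(1) instead of silently overwriting a cell. The FIRST and FOLLOW sets below are copied from the second sample run at the end of this post; `first_of_sequence` and `build_table` are hypothetical names introduced for this illustration.

```python
EPS, END = "ε", "$"


def first_of_sequence(symbols, first):
    """FIRST of a string of symbols, given the FIRST sets of the nonterminals."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        if sym not in first:  # terminal
            result.add(sym)
            return result
        result |= set(first[sym]) - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)  # every symbol in the string can vanish
    return result


def build_table(productions, first, follow):
    """Return {(nonterminal, terminal): production number}, numbering from 1."""
    table = {}
    for number, (head, *body) in enumerate(productions, start=1):
        lookaheads = first_of_sequence(body, first)
        if EPS in lookaheads:
            lookaheads = (lookaheads - {EPS}) | set(follow[head])
        for terminal in lookaheads:
            key = (head, terminal)
            if key in table:
                raise ValueError(f"not LL(1): conflict at M[{head}, {terminal}]")
            table[key] = number
    return table


productions = [
    ["E", "T", "E'"], ["E'", "+", "T", "E'"], ["E'", "ε"],
    ["T", "F", "T'"], ["T'", "*", "F", "T'"], ["T'", "ε"],
    ["F", "(", "E", ")"], ["F", "i"],
]
first = {"E": {"(", "i"}, "E'": {"+", "ε"}, "T": {"(", "i"},
         "T'": {"*", "ε"}, "F": {"(", "i"}}
follow = {"E": {"$", ")"}, "E'": {"$", ")"}, "T": {"+", "$", ")"},
          "T'": {"+", "$", ")"}, "F": {"*", "+", "$", ")"}}
table = build_table(productions, first, follow)
for key in sorted(table):
    print(key, table[key])
```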

The parsing procedure

# Run the table-driven parse and print every step.
# non_ter, ter and move are taken from the enclosing scope, as in the original listing.
def match(self, S):
    tem = "Stack" + '{0:>82}'.format("Input") + '{0:>40}'.format("Action")
    print(tem)
    A = []  # parsing stack, top at A[0]
    A.append(non_ter[0])
    B = S.split()  # remaining input tokens
    B.append('$')
    i = 1  # step counter

    def row(action):
        # One output line: step number, stack, remaining input, action
        return str(i) + '\t' + '{0:>40}'.format(' '.join(A)) + "$" + '{0:>55}'.format(' '.join(B)) + \
            '{0:>27}'.format(action)

    a = A[0]
    b = B[0]
    y = non_ter.index(a)
    x = ter.index(b)
    C = move[y][x]  # production number from the parsing table, ' ' if the cell is empty
    while len(A) != 0 and len(B) != 0:
        if C == ' ':
            print(row("reject"))
            return
        print(row(str(C)))
        i = i + 1
        # Expand: replace the nonterminal on top of the stack with the right-hand side
        A.pop(0)
        if self.table[C - 1][1] != 'ε':
            for j in range(len(self.table[C - 1]) - 1):
                A.insert(j, self.table[C - 1][j + 1])
        # Match: pop terminals that agree with the input
        while len(A) != 0 and len(B) != 0 and A[0] == B[0]:
            print(row("match"))
            i = i + 1
            A.pop(0)
            B.pop(0)
        if len(A) != 0 and len(B) != 0:
            a = A[0]
            b = B[0]
            if a in non_ter and b in ter:
                C = move[non_ter.index(a)][ter.index(b)]
            else:
                C = ' '  # mismatched terminal on the stack, or an unknown token
    # The stack is empty: accept only if nothing but the end marker is left
    if B == ['$']:
        print(row("accept"))
    else:
        print(row("reject"))

Sample runs

Enter the grammar and the expression to test:
S -> S * a P | a P | * a P 
P -> + a P | + a 
a + a * a + a

Rewritten grammar:
1       S -> * a P S'
2       S -> a P S'
3       S' -> * a P S'
4       S' -> ε
5       P -> + a P'
6       P' -> P
7       P' -> ε

FIRST sets:
S                   *         a
S'                  *         ε
P                   +
P'                  ε         +

FOLLOW sets:
S                   $
S'                  $
P                   *         $
P'                  *         $

Parsing table:
M[N,T]              *                   a                   $                   +
S                   1                   2
S'                  3                                       4
P                                                                               5
P'                  7                                       7                   6

Parsing steps:
    Stack         Input             Action
1   S$            a + a * a + a $   2
2   a P S'$       a + a * a + a $   match
3   P S'$         + a * a + a $     5
4   + a P' S'$    + a * a + a $     match
5   a P' S'$      a * a + a $       match
6   P' S'$        * a + a $         7
7   S'$           * a + a $         3
8   * a P S'$     * a + a $         match
9   a P S'$       a + a $           match
10  P S'$         + a $             5
11  + a P' S'$    + a $             match
12  a P' S'$      a $               match
13  P' S'$        $                 7
14  S'$           $                 4
15  $             $                 accept
0 exit    1 continue
Enter the grammar and the expression to test:
E -> T E'
E' -> + T E' | ε
T -> F T'
T' -> * F T'| ε
F -> ( E ) | i
i * i + i * i

Rewritten grammar:
1       E -> T E'
2       E' -> + T E'
3       E' -> ε
4       T -> F T'
5       T' -> * F T'
6       T' -> ε
7       F -> ( E )
8       F -> i

FIRST sets:
E                   (         i
E'                  +         ε
T                   (         i
T'                  *         ε
F                   (         i

FOLLOW sets:
E                   $         )
E'                  $         )
T                   +         $         )
T'                  +         $         )
F                   *         +         $         )

Parsing table:
M[N,T]              +                   $                   *                   (                   )                   i
E                                                                               1                                       1
E'                  2                   3                                                           3
T                                                                               4                                       4
T'                  6                   6                   5                                       6
F                                                                               7                                       8

Parsing steps:
    Stack         Input             Action
1   E$            i * i + i * i $   1
2   T E'$         i * i + i * i $   4
3   F T' E'$      i * i + i * i $   8
4   i T' E'$      i * i + i * i $   match
5   T' E'$        * i + i * i $     5
6   * F T' E'$    * i + i * i $     match
7   F T' E'$      i + i * i $       8
8   i T' E'$      i + i * i $       match
9   T' E'$        + i * i $         6
10  E'$           + i * i $         2
11  + T E'$       + i * i $         match
12  T E'$         i * i $           4
13  F T' E'$      i * i $           8
14  i T' E'$      i * i $           match
15  T' E'$        * i $             5
16  * F T' E'$    * i $             match
17  F T' E'$      i $               8
18  i T' E'$      i $               match
19  T' E'$        $                 6
20  E'$           $                 3
21  $             $                 accept