The Road to Compiler Principles (4): Solutions to Typical Exercises from Chapter 4, Syntax Analysis

• 4.2.1 (1-3), 4.2.2 (1-5), 4.2.3 (1-3)
• 4.3.1
• 4.4.1 (1-5), 4.4.3, 4.4.4
• 4.5.1
• 4.6.2, 4.6.5, 4.6.6
• 4.7.4, 4.7.5

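4.2.1 (1-3)

Grammar: S -> SS+ | SS* | a, input string aa+a*.

Leftmost derivation: S => SS* => SS+S* => aS+S* => aa+S* => aa+a*
Rightmost derivation: S => SS* => Sa* => SS+a* => Sa+a* => aa+a*

The parse tree for the leftmost derivation is shown below.
[Figure: parse tree for aa+a*]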
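The two derivations for aa+a* can be replayed mechanically. The sketch below is only an illustration: the helper name `derive` and the encoding of each production body as a plain string of single-character symbols are assumptions made here, not something from the textbook.

```python
def derive(start, bodies, nonterminals, leftmost=True):
    """Apply the given production bodies in order, always expanding the
    leftmost (or rightmost) nonterminal; return every sentential form."""
    form = [start]
    steps = [start]
    for body in bodies:
        # Positions of all nonterminals in the current sentential form.
        positions = [i for i, sym in enumerate(form) if sym in nonterminals]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = list(body)
        steps.append("".join(form))
    return steps

# Grammar S -> SS+ | SS* | a, target string aa+a*
left = derive("S", ["SS*", "SS+", "a", "a", "a"], {"S"})
right = derive("S", ["SS*", "a", "SS+", "a", "a"], {"S"}, leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both runs end in aa+a*, and the intermediate forms are the ones listed in the two derivations.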

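4.2.2 (1-5) Deriving strings from grammars

These are fairly simple and can be solved by inspection. For more complex cases, you can also break the string apart and work backwards.

1. Leftmost: S => 0S1 => 00S11 => 000111
   Rightmost: S => 0S1 => 00S11 => 000111
2. Leftmost: S => +SS => +*SSS => +*aSS => +*aaS => +*aaa
   Rightmost: S => +SS => +Sa => +*SSa => +*Saa => +*aaa
3. Leftmost: S => S(S)S => (S)S => (S(S)S)S => ((S)S)S => (()S)S => (()S(S)S)S => (()(S)S)S => (()()S)S => (()())S => (()())
   Rightmost: S => S(S)S => S(S) => S(S(S)S) => S(S(S)S(S)S) => S(S(S)S(S)) => S(S(S)S()) => S(S(S)()) => S(S()()) => S(()()) => (()())
   (The grammar is ambiguous, so other derivations of the same string exist.)
4. Leftmost: S => SS => S*S => (S)*S => (S+S)*S => (a+S)*S => (a+a)*S => (a+a)*a
   Rightmost: S => SS => Sa => S*a => (S)*a => (S+S)*a => (S+a)*a => (a+a)*a
5. Leftmost: S => (L) => (L,S) => (L,S,S) => (S,S,S) => ((L),S,S) => ((L,S),S,S) => ((S,S),S,S) => ((a,S),S,S) => ((a,a),S,S) => ((a,a),a,S) => ((a,a),a,(L)) => ((a,a),a,(S)) => ((a,a),a,(a))
   Rightmost: S => (L) => (L,S) => (L,(L)) => (L,(S)) => (L,(a)) => (L,S,(a)) => (L,a,(a)) => (S,a,(a)) => ((L),a,(a)) => ((L,S),a,(a)) => ((L,a),a,(a)) => ((S,a),a,(a)) => ((a,a),a,(a))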

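The parse trees below all correspond to the leftmost derivations.

2.

3. (The drawing tool cannot type ε, so null is used instead.)
[Figure: parse tree]
4.
[Figure: parse tree]
5.
[Figure: parse tree]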


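4.2.3 (1-3) Designing grammars for languages

1. S -> (0?1)*

"All strings" means the outermost level of the pattern must be a *.
"Every 0 is followed by at least one 1" means the 0 in front of each 1 is optional.
Written as a plain context-free grammar: S -> 01S | 1S | ε.

2. S -> 0S0 | 1S1 | 1 | 0 | ε

Palindromes are naturally handled with recursion.

3. S -> 0S1S | 1S0S | ε

Each recursive step that inserts a 0 also inserts a 1, and vice versa.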
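A quick way to gain confidence in grammar 3 is to enumerate everything it generates up to a small length and compare against a brute-force list of strings with equally many 0s and 1s. This is only a sanity check, not a proof; the function names and the length bound `MAX_LEN` are choices made for this sketch.

```python
from itertools import product

def generated(max_len):
    """Strings of length <= max_len derivable from S -> 0S1S | 1S0S | ε."""
    lang = {""}                      # S -> ε
    while True:
        new = set(lang)
        for x in lang:
            for y in lang:
                if len(x) + len(y) + 2 <= max_len:
                    new.add("0" + x + "1" + y)   # S -> 0S1S
                    new.add("1" + x + "0" + y)   # S -> 1S0S
        if new == lang:              # fixed point reached
            return lang
        lang = new

def balanced(max_len):
    """All binary strings of length <= max_len with as many 0s as 1s."""
    out = set()
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if s.count("0") == s.count("1"):
                out.add(s)
    return out

MAX_LEN = 6
print(generated(MAX_LEN) == balanced(MAX_LEN))
```

The same enumeration idea works for the palindrome grammar in item 2 by swapping in its productions and a palindrome test.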

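4.3.1 Eliminating left recursion

1. Left factoring
[Figure]
This grammar has no common left factors, so we go straight to eliminating left recursion by applying the standard formula:
rexpr -> rterm rexpr'
rexpr' -> + rterm rexpr' | ε

rterm -> rfactor rterm'
rterm' -> rfactor rterm' | ε

rfactor -> rprimary rfactor'
rfactor' -> * rfactor' | ε

After left recursion is eliminated, the grammar is suitable for top-down parsing.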
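The formula for immediate left recursion (A -> Aα | β becomes A -> βA', A' -> αA' | ε) is easy to mechanize. The sketch below is a minimal illustration. Since the original grammar survives only as a figure, the input here is assumed to be the textbook grammar for this exercise (rexpr, rterm, rfactor, rprimary), and the function name and list-of-symbols encoding are made up for the example.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite  A -> A α1 | ... | β1 | ...  as
         A  -> β1 A' | ...
         A' -> α1 A' | ... | ε
    Each body is a list of symbols; [] stands for ε."""
    recursive = [b[1:] for b in bodies if b and b[0] == head]   # the α parts
    others = [b for b in bodies if not b or b[0] != head]       # the β parts
    if not recursive:
        return {head: bodies}
    new = head + "'"
    return {
        head: [b + [new] for b in others],
        new: [a + [new] for a in recursive] + [[]],
    }

# Assumed input grammar (textbook exercise 4.3.1).
grammar = {
    "rexpr": [["rexpr", "+", "rterm"], ["rterm"]],
    "rterm": [["rterm", "rfactor"], ["rfactor"]],
    "rfactor": [["rfactor", "*"], ["rprimary"]],
    "rprimary": [["a"], ["b"]],
}
result = {}
for head, bodies in grammar.items():
    result.update(eliminate_immediate_left_recursion(head, bodies))

for head, bodies in result.items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

This handles only immediate left recursion, which is all this grammar needs; indirect left recursion requires the full ordering-based algorithm from the book.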


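4.4.1 (1-5)

Solved together with 4.4.4 below.

4.4.3

FIRST: start from the leftmost symbol of each production body. A terminal goes in directly; for a nonterminal, add its FIRST set, and if that nonterminal can derive ε, continue with the second symbol.
FOLLOW: find every production body that contains the target nonterminal. Its FOLLOW set includes FIRST of whatever stands immediately to its right; if everything to its right can derive ε (or nothing is there), also add FOLLOW of the production's head.

S -> SS+ | SS* | a

FIRST(S) = {a}
FOLLOW(S) = {a, +, *, $}
(a is in FOLLOW(S) because in SS+ and SS* the first S is followed by another S, and FIRST(S) = {a}.)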
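The two rules can be run as a fixed-point iteration: keep applying them until no set grows. The sketch below is one straightforward way to do that; the grammar encoding (a dict from nonterminal to a list of bodies, with [] for ε) and the function names are assumptions of this example.

```python
EPS, END = "ε", "$"

def first_of(seq, first, nonterminals):
    """FIRST of a symbol sequence, given the FIRST sets of the nonterminals."""
    out = set()
    for sym in seq:
        if sym not in nonterminals:          # terminal: it starts the string
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)                             # every symbol can vanish
    return out

def first_follow(grammar, start):
    nts = set(grammar)
    first = {a: set() for a in nts}
    follow = {a: set() for a in nts}
    follow[start].add(END)
    changed = True
    while changed:                           # iterate to a fixed point
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                f = first_of(body, first, nts)
                if not f <= first[head]:
                    first[head] |= f
                    changed = True
                for i, sym in enumerate(body):
                    if sym not in nts:
                        continue
                    rest = first_of(body[i + 1:], first, nts)
                    add = rest - {EPS}
                    if EPS in rest:          # tail can vanish: inherit FOLLOW(head)
                        add |= follow[head]
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return first, follow

grammar = {"S": [["S", "S", "+"], ["S", "S", "*"], ["a"]]}
first, follow = first_follow(grammar, "S")
print(sorted(first["S"]), sorted(follow["S"]))
```

The same function can be pointed at the five grammars of 4.4.4 to cross-check the hand-computed sets.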

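4.4.4

Predictive parsers and parsing tables, together with FIRST and FOLLOW sets

Method (this is my own condensed summary and may be hard to follow; to really understand it you still need to read the book and work the exercises):

  1. Left-factor and eliminate left recursion.
  2. Compute FIRST for every nonterminal (bottom-up, following the body symbol by symbol: whenever a nonterminal in the body can derive ε, go on and add FIRST of the next symbol), then FOLLOW (top-down: look at every body that contains the target nonterminal and add FIRST of what follows it; the start symbol always gets $).
  3. Fill the table row by row, left to right. For each production A -> α, enter it under every terminal in FIRST(α). If FIRST(α) contains ε, also enter it under every terminal in FOLLOW(A), and under the $ column too if FOLLOW(A) contains $.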
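Step 3 can be sketched in a few lines once FIRST and FOLLOW are known. The example below uses grammar (5) of this section after left-recursion elimination, with the hand-computed sets typed in directly; the dict-based table layout and the names `first_of` and `build_table` are assumptions of this sketch, not a fixed API.

```python
EPS = "ε"

# Grammar (5) after left-recursion elimination; [] is the ε-body.
GRAMMAR = {
    "S": [["(", "L", ")"], ["a"]],
    "L": [["S", "L'"]],
    "L'": [[",", "S", "L'"], []],
}
# FIRST and FOLLOW as computed by hand for this grammar.
FIRST = {"S": {"(", "a"}, "L": {"(", "a"}, "L'": {",", EPS}}
FOLLOW = {"S": {",", ")", "$"}, "L": {")"}, "L'": {")"}}

def first_of(seq):
    """FIRST of a body, built from the FIRST sets of its symbols."""
    out = set()
    for sym in seq:
        if sym not in GRAMMAR:               # terminal
            out.add(sym)
            return out
        out |= FIRST[sym] - {EPS}
        if EPS not in FIRST[sym]:
            return out
    out.add(EPS)
    return out

def build_table():
    """M[A, t] -> list of bodies; more than one body in a cell is a conflict."""
    table = {}
    for head, bodies in GRAMMAR.items():
        for body in bodies:
            f = first_of(body)
            cols = f - {EPS}
            if EPS in f:                     # body can vanish: use FOLLOW(head)
                cols |= FOLLOW[head]
            for t in cols:
                table.setdefault((head, t), []).append(body)
    return table

table = build_table()
conflicts = {k: v for k, v in table.items() if len(v) > 1}
for (head, t), bodies in sorted(table.items()):
    print(f"M[{head}, {t}] =", " | ".join(" ".join(b) or EPS for b in bodies))
print("LL(1):", not conflicts)
```

A cell holding two or more bodies signals that the grammar is not LL(1), which is what happens for grammars (3) and (4).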

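1

S -> 0S1 | 01
1. Left-factor:
S -> 0A
A -> S1 | 1
2. There is no left recursion to remove; substituting S into A gives:
S -> 0A
A -> 0A1 | 1

| Nonterminal | FIRST | FOLLOW |
|---|---|---|
| S | {0} | {$} |
| A | {0, 1} | {1, $} |

In the parsing table, the first column lists the nonterminals and the other columns are input symbols.

| Nonterminal | 0 | 1 | $ |
|---|---|---|---|
| S | S -> 0A | | |
| A | A -> 0A1 | A -> 1 | |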

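2

S -> +SS | *SS | a
No common left factors and no left recursion.

| Nonterminal | FIRST | FOLLOW |
|---|---|---|
| S | {+, *, a} | {+, *, a, $} |

| Nonterminal | + | * | a | $ |
|---|---|---|---|---|
| S | S -> +SS | S -> *SS | S -> a | |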

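3

S -> S(S)S | ε
Eliminate left recursion (here β = ε, so S -> βS' is simply S -> S'):
S -> S'
S' -> (S)SS' | ε

| Nonterminal | FIRST | FOLLOW |
|---|---|---|
| S | {(, ε} | {(, ), $} |
| S' | {(, ε} | {(, ), $} |

| Nonterminal | ( | ) | $ |
|---|---|---|---|
| S | S -> S' | S -> S' | S -> S' |
| S' | S' -> (S)SS' / S' -> ε | S' -> ε | S' -> ε |

The cell M[S', (] holds two productions, so this grammar is not LL(1); the original grammar is ambiguous.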

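4

S -> S+S | SS | (S) | S* | a
Left-factor:
S -> SA | (S) | a
A -> +S | S | *

When more than one alternative does not begin with S, group them under a new nonterminal:
S -> SA | T
A -> +S | S | *
T -> (S) | a

Eliminate left recursion:
S -> TS'
S' -> AS' | ε
A -> +S | TS' | *
T -> (S) | a

This one is much harder than the previous ones, so take care. Fill in FIRST from the bottom production up and FOLLOW from the top down, FIRST before FOLLOW.
All four FOLLOW sets come out identical. That looks suspicious, but it does follow from the rules: A can end in S or S', S' is nullable after both A and T, so each set absorbs the others.

| Nonterminal | FIRST | FOLLOW |
|---|---|---|
| S | {(, a} | {+, (, ), a, *, $} |
| S' | {+, (, a, *, ε} | {+, (, ), a, *, $} |
| A | {+, (, a, *} | {+, (, ), a, *, $} |
| T | {(, a} | {+, (, ), a, *, $} |

| Nonterminal | a | + | * | ( | ) | $ |
|---|---|---|---|---|---|---|
| S | S -> TS' | | | S -> TS' | | |
| S' | S' -> AS' / S' -> ε | S' -> AS' / S' -> ε | S' -> AS' / S' -> ε | S' -> AS' / S' -> ε | S' -> ε | S' -> ε |
| A | A -> TS' | A -> +S | A -> * | A -> TS' | | |
| T | T -> a | | | T -> (S) | | |

Because FOLLOW(S') overlaps FIRST(AS'), the S' row has conflicts under a, +, * and (. The grammar is ambiguous and therefore not LL(1).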

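5

S -> (L) | a
L -> L,S | S

Eliminate left recursion:
S -> (L) | a
L -> SL'
L' -> ,SL' | ε

| Nonterminal | FIRST | FOLLOW |
|---|---|---|
| S | {(, a} | {",", ), $} |
| L | {(, a} | {)} |
| L' | {",", ε} | {)} |

| Nonterminal | ( | , | ) | a | $ |
|---|---|---|---|---|---|
| S | S -> (L) | | | S -> a | |
| L | L -> SL' | | | L -> SL' | |
| L' | | L' -> ,SL' | L' -> ε | | |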

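4.5.1

As the figure shows, when symbols are pushed onto the stack during shift-reduce parsing, the handle is the production body that appears on top of the stack just before a reduction.
[Figure]

1. See the figure below. In 000111, once the first 1 is shifted, 01 is popped and S is pushed,
so the handle is 01.
[Figure]
2. By the same reasoning, the handle is 0S1.
[Figure]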
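The stack behaviour described above can be simulated directly. The sketch below is a deliberately naive shift-reduce loop for S -> 0S1 | 01 that reduces as soon as the top of the stack matches a production body. That greedy rule happens to be enough for this grammar, but it is not a general LR algorithm; the function name and return format are invented for the example.

```python
def shift_reduce(tokens):
    """Naive shift-reduce parse for S -> 0S1 | 01.
    Returns the handles in the order they are reduced."""
    bodies = [["0", "S", "1"], ["0", "1"]]
    stack, handles = [], []
    for tok in tokens:
        stack.append(tok)                      # shift
        reduced = True
        while reduced:                         # reduce while a body is on top
            reduced = False
            for body in bodies:
                if stack[-len(body):] == body:
                    handles.append("".join(body))
                    del stack[-len(body):]
                    stack.append("S")          # replace the handle by S
                    reduced = True
                    break
    if stack != ["S"]:
        raise ValueError("input rejected")
    return handles

print(shift_reduce("000111"))   # ['01', '0S1', '0S1']
print(shift_reduce("00S11"))    # ['0S1', '0S1']
```

The first handle reported for each input is the answer to the corresponding part of the exercise.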


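4.6.2

S -> SS+ | SS* | a
Left-factor: keep the common prefix and replace the differing tails with a new nonterminal, all in one step.
S -> SSA | a
A -> + | *

Then eliminate left recursion. Note that the body is SSA: apply the formula, treating the second S as if it were a terminal.
S -> aS'
S' -> SAS' | ε
A -> + | *
Now S and S' refer to each other, so we substitute the first production into the second.
In addition, the start symbol S needs an augmenting production S' -> S, because only that can signal the accepting state. S on its own cannot, since the recursion lets S appear in many places. To keep it apart from the S' above, we name the new symbol S''.
So the answer is
S'' -> S
S -> aS'
S' -> aS'AS' | ε
A -> + | *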

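4.6.5

[Figure]

Criteria for an LL(1) grammar:
[Figure]
FIRST(AaAb) = {a} and FIRST(BbBa) = {b} are disjoint, and neither alternative derives ε, so the grammar is LL(1).

However, FOLLOW(A) = FOLLOW(B) = {a, b}, so when the next input symbol is a or b the parser cannot tell whether to reduce by A -> ε or by B -> ε. This reduce-reduce conflict means the grammar is not SLR(1).
Put another way: the initial item set contains both A -> . and B -> ., so there is a reduce-reduce conflict.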
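The reduce-reduce conflict can be made concrete by computing the LR(0) closure of the initial item and asking which completed items SLR(1) would reduce on each terminal. The sketch assumes the textbook grammar for this exercise (S -> AaAb | BbBa, A -> ε, B -> ε), augmented with S' -> S; the item encoding as (head, body, dot) tuples is a choice made here.

```python
# Assumed grammar for exercise 4.6.5, augmented with S' -> S. [] is the ε-body.
GRAMMAR = {
    "S'": [["S"]],
    "S": [["A", "a", "A", "b"], ["B", "b", "B", "a"]],
    "A": [[]],
    "B": [[]],
}
FOLLOW = {"A": {"a", "b"}, "B": {"a", "b"}}   # as computed in the text

def closure(items):
    """LR(0) closure; an item is (head, body_tuple, dot_position)."""
    items = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            # Dot stands before a nonterminal: add all its productions.
            for prod in GRAMMAR[body[dot]]:
                item = (body[dot], tuple(prod), 0)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return items

i0 = closure({("S'", ("S",), 0)})
complete = [(h, b) for h, b, d in i0 if d == len(b)]   # dot at the end

# SLR(1) reduces by A -> α on every terminal in FOLLOW(A).
for t in ["a", "b"]:
    reducers = sorted(h for h, _ in complete if t in FOLLOW[h])
    print(t, reducers)
```

Both a and b list two candidate reductions (A and B), which is exactly the conflict.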

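4.6.6

[Figure]
The grammar is left-recursive, so it is clearly not LL(1).
It is SLR(1), because no conflicts turn up when the SLR item sets are built.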


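4.7.4

[Figure]
It is easy to see from the grammar that after d has been read, a shift-reduce conflict arises.
SLR(1) decides whether to reduce by looking at FOLLOW of the nonterminal being reduced to. The state contains
S -> d.c
A -> d.
and FOLLOW(A) = {a, c}.
The shift item wants c and FOLLOW(A) also contains c, so SLR(1) cannot resolve the conflict.

The grammar is LR(1), however: LR(1) adds lookahead symbols on top of SLR(1).
The item set becomes {[A -> d., a], [S -> d.c, $]}, and the conflict is gone.

Next, check whether it is LALR(1): starting from the LR(1) automaton, merge the states that share a core (same items, different lookaheads). If no conflict results, the grammar is LALR(1).

Build the LR(1) item sets and draw the DFA. (The figure has one mistake: I4 and I2 should be merged into a single state. It was already drawn, so I left it as is; it does not affect the conclusion.)
[Figure: LR(1) DFA]
The DFA shows no redundant states that would introduce a conflict when merged, so the grammar is also LALR(1).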

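4.7.5


First build the LR(1) item sets, then draw the DFA.
[Figure: LR(1) DFA]

As the figure shows, I5 and I9 have the same core but are reached along different paths, so their lookaheads differ: one holds [A -> d., a] and [B -> d., c], the other [A -> d., c] and [B -> d., a]. Merging them would let both reductions fire on a and on c, a reduce-reduce conflict, so the grammar is not LALR(1).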
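The effect of merging the two states can be checked with a few lines of set arithmetic. The sketch assumes the textbook grammar for this exercise (S -> Aa | bAc | Bc | bBa, A -> d, B -> d) and writes out by hand the two LR(1) states reached after d and after b d; the variable and function names are invented here.

```python
# Completed LR(1) items and their lookaheads in the two states with the same core.
state_after_d = {"A -> d.": {"a"}, "B -> d.": {"c"}}    # reached on d
state_after_bd = {"A -> d.": {"c"}, "B -> d.": {"a"}}   # reached on b d

def reduce_reduce_conflicts(state):
    """Lookaheads on which more than one completed item wants to reduce."""
    seen, clash = set(), set()
    for lookaheads in state.values():
        clash |= seen & lookaheads
        seen |= lookaheads
    return clash

def merge(s1, s2):
    """LALR merge of two states with identical cores: union the lookaheads."""
    assert s1.keys() == s2.keys()
    return {item: s1[item] | s2[item] for item in s1}

merged = merge(state_after_d, state_after_bd)
print(sorted(reduce_reduce_conflicts(state_after_d)))    # []
print(sorted(reduce_reduce_conflicts(state_after_bd)))   # []
print(sorted(reduce_reduce_conflicts(merged)))           # ['a', 'c']
```

Each LR(1) state is conflict-free on its own, and the conflict appears only after the merge, which is the defining symptom of a grammar that is LR(1) but not LALR(1).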
