1 Shift/Reduce
When yacc processes a parser, it creates a set of states each of which reflects a possible position in one or more partially parsed rules.
As the parser reads tokens, each time it reads a token that doesn’t complete a rule it pushes the token on an internal stack and switches to a new state reflecting the token it just read. This action is called a shift.
When it has found all the symbols that constitute the right-hand side of a rule, it pops the right-hand side symbols off the stack, pushes the left-hand side symbol onto the stack, and switches to a new state reflecting the new symbol on the stack. This action is called a reduction, since it usually reduces the number of items on the stack. (Not always, since it is possible to have rules with empty right-hand sides.)
Whenever yacc reduces a rule, it executes user code associated with the rule. This is how you actually do something with the material that the parser parses.
Often the action code builds a parse tree corresponding to the input, so that later code can process a whole statement or even a whole program at a time. (This is how the crow_bar language works.)
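As a rough illustration of the two actions described above, here is a minimal C sketch of a symbol stack. The names ParseStack, shift, and reduce are made up for this sketch, and it keeps only one character per grammar symbol; a real yacc parser also keeps state numbers and semantic values on its stack.

```c
#define MAX_DEPTH 64

/* Assumption: a toy parse stack that stores one character per symbol. */
typedef struct {
    char symbols[MAX_DEPTH];
    int  depth;
} ParseStack;

/* Shift: push the token that was just read.
   Returns the new depth, or -1 if the stack is full. */
static int shift(ParseStack *s, char token)
{
    if (s->depth >= MAX_DEPTH)
        return -1;
    s->symbols[s->depth++] = token;
    return s->depth;
}

/* Reduce: pop the rhs_len right-hand side symbols and push the
   left-hand side symbol. Returns the new depth, or -1 on misuse. */
static int reduce(ParseStack *s, int rhs_len, char lhs)
{
    if (rhs_len < 0 || rhs_len > s->depth)
        return -1;
    if (s->depth - rhs_len >= MAX_DEPTH)
        return -1;
    s->depth -= rhs_len;
    s->symbols[s->depth++] = lhs;
    return s->depth;
}
```

Shifting three symbols and then calling reduce(&s, 3, 'e') leaves one item on the stack, while reduce(&s, 0, 'x') adds an item, which is the empty right-hand side case mentioned above.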
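An example from the book lex & yacc that has a shift/reduce conflict:
Shift means pushing a token onto the stack; reduce means replacing symbols already on the stack with the left-hand side of a rule:
The parser can look ahead one token, so it sees the * after the 3 before it acts, which leaves it with the following two choices: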
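As a small worked example of why the choice matters, assume the input is 2+3*4 (the input used in the experiment in these notes) and that 2 + 3 is already on the stack when the parser sees the *. The two function names are hypothetical; they only spell out the arithmetic each choice leads to.

```c
/* Assumption: a, b, c stand for the operands of "a + b * c". */

/* Choice 1: reduce "a + b" to an expression first, then handle "* c". */
static double reduce_first(double a, double b, double c)
{
    return (a + b) * c;
}

/* Choice 2: shift "*" first, so "b * c" is reduced before the addition. */
static double shift_first(double a, double b, double c)
{
    return a + (b * c);
}
```

For 2+3*4 the first choice gives 20 and the second gives 14, so the grammar has to settle which one the parser takes.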
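The book gives two solutions. One is to use yacc's syntax to explicitly specify operator precedence (the book does not recommend this); the other is a layered grammar like the following:
Why rewriting the grammar this way gives multiplication higher precedence and removes the ambiguity needs further exploration of yacc.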
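One way to see it, sketched in C: with the expression / term / primary layers used in the .y file in these notes, a * can only appear inside a term, and an expression is built from finished terms, so 3*4 has to be grouped before the addition can happen. The sketch below mirrors those layers with a hand-written recursive-descent evaluator. That is a different technique from the tables yacc generates, and the Cursor type and function names are assumptions of this sketch. It does no whitespace or error handling.

```c
#include <stdlib.h>

/* Assumption: a hypothetical cursor over the input text. */
typedef struct {
    const char *p;
} Cursor;

/* primary: a number literal */
static double parse_primary(Cursor *c)
{
    char *end;
    double v = strtod(c->p, &end);
    c->p = end;
    return v;
}

/* term: primary, followed by any number of "* primary" */
static double parse_term(Cursor *c)
{
    double v = parse_primary(c);
    while (*c->p == '*') {
        c->p++;
        v *= parse_primary(c);
    }
    return v;
}

/* expression: term, followed by any number of "+ term" or "- term" */
static double parse_expression(Cursor *c)
{
    double v = parse_term(c);
    while (*c->p == '+' || *c->p == '-') {
        char op = *c->p++;
        double rhs = parse_term(c);   /* the whole term is finished first */
        v = (op == '+') ? v + rhs : v - rhs;
    }
    return v;
}

double evaluate(const char *text)
{
    Cursor c = { text };
    return parse_expression(&c);
}
```

Calling evaluate("2+3*4") multiplies 3 by 4 inside parse_term before parse_expression ever sees the right operand of the addition.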
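Experiment: use the example below, which has no shift/reduce conflict and gives multiplication higher precedence than addition:
After compiling and running it, enter 2+3*4. The computation proceeds as follows:
From the output, once the parser receives 3 and reduces it to term, it does not go on to reduce it to expression. Why? Is this a yacc rule that I have not found yet?
One post says that yacc reduces the smaller rule first: https://blog.csdn.net/nosources/article/details/38943265 . Relative to expression, term is the smaller, lower-level rule.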
Also, Chapter 8 of lex & yacc explains shift/reduce conflicts further. I have read part of it; reading on may help.
The .y file used in the experiment:
%{
#include <stdio.h>
#include <stdlib.h>
#define YYDEBUG 1
/* Prototypes so the generated parser compiles without implicit declarations. */
int yylex(void);
int yyerror(char const *str);
%}
%union {
    int     int_value;
    double  double_value;
}
%token <double_value> DOUBLE_LITERAL
%token ADD SUB MUL DIV CR
%type <double_value> expression term primary
%%
expression
    : expression ADD term
    {
        $$ = $1 + $3;
        printf("expression: expression ADD term %lf + %lf = %lf\n", $1, $3, $$);
    }
    | expression SUB term
    {
        $$ = $1 - $3;
    }
    | term
    {
        $$ = $1;  /* pass the value up explicitly */
        printf("expression: term %lf\n", $$);
    }
    ;
term
    : term MUL primary
    {
        $$ = $1 * $3;
        printf("term: term MUL primary %lf * %lf = %lf\n", $1, $3, $$);
    }
    | primary
    {
        $$ = $1;
        printf("term: primary %lf\n", $$);
    }
    ;
primary
    : DOUBLE_LITERAL
    {
        $$ = $1;
        printf("primary: DOUBLE_LITERAL %lf\n", $$);
    }
    ;
%%
int
yyerror(char const *str)
{
    extern char *yytext;
    //fprintf(stderr, "parser error near %s\n", yytext);
    return 0;
}
int main(void)
{
    extern int yyparse(void);
    extern FILE *yyin;
    yyin = stdin;
    if (yyparse()) {
        fprintf(stderr, "Error ! Error ! Error !\n");
        exit(1);
    }
    return 0;
}
The .l file used in the experiment:
%{
#include <stdio.h>
#include <stdlib.h>
#include "calcul.tab.h"
int
yywrap(void)
{
    return 1;
}
%}
%%
"+"     {printf("flex found + \n");return ADD;}
"-"     return SUB;
"*"     {printf("flex found * \n");return MUL;}
"/"     return DIV;
"\n"    return CR;
([1-9][0-9]*)|0|([0-9]+\.[0-9]*) {
    double temp;
    sscanf(yytext, "%lf", &temp);
    /* yylval is the yacc variable whose type is the union defined in
       the .y file; calcul.tab.h, generated by yacc, contains
       extern YYSTYPE yylval;. It passes data from flex to yacc, where
       the value of yylval shows up as $1, $2, and so on. */
    yylval.double_value = temp;
    return DOUBLE_LITERAL;
}
[ \t]   ;
.       {
    fprintf(stderr, "lexical error.\n");
    exit(1);
}
%%
Ref:
lex & yacc, Second Edition. Author: John R. Levine