bison & flex

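Source: http://hi.baidu.com/bihailan/blog/item/f12e78c82b2ae21c7e3e6ff8.html

bison is a parser generator that replaces yacc. yacc stands for Yet Another Compiler Compiler. So what is bison? It is a program that generates programs capable of analyzing the structure of text files. Instead of writing such a program by hand, the user only has to specify the rules for analyzing the text. Examples of this kind of structured text are too numerous to list; a calculator is one of them.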

Introduction to Bison

Bison is a general-purpose parser generator that converts an annotated context-free grammar into an LALR(1) or GLR parser for that grammar. Once you are proficient with Bison, you can use it to develop a wide range of language parsers, from those used in simple desk calculators to complex programming languages.

Bison is upward compatible with Yacc: all properly-written Yacc grammars ought to work with Bison with no change. Anyone familiar with Yacc should be able to use Bison with little trouble. You need to be fluent in C or C++ programming in order to use Bison.

Bison

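For installation instructions, see the section called "Installing Bison-1.875" in Chapter 6.

Contents of Bison

Bison is a parser generator that replaces yacc. Bison generates programs that can analyze the structure of text files.

Installed programs: bison and yacc

Installed libraries: liby.a

Short descriptions

bison is a parser generator that replaces yacc. yacc stands for Yet Another Compiler Compiler.

yacc is a wrapper script for bison; it simply calls bison with the -y option. It exists for compatibility with programs that use yacc instead of bison.

liby.a is the Yacc library. It contains Yacc-compatible implementations of the yyerror and main functions. The library is normally not very useful, but POSIX requires it.

Bison installation dependencies

Bison depends on: Bash, Binutils, Coreutils, Diffutils, GCC, Gettext, Glibc, Grep, M4, Make, Sed.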

Flex description

Flex is a fast lexical analyzer generator: a tool for generating programs that perform pattern matching on text. Flex is a non-GNU free implementation of the well-known Lex program.

Flex is a tool for generating scanners: programs which recognize lexical patterns in text. flex reads the given input files, or its standard input if no file names are given, for a description of a scanner to generate. The description is in the form of pairs of regular expressions and C code, called rules. flex generates as output a C source file, `lex.yy.c', which defines a routine `yylex()'. This file is compiled and linked with the `-lfl' library to produce an executable. When the executable is run, it analyzes its input for occurrences of the regular expressions. Whenever it finds one, it executes the corresponding C code.

Some simple examples

First, some simple examples to get the flavor of how one uses flex. The following flex input specifies a scanner which, whenever it encounters the string "username", replaces it with the user's login name:

%%
username printf( "%s", getlogin() );

By default, any text not matched by a flex scanner is copied to the output, so the net effect of this scanner is to copy its input file to its output with each occurrence of "username" expanded. In this input, there is just one rule. "username" is the pattern and the "printf" is the action. The "%%" marks the beginning of the rules. 
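To make the "one rule plus default copy" behaviour concrete, here is a hand-written C sketch of roughly what that scanner does to a string held in memory. It is not the code flex generates; the function name `expand_username`, its parameters, and the fixed output buffer are assumptions made for illustration, and the login name is passed in rather than obtained from getlogin().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `in` to `out`, replacing each occurrence of the literal "username"
 * with `login`. Matched text triggers the "action"; everything else falls
 * through to the default rule and is copied unchanged.
 * Returns the number of bytes written (excluding the terminator) and
 * silently truncates if `out` (capacity `cap`, at least 1) is too small.
 * All names here are hypothetical, chosen for this sketch.
 */
size_t expand_username(const char *in, const char *login,
                       char *out, size_t cap)
{
    static const char pattern[] = "username";
    const size_t plen = sizeof pattern - 1;
    size_t n = 0;

    assert(in != NULL && login != NULL && out != NULL && cap > 0);

    while (*in != '\0') {
        if (strncmp(in, pattern, plen) == 0) {
            /* The rule matched: run the "action" (emit the login name). */
            for (const char *p = login; *p != '\0' && n + 1 < cap; ++p)
                out[n++] = *p;
            in += plen;
        } else {
            /* Default rule: copy one unmatched character unchanged. */
            if (n + 1 < cap)
                out[n++] = *in;
            ++in;
        }
    }
    out[n] = '\0';
    return n;
}
```

Calling `expand_username("hi username!", "alice", buf, sizeof buf)` with a sufficiently large `buf` leaves "hi alice!" in the buffer. A real flex scanner reads from `yyin` and writes to `yyout` instead of working on in-memory strings.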

Here's another simple example: 

%{
/* counters updated by the rules below */
int num_lines = 0, num_chars = 0;
%}

%%
\n      ++num_lines; ++num_chars;
.       ++num_chars;

%%
int main(void)
{
    yylex();
    printf( "# of lines = %d, # of chars = %d\n",
            num_lines, num_chars );
    return 0;
}

This scanner counts the number of characters and the number of lines in its input (it produces no output other than the final report on the counts). The declaration at the top introduces two globals, "num_lines" and "num_chars", which are accessible both inside `yylex()' and in the `main()' routine declared after the second "%%". (In a flex definitions section, C declarations must be indented or enclosed in "%{" and "%}" so that flex copies them verbatim into the generated file.) There are two rules, one which matches a newline ("\n") and increments both the line count and the character count, and one which matches any character other than a newline (indicated by the "." regular expression).
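For comparison, the same counting logic can be written by hand in plain C. This is only a sketch of the two rules' combined effect on a string in memory, not what flex emits; `struct counts` and `count_text` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Result of one pass over the text (hypothetical helper type). */
struct counts {
    int lines;
    int chars;
};

/* Count newlines and total characters in a NUL-terminated string. */
struct counts count_text(const char *text)
{
    struct counts c = {0, 0};

    assert(text != NULL);

    for (; *text != '\0'; ++text) {
        if (*text == '\n')
            ++c.lines;   /* newline rule: one more line */
        ++c.chars;       /* both rules count the character itself */
    }
    return c;
}
```

The flex version expresses the same thing declaratively: each pattern names a class of input and the action says what to do with it, while the loop over the input is supplied by the generated `yylex()'.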

A somewhat more complicated example: 

/* scanner for a toy Pascal-like language */

%{
/* need this for the calls to atoi() and atof() below */
#include <stdlib.h>
%}

DIGIT    [0-9]
ID       [a-z][a-z0-9]*

%%

{DIGIT}+    {
            printf( "An integer: %s (%d)\n", yytext,
                    atoi( yytext ) );
            }

{DIGIT}+"."{DIGIT}*    {
            printf( "A float: %s (%g)\n", yytext,
                    atof( yytext ) );
            }

if|then|begin|end|procedure|function    {
            printf( "A keyword: %s\n", yytext );
            }

{ID}        printf( "An identifier: %s\n", yytext );

"+"|"-"|"*"|"/"    printf( "An operator: %s\n", yytext );

"{"[^}\n]*"}"    /* eat up one-line comments */

[ \t\n]+    /* eat up whitespace */

.           printf( "Unrecognized character: %s\n", yytext );

%%

int main( int argc, char **argv )
{
    ++argv, --argc;  /* skip over program name */
    if ( argc > 0 )
        yyin = fopen( argv[0], "r" );
    else
        yyin = stdin;

    if ( yyin == NULL )
    {
        perror( "fopen" );  /* could not open the input file */
        return 1;
    }

    yylex();
    return 0;
}

This is the beginning of a simple scanner for a language like Pascal. It identifies different types of tokens and reports on what it has seen.

The details of this example will be explained in the following sections.
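As a rough illustration of what the token rules in this scanner encode, the following hand-written C sketch classifies a single, already-isolated lexeme. It is not the code flex generates, and unlike the scanner it does not split input into tokens (flex would break "12ab" into an integer and an identifier, whereas this function rejects it). The names `token_kind`, `is_keyword`, and `classify_token` are hypothetical, and the letter checks assume an ASCII-compatible character set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum token_kind {
    TK_INTEGER, TK_FLOAT, TK_KEYWORD, TK_IDENTIFIER,
    TK_OPERATOR, TK_UNRECOGNIZED
};

/* Keyword list taken from the scanner's keyword rule. */
static int is_keyword(const char *s)
{
    static const char *const keywords[] = {
        "if", "then", "begin", "end", "procedure", "function"
    };
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; ++i)
        if (strcmp(s, keywords[i]) == 0)
            return 1;
    return 0;
}

static int is_lower(char ch) { return ch >= 'a' && ch <= 'z'; }
static int is_digit(char ch) { return isdigit((unsigned char)ch) != 0; }

/* Classify one complete lexeme according to the rules of the scanner. */
enum token_kind classify_token(const char *tok)
{
    size_t i = 0;

    assert(tok != NULL);

    if (is_digit(tok[0])) {
        while (is_digit(tok[i]))
            ++i;
        if (tok[i] == '\0')
            return TK_INTEGER;            /* {DIGIT}+ */
        if (tok[i] == '.') {
            ++i;
            while (is_digit(tok[i]))
                ++i;
            if (tok[i] == '\0')
                return TK_FLOAT;          /* {DIGIT}+"."{DIGIT}* */
        }
        return TK_UNRECOGNIZED;
    }
    if (is_lower(tok[0])) {
        for (i = 1; is_lower(tok[i]) || is_digit(tok[i]); ++i)
            ;
        if (tok[i] != '\0')
            return TK_UNRECOGNIZED;
        /* The keyword rule comes first in the scanner, so it wins ties. */
        return is_keyword(tok) ? TK_KEYWORD : TK_IDENTIFIER;
    }
    if (tok[0] != '\0' && tok[1] == '\0' && strchr("+-*/", tok[0]) != NULL)
        return TK_OPERATOR;
    return TK_UNRECOGNIZED;
}
```

The keyword check mirrors how flex resolves ambiguity: when two rules match text of the same length, the rule listed first in the input file is chosen, which is why "begin" is reported as a keyword while "beginning" (a longer match for the identifier rule) is an identifier.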
