Parsing Techniques and Toolkits

 

 

1、The Compiler Generator Coco/R

URL:http://www.ssw.uni-linz.ac.at/Coco

Hanspeter Mössenböck, Markus Löberbauer, Albrecht Wöß, University of Linz

Last update: Jan 12, 2010


Available versions: Coco/R for C#, Java, C++, F#, VB.Net, Oberon, and other languages.

Coco/R is a compiler generator that takes an attributed grammar of a source language and generates a scanner and a parser for this language. The scanner works as a deterministic finite automaton. The parser uses recursive descent. LL(1) conflicts can be resolved by a multi-symbol lookahead or by semantic checks. Thus the class of accepted grammars is LL(k) for an arbitrary k.
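To make the idea of resolving an LL(1) conflict by multi-symbol lookahead concrete, here is a minimal hand-written sketch in Python. It is not Coco/R output and does not use Coco/R's grammar notation; the toy grammar (Statement = Assignment | Call) and all names such as `Parser`, `peek`, and `parse` are assumptions made for illustration. Both alternatives start with an identifier, so one token is not enough to choose, and the parser looks at the second token.

```python
import re

# Hypothetical toy token set: identifiers, integers, and four punctuation marks.
TOKEN_RE = re.compile(r"\s*(?:(?P<ident>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=();]))")


def tokenize(text):
    """Return a list of (kind, value) pairs, ending with ("eof", "")."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        kind = m.lastgroup
        value = m.group(kind)
        # Punctuation tokens use their own text as the token kind.
        tokens.append((value if kind == "op" else kind, value))
        pos = m.end()
    tokens.append(("eof", ""))
    return tokens


class Parser:
    """Recursive descent: one method per nonterminal of the toy grammar."""

    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self, k=1):
        """Kind of the k-th upcoming token (k=1 is the current token)."""
        i = min(self.pos + k - 1, len(self.tokens) - 1)
        return self.tokens[i][0]

    def expect(self, kind):
        tok_kind, value = self.tokens[self.pos]
        if tok_kind != kind:
            raise SyntaxError(f"expected {kind}, got {tok_kind}")
        self.pos += 1
        return value

    def statement(self):
        # Assignment and Call both start with ident: an LL(1) conflict.
        # Looking at the second token decides between them.
        if self.peek(1) == "ident" and self.peek(2) == "=":
            return self.assignment()
        return self.call()

    def assignment(self):
        target = self.expect("ident")
        self.expect("=")
        return ("assign", target, int(self.expect("num")))

    def call(self):
        name = self.expect("ident")
        self.expect("(")
        self.expect(")")
        return ("call", name)


def parse(text):
    parser = Parser(text)
    result = parser.statement()
    parser.expect("eof")
    return result
```

Calling `parse("x = 42")` takes the assignment branch and `parse("f()")` takes the call branch; in both cases the decision is made before any token is consumed.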

There are versions of Coco/R for different languages. The latest versions from the University of Linz are those for C#, Java and C++, which can be downloaded from the Coco/R site. An older (non-reentrant) version of Coco/R for C# and Java is also available.

 

2、ANTLR, ANother Tool for Language Recognition

URL:http://www.antlr.org/

What is ANTLR?
ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages. ANTLR provides excellent support for tree construction, tree walking, translation, error recovery, and error reporting. There are currently about 5,000 ANTLR source downloads a month.

 

3、The LEMON Parser Generator

URL:http://www.hwaci.com/sw/lemon/

The Lemon program is an LALR(1) parser generator. It takes a context free grammar and converts it into a subroutine that will parse a file using that grammar.

Lemon is similar to the much better-known programs "YACC" and "BISON", but it is not compatible with either of them. There are several important differences:

  • Lemon uses a different grammar syntax, which is less prone to programming errors.
  • The parser generated by Lemon is both re-entrant and thread-safe.
  • Lemon includes the concept of a non-terminal destructor, which makes it much easier to write a parser that does not leak memory.
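The re-entrancy point can be illustrated with a small table-driven LR parser written by hand in Python. This is only a sketch: it is not Lemon's generated C code, the tables below were built by hand for a two-rule toy grammar rather than computed by a generator, and the names `PushParser` and `feed` are hypothetical. Destructors are not modeled. Because every piece of parser state lives in the object, several parsers can run interleaved without interfering with each other.

```python
# Toy grammar (an assumption for this sketch):
#   sum ::= sum PLUS NUM
#   sum ::= NUM
NUM, PLUS, END = "NUM", "PLUS", "END"

# Hand-built LR tables for the grammar above. A parser generator would
# compute tables like these from the grammar file.
ACTION = {
    (0, NUM): ("shift", 1),
    (1, PLUS): ("reduce", "first"),
    (1, END): ("reduce", "first"),
    (2, PLUS): ("shift", 3),
    (2, END): ("accept", None),
    (3, NUM): ("shift", 4),
    (4, PLUS): ("reduce", "add"),
    (4, END): ("reduce", "add"),
}
GOTO = {0: 2}  # state entered after reducing to `sum` with state 0 on top


class PushParser:
    """All state is per instance, so instances are independent."""

    def __init__(self):
        self.states = [0]   # LR state stack
        self.values = []    # semantic value stack
        self.result = None  # set when the input is accepted

    def feed(self, kind, value=None):
        """Push one token into the parser; call with END to finish."""
        while True:
            action = ACTION.get((self.states[-1], kind))
            if action is None:
                raise SyntaxError(f"unexpected {kind} in state {self.states[-1]}")
            op, arg = action
            if op == "shift":
                self.states.append(arg)
                self.values.append(value)
                return
            if op == "accept":
                self.result = self.values.pop()
                return
            # Reduce: pop the right-hand side, compute the new value.
            if arg == "first":            # sum ::= NUM
                del self.states[-1:]
                total = self.values.pop()
            else:                         # sum ::= sum PLUS NUM
                del self.states[-3:]
                right = self.values.pop()
                self.values.pop()         # value slot of the PLUS token
                left = self.values.pop()
                total = left + right
            self.states.append(GOTO[self.states[-1]])
            self.values.append(total)
            # Loop again: the same token is re-examined in the new state.


# Two parsers fed in alternation, each keeping its own stacks.
a, b = PushParser(), PushParser()
for tok_a, tok_b in zip(
    [(NUM, 1), (PLUS, None), (NUM, 2), (END, None)],
    [(NUM, 10), (PLUS, None), (NUM, 20), (END, None)],
):
    a.feed(*tok_a)
    b.feed(*tok_b)
```

After the loop, `a.result` and `b.result` each hold the sum of that parser's own input, which is the property a parser relying on global variables could not offer.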

The complete source code of the Lemon parser generator is contained in two files. The file lemon.c is the parser generator program itself. A separate file, lempar.c, is the template for the parser subroutine that Lemon generates. Documentation on Lemon is also available.

 

4、flex: The Fast Lexical Analyzer

Flex is a tool for generating scanners. A scanner, sometimes called a tokenizer, is a program that recognizes lexical patterns in text. The flex program reads user-specified input files, or its standard input if no file names are given, for a description of a scanner to generate. The description is in the form of pairs of regular expressions and C code, called rules. Flex generates a C source file named "lex.yy.c", which defines the function yylex(). The file "lex.yy.c" can be compiled and linked to produce an executable. When the executable is run, it analyzes its input for occurrences of text matching the regular expressions for each rule. Whenever it finds a match, it executes the corresponding C code.
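The rule idea (a regular expression paired with an action) can be sketched in a few lines of Python. This is an illustration of the concept only, not flex syntax and not how flex-generated C code works internally; `make_scanner`, `RULES`, and the token names are hypothetical. The sketch follows the usual flex convention that the longest match wins and that ties go to the rule listed first, but unlike flex it raises an error on unmatched input instead of applying a default rule.

```python
import re


def make_scanner(rules):
    """Build a scanner from (pattern, action) pairs.

    Each action receives the matched text and returns a token,
    or None to discard the match (for example, whitespace).
    """
    compiled = [(re.compile(pattern), action) for pattern, action in rules]

    def scan(text):
        tokens, pos = [], 0
        while pos < len(text):
            best, best_action = None, None
            for regex, action in compiled:
                m = regex.match(text, pos)
                # Longest match wins; on a tie the earlier rule is kept
                # because only a strictly longer match replaces it.
                if m and m.end() > pos and (best is None or m.end() > best.end()):
                    best, best_action = m, action
            if best is None:
                raise ValueError(f"no rule matches at position {pos}")
            token = best_action(best.group())
            if token is not None:
                tokens.append(token)
            pos = best.end()
        return tokens

    return scan


# Hypothetical rule set: whitespace, two keywords, identifiers, integers.
RULES = [
    (r"[ \t\n]+", lambda s: None),
    (r"if|while", lambda s: ("KEYWORD", s)),
    (r"[A-Za-z_][A-Za-z0-9_]*", lambda s: ("IDENT", s)),
    (r"[0-9]+", lambda s: ("NUMBER", int(s))),
]

scan = make_scanner(RULES)
```

With these rules, `scan("if iffy 42")` classifies `if` as a keyword (a tie, resolved by rule order) and `iffy` as an identifier (the longer match).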

 

5、Yacc: Yet Another Compiler-Compiler

Computer program input generally has some structure; in fact, every computer program that does input can be thought of as defining an "input language" which it accepts. An input language may be as complex as a programming language, or as simple as a sequence of numbers. Unfortunately, usual input facilities are limited, difficult to use, and often are lax about checking their inputs for validity.

Yacc provides a general tool for describing the input to a computer program. The Yacc user specifies the structures of his input, together with code to be invoked as each such structure is recognized. Yacc turns such a specification into a subroutine that handles the input process; frequently, it is convenient and appropriate to have most of the flow of control in the user's application handled by this subroutine.

The input subroutine produced by Yacc calls a user-supplied routine to return the next basic input item. Thus, the user can specify his input in terms of individual input characters, or in terms of higher level constructs such as names and numbers. The user-supplied routine may also handle idiomatic features such as comment and continuation conventions, which typically defy easy grammatical specification.
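The division of labor described here, a parsing subroutine that repeatedly calls a user-supplied routine for the next input item, can be sketched in Python. Nothing below is generated by Yacc: `parse_numbers` is a hand-written stand-in for the generated subroutine, the input language is simply a sequence of numbers, and `make_lexer` and the `#` comment convention are assumptions for illustration. In Yacc itself the user-supplied routine is conventionally named yylex.

```python
def make_lexer(text):
    """Return the user-supplied routine that yields one token per call."""
    lines = iter(text.splitlines())
    pending = []  # tokens already extracted from the current line

    def next_token():
        while not pending:
            line = next(lines, None)
            if line is None:
                return ("EOF", None)
            # Comment handling lives in the lexer, so the grammar
            # never has to describe comments at all.
            code = line.split("#", 1)[0]
            pending.extend(("NUMBER", int(word)) for word in code.split())
        return pending.pop(0)

    return next_token


def parse_numbers(next_token):
    """Stand-in for the generated subroutine: input ::= NUMBER*."""
    numbers = []
    while True:
        kind, value = next_token()
        if kind == "EOF":
            return numbers
        if kind != "NUMBER":
            raise SyntaxError(f"unexpected token {kind}")
        numbers.append(value)


values = parse_numbers(make_lexer("1 2  # trailing comment\n3\n# comment only\n"))
```

The parser only ever sees `NUMBER` and `EOF` tokens; swapping in a different lexer (for example one reading individual characters) would not require changing `parse_numbers`.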

 

The Lex & Yacc Page

 

 

6、Parsing Techniques - A Practical Guide

URL:http://www.cs.vu.nl/~dick/PT2Ed.html

Dick Grune and Ceriel J.H. Jacobs

VU University Amsterdam, Amsterdam, The Netherlands

This is the new 662-page edition of Parsing Techniques - A Practical Guide. Like its predecessor, it treats parsing in its own right, in greater depth than is found in most computer science and linguistics books. It offers a clear, accessible, and thorough discussion of many different parsing techniques with their interrelations and applicabilities, including error recovery techniques. Unlike most books, it treats (almost) all parsing methods, not just the popular ones, as can be seen from its Table of Contents. Web site additions (see below) extend the number of pages to 801.

The new edition features: generalized deterministic parsers, non-canonical parsers, linear-time substring parsing, parsing as intersection, and parallel parsing, in addition to the expanded and updated text of the first edition. And there are hundreds of additional literature summaries!

 

...

 
