Lexical Analyzers

zhixingheyi_tian

Article link: https://blog.csdn.net/zhixingheyi_tian/article/details/80036817

The two rules Lex uses to resolve ambiguity

There are two important disambiguation rules used by Lex and other similar lexical-analyzer generators:

Longest match: The longest initial substring of the input that can match any regular expression is taken as the next token.

Rule priority: For a particular longest initial substring, the first regular expression that can match determines its token type. This means that the order of writing down the regular-expression rules has significance.
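
The two rules can be sketched in a few lines of Python. This is only an illustration, not how Lex itself works internally: the rule names (IF, ID, NUM, WS) and the `tokenize` helper are hypothetical, and each pattern is tried separately with Python's `re` module instead of being compiled into one DFA. The patterns are kept free of alternation so that each greedy match is also the longest match for its rule.

```python
import re

# Hypothetical token rules, listed in priority order (earlier rules win ties).
RULES = [
    ("IF", re.compile(r"if")),
    ("ID", re.compile(r"[a-z][a-z0-9]*")),
    ("NUM", re.compile(r"[0-9]+")),
    ("WS", re.compile(r"\s+")),
]


def tokenize(text):
    """Split text into (type, lexeme) pairs: longest match first, then rule order."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_type, best_len = None, 0
        for token_type, pattern in RULES:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on a tie the earlier rule is kept (rule priority).
            if m and m.end() - pos > best_len:
                best_type, best_len = token_type, m.end() - pos
        if best_type is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_type != "WS":  # whitespace is consumed but not reported
            tokens.append((best_type, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("if if8 8"))
```

Here `if` on its own is matched equally well by IF and ID, so the earlier rule (IF) decides the type, while `if8` becomes a single ID because the longer match wins over the two-character IF.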

Lex is implemented on top of DFAs

DFA construction is a mechanical task easily performed by computer, so it
makes sense to have an automatic lexical analyzer generator to translate regular expressions into a DFA.
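
To make "mechanical" concrete, here is a minimal sketch of the subset construction, the standard way to turn an NFA into a DFA. The NFA below (for the regular expression `(a|b)*abb`) is written by hand with arbitrary state numbers, and all names (`NFA`, `eps_closure`, `subset_construction`, `dfa_accepts`) are assumptions made for this example. A real generator would also build the NFA from the regular expressions and usually minimize the result.

```python
from collections import deque

EPS = None  # label used for epsilon moves in this sketch

# Hand-written NFA for (a|b)*abb; state 0 has an epsilon move to state 1.
NFA = {
    0: {EPS: {1}},
    1: {"a": {1, 2}, "b": {1}},
    2: {"b": {3}},
    3: {"b": {4}},
    4: {},
}
NFA_START, NFA_ACCEPT = 0, {4}


def eps_closure(states):
    """All NFA states reachable from `states` using only epsilon moves."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA[s].get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def subset_construction(alphabet):
    """Build a DFA whose states are sets of NFA states."""
    start = eps_closure({NFA_START})
    table, accepting = {}, set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current in table:
            continue  # already expanded
        table[current] = {}
        if current & NFA_ACCEPT:
            accepting.add(current)
        for symbol in alphabet:
            moved = set()
            for s in current:
                moved |= NFA[s].get(symbol, set())
            if moved:
                target = eps_closure(moved)
                table[current][symbol] = target
                queue.append(target)
    return start, table, accepting


def dfa_accepts(dfa, text):
    """Run the DFA over text; a missing transition means rejection."""
    start, table, accepting = dfa
    state = start
    for ch in text:
        state = table[state].get(ch)
        if state is None:
            return False
    return state in accepting


dfa = subset_construction("ab")
print(dfa_accepts(dfa, "babb"), dfa_accepts(dfa, "ab"))
```

Every step is a plain set computation with no decisions left to a human, which is why the construction is a natural job for a generator tool.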

The output of Lex is a C program

Lex is a lexical analyzer generator that produces a C program from a lexical specification. For each token type in the programming language to be lexically analyzed, the specification contains a regular expression and an action. The action communicates the token type (perhaps along with other information) to the next phase of the compiler.
The output of Lex is a program in C: a lexical analyzer that executes the action
fragments on each match. The action fragments are just C statements that
return token values.

Lex, past and present

Lex was the first lexical-analyzer generator based on regular expressions
[Lesk 1975]; it is still widely used.

DFA transition tables can be very large and sparse. If represented as a simple two-dimensional matrix (states × symbols) they take far too much memory. In practice, tables are compressed; this reduces the amount of memory
required, but increases the time required to look up the next state [Aho et al.
1986].
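
The space/time trade-off can be illustrated with a toy example. The sketch below assumes a hypothetical DFA that recognizes only the keyword `while` over 7-bit ASCII, and compares a dense states × symbols matrix with a simple "default plus exceptions" row encoding. This is a simplified scheme chosen for illustration, not necessarily the compression that Lex or Aho et al. use; all names are made up for the example.

```python
DEAD = -1          # marks "no transition"
N_SYMBOLS = 128    # 7-bit ASCII, an assumption of this sketch
KEYWORD = "while"  # state i moves to state i + 1 on KEYWORD[i]


def build_dense():
    """Dense table: one cell for every (state, symbol) pair."""
    table = [[DEAD] * N_SYMBOLS for _ in range(len(KEYWORD) + 1)]
    for state, ch in enumerate(KEYWORD):
        table[state][ord(ch)] = state + 1
    return table


def compress(dense):
    """Per row, keep the most common target as a default plus the exceptions."""
    rows = []
    for row in dense:
        default = max(set(row), key=row.count)
        exceptions = {code: t for code, t in enumerate(row) if t != default}
        rows.append((default, exceptions))
    return rows


def next_dense(table, state, code):
    return table[state][code]  # a single indexed load


def next_compressed(rows, state, code):
    default, exceptions = rows[state]
    return exceptions.get(code, default)  # extra lookup, then fall back


dense = build_dense()
rows = compress(dense)
dense_cells = sum(len(row) for row in dense)
stored = sum(len(exceptions) + 1 for _, exceptions in rows)
print(dense_cells, stored)
```

The dense table stores a cell for every pair even though almost all of them are dead, while the compressed form stores only the real transitions and one default per row; the price is that each lookup now does more work than a single array index.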

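Flex is faster than Lex, and case statements execute quite efficiently. Flex and Bison have proven more reliable, more powerful, and faster than the original Unix tools lex and yacc.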

Automatically generated lexical analyzers are often criticized for being
slow. In principle, the operation of a finite automaton is very simple and
should be efficient, but interpreting from transition tables adds overhead.
Gray [1988] shows that DFAs translated directly into executable code (implementing states as case statements) can run as fast as hand-coded lexers. The
Flex “fast lexical analyzer generator” [Paxson 1995] is significantly faster
than Lex.
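
The idea of translating a DFA directly into code, with no transition table, can be sketched as follows. Python has no C-style `switch`, so an `if`/`elif` chain stands in for the case statements; the example shows only the structure of a direct-coded recognizer and makes no claim about speed in Python. The DFA (digits, optionally followed by `.` and more digits) and the name `match_number` are assumptions made for this illustration.

```python
def match_number(text):
    """Return True if text is digits, optionally followed by '.' and digits."""
    state = 0
    for ch in text:
        is_digit = "0" <= ch <= "9"
        if state == 0:        # start: a first digit is required
            if is_digit:
                state = 1
            else:
                return False
        elif state == 1:      # integer part (accepting)
            if is_digit:
                state = 1
            elif ch == ".":
                state = 2
            else:
                return False
        elif state == 2:      # just saw '.', a digit is required
            if is_digit:
                state = 3
            else:
                return False
        else:                 # state 3: fraction digits (accepting)
            if not is_digit:
                return False
    return state in (1, 3)


print(match_number("3.14"), match_number("3."))
```

Each state's outgoing transitions are written out as branches, so moving to the next state is a comparison and an assignment, with no table lookup in between.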