A Summary of Automata Topics

zhixingheyi_tian

Categories: automata, compiler principles, artificial intelligence. Tags: automata, compiler principles, artificial intelligence.

Article link: https://blog.csdn.net/zhixingheyi_tian/article/details/80032137

Automata Theory

Automata theory deals with the definitions and properties of mathematical models of computation. These models play a role in several applied areas of computer science.

One model, called the finite automaton, is used in text processing, compilers, and hardware design.
Another model, called the context-free grammar, is used in programming languages and artificial intelligence.

The following phrase shows that "finite state machine" and "finite automaton" are two names for the same thing:
"the simplest model, called the finite state machine or finite automaton."

State Diagrams and the Formal Definition of Finite Automata

We used state diagrams to introduce finite automata.
Now we define finite automata formally. Although state diagrams are easier to
grasp intuitively, we need the formal definition, too, for two specific reasons.
First, a formal definition is precise. It resolves any uncertainties about what
is allowed in a finite automaton. If you were uncertain about whether finite
automata were allowed to have 0 accept states or whether they must have exactly one transition exiting every state for each possible input symbol, you could
consult the formal definition and verify that the answer is yes in both cases. Second, a formal definition provides notation. Good notation helps you think and
express your thoughts clearly.
The language of a formal definition is somewhat arcane, having some similarity to the language of a legal document. Both need to be precise, and every
detail must be spelled out.

Formal Definition of a Finite Automaton

A finite automaton is a 5-tuple (Q, Σ, δ, q0, F), where
1. Q is a finite set called the states,
2. Σ is a finite set called the alphabet,
3. δ : Q × Σ → Q is the transition function,
4. q0 ∈ Q is the start state, and
5. F ⊆ Q is the set of accept states.
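The 5-tuple maps almost directly onto a small data structure. The following is a minimal Python sketch, not taken from the quoted text: the names `DFA`, `is_well_formed`, `accepts`, and the example machine `EVEN_ONES` are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class DFA(NamedTuple):
    """A finite automaton as the 5-tuple (Q, Σ, δ, q0, F)."""
    states: frozenset    # Q
    alphabet: frozenset  # Σ
    delta: dict          # δ, stored as a dict keyed by (state, symbol)
    start: str           # q0
    accept: frozenset    # F


def is_well_formed(m: DFA) -> bool:
    """Check the conditions the formal definition imposes."""
    total = all((q, a) in m.delta for q in m.states for a in m.alphabet)
    closed = all(target in m.states for target in m.delta.values())
    # F may be empty: the definition only requires F ⊆ Q.
    return total and closed and m.start in m.states and m.accept <= m.states


def accepts(m: DFA, w: str) -> bool:
    """Run the machine on w and report whether it ends in an accept state."""
    state = m.start
    for symbol in w:
        if symbol not in m.alphabet:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        state = m.delta[(state, symbol)]
    return state in m.accept


# Hypothetical example: strings over {0, 1} containing an even number of 1s.
EVEN_ONES = DFA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset({"0", "1"}),
    delta={
        ("even", "0"): "even", ("even", "1"): "odd",
        ("odd", "0"): "odd", ("odd", "1"): "even",
    },
    start="even",
    accept=frozenset({"even"}),
)
```

Storing δ as a dictionary makes the "exactly one transition per state and symbol" requirement easy to check: `is_well_formed` fails as soon as a `(state, symbol)` key is missing.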

Regular Languages

A language is called a regular language if some finite automaton
recognizes it.

Regular Operations, by Analogy with Arithmetic Operations

In arithmetic, the basic objects are numbers and the tools are operations for manipulating them, 
such as + and ×. In the theory of computation, the objects are languages and the tools include operations specifically designed for manipulating them. 
We define three operations on languages, called the regular operations, and use them to study properties of the regular languages.

Let A and B be languages. We define the regular operations union,
concatenation, and star as follows:
• Union: A ∪ B = {x| x ∈ A or x ∈ B}.
• Concatenation: A ◦ B = {xy| x ∈ A and y ∈ B}.
• Star: A∗ = {x1x2 . . . xk| k ≥ 0 and each xi ∈ A}.
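For finite languages, the three operations can be tried out directly with Python sets. This is only a sketch: the function names are hypothetical, and because A∗ is infinite whenever A contains a nonempty string, `star` takes an assumed `max_len` bound and returns only the strings of A∗ up to that length.

```python
def union(a: set, b: set) -> set:
    """A ∪ B: strings that are in A or in B."""
    return a | b


def concat(a: set, b: set) -> set:
    """A ◦ B: every string of A followed by every string of B."""
    return {x + y for x in a for y in b}


def star(a: set, max_len: int) -> set:
    """The strings of A∗ whose length is at most max_len.

    The empty string is always included (the case k = 0).
    """
    result = {""}
    frontier = {""}
    while frontier:
        # Extend every newly found string by one more piece from A.
        frontier = {
            x + y for x in frontier for y in a if len(x + y) <= max_len
        } - result
        result |= frontier
    return result


A = {"good", "bad"}
B = {"boy", "girl"}
print(sorted(union(A, B)))
print(sorted(concat(A, B)))
print(sorted(star(A, 7)))
```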

Regular Expressions, by Analogy with Arithmetic Expressions

In arithmetic, we can use the operations + and × to build up expressions such as
(5 + 3) × 4 .

Similarly, we can use the regular operations to build up expressions
 describing languages, which are called regular expressions.
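As a rough illustration, Python's `re` module writes union as `|`, concatenation as juxtaposition, and star as `*`. The expression below, (0 ∪ 1)0∗, is an assumed example rather than one given above, and `in_language` is a hypothetical helper name. Python's `re` also supports features that go beyond regular languages (backreferences, for instance); this sketch uses only the three regular operations.

```python
import re

# (0 ∪ 1)0∗: a single 0 or 1, followed by any number of 0s.
PATTERN = re.compile(r"(0|1)0*")


def in_language(w: str) -> bool:
    """True if the whole string w is described by the expression."""
    return PATTERN.fullmatch(w) is not None


for w in ["0", "100", "01", ""]:
    print(repr(w), in_language(w))
```

`fullmatch` is used instead of `match` or `search` because membership in a language is a statement about the entire string, not about a prefix or substring of it.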

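How a Lexer Recognizes the Longest Match

In short, the lexer keeps two variables; when the automaton has nowhere left to go, it falls back to the most recent final state.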

Keeping track of the longest match just means remembering the last time
the automaton was in a final state with two variables, Last-Final (the state
number of the most recent final state encountered) and Input-Position-at-Last-Final. Every time a final state is entered, the lexer updates these
variables; when a dead state (a nonfinal state with no output transitions) is
reached, the variables tell what token was matched, and where it ended.
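The bookkeeping with the two variables can be sketched as follows. The tiny automaton here is an assumption made up for illustration: it recognizes integers such as `12` (token `INT`) and reals such as `12.5` (token `REAL`). The names `TRANSITIONS`, `FINAL`, `classify`, and `longest_match` are hypothetical.

```python
DIGITS = "0123456789"


def classify(ch: str) -> str:
    """Map a character to the input class used by the transition table."""
    if ch in DIGITS:
        return "digit"
    if ch == ".":
        return "dot"
    return "other"


# Hypothetical DFA: state 1 accepts INT, state 3 accepts REAL.
# State 2 (just after the dot) is not final. A missing entry means "dead".
TRANSITIONS = {
    (0, "digit"): 1,
    (1, "digit"): 1,
    (1, "dot"): 2,
    (2, "digit"): 3,
    (3, "digit"): 3,
}
FINAL = {1: "INT", 3: "REAL"}


def longest_match(text: str, start: int = 0):
    """Return (token type, lexeme, end position), or None if nothing matches."""
    state = 0
    last_final = None                      # Last-Final
    input_position_at_last_final = start   # Input-Position-at-Last-Final
    pos = start
    while pos < len(text):
        nxt = TRANSITIONS.get((state, classify(text[pos])))
        if nxt is None:  # dead state reached: stop scanning
            break
        state = nxt
        pos += 1
        if state in FINAL:  # remember the most recent final state
            last_final = state
            input_position_at_last_final = pos
    if last_final is None:
        return None
    end = input_position_at_last_final
    return FINAL[last_final], text[start:end], end


print(longest_match("12.5+"))
print(longest_match("12.x"))
```

On the input `12.x` the scanner reads past the dot before it gets stuck, so the token boundary comes from the saved position after `12`, not from where scanning stopped.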

There are two important disambiguation rules used by Lex and other similar lexical-analyzer generators:


 - Longest match: The longest initial substring of the input that can
   match any regular expression is taken as the next token.
 - Rule priority: For a particular longest initial substring, the first
   regular expression that can match determines its token type. This
   means that the order of writing down the regular-expression rules has significance.
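The two rules can be imitated with a few lines of Python. This is a sketch under assumptions, not how Lex itself is implemented: Lex compiles all rules into a single automaton, whereas the code below simply tries each rule in turn with the `re` module. The rule set `RULES` and the function names are hypothetical. The patterns are kept simple because Python's own alternation picks the first alternative that matches, not the longest.

```python
import re

# Order matters: earlier rules win ties (rule priority).
RULES = [
    ("IF", re.compile(r"if")),
    ("ID", re.compile(r"[a-z][a-z0-9]*")),
    ("NUM", re.compile(r"[0-9]+")),
    ("WS", re.compile(r"[ \t\n]+")),
]


def next_token(text: str, pos: int):
    """Apply longest match, then rule priority, at position pos."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        # The strict '>' keeps the earlier rule when two matches tie in length.
        if m and m.end() > pos and (best is None or m.end() > best[2]):
            best = (name, m.group(), m.end())
    return best


def tokenize(text: str):
    """Split text into (token type, lexeme) pairs, dropping whitespace."""
    pos, tokens = 0, []
    while pos < len(text):
        token = next_token(text, pos)
        if token is None:
            raise ValueError(f"no rule matches at position {pos}")
        name, lexeme, pos = token
        if name != "WS":
            tokens.append((name, lexeme))
    return tokens


print(tokenize("if if8 x 42"))
```

Here `if` becomes an `IF` token because both `IF` and `ID` match two characters and `IF` is listed first, while `if8` becomes an `ID` because the `ID` rule matches a longer prefix.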

Nondeterministic Finite Automata

When the machine is in a given state and
reads the next input symbol, we know what the next state will be—it is determined. We call this deterministic computation. In a nondeterministic machine,
several choices may exist for the next state at any point.

Differences Between Nondeterministic and Deterministic Finite Automata


 - The difference between a deterministic finite automaton, abbreviated
   DFA, and a nondeterministic finite automaton, abbreviated NFA, is
   immediately apparent. First, every state of a DFA always has exactly
   one exiting transition arrow for each symbol in the alphabet. The NFA
   shown in Figure 1.27 violates that rule. State q1 has one exiting
   arrow for 0, but it has two for 1; q2 has one arrow for 0, but it has
   none for 1. In an NFA, a state may have zero, one, or many exiting
   arrows for each alphabet symbol.
 - Second, in a DFA, labels on the transition arrows are symbols from
   the alphabet. This NFA has an arrow with the label ε. In general, an
   NFA may have arrows labeled with members of the alphabet or ε. Zero,
   one, or many arrows may exit from each state with the label ε.
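An NFA can be simulated by tracking the set of all states it could currently be in. The figure itself is not reproduced here, so the transition table below is an assumption: it is one table consistent with the description above (q1 has one arrow for 0 and two for 1; q2 has one arrow for 0, none for 1, and an ε arrow). The names `DELTA`, `epsilon_closure`, and `nfa_accepts` are hypothetical.

```python
EPSILON = ""  # the empty string stands in for the label ε

# Assumed transition table: each entry maps to a SET of possible next states.
DELTA = {
    ("q1", "0"): {"q1"},
    ("q1", "1"): {"q1", "q2"},   # two arrows for the same symbol
    ("q2", "0"): {"q3"},
    ("q2", EPSILON): {"q3"},     # an arrow labeled ε; no arrow for 1
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"},
    ("q4", "1"): {"q4"},
}
START = "q1"
ACCEPT = {"q4"}


def epsilon_closure(states: set) -> set:
    """All states reachable from `states` using only ε arrows."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPSILON), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_accepts(w: str) -> bool:
    """Accept if at least one sequence of choices ends in an accept state."""
    current = epsilon_closure({START})
    for symbol in w:
        moved = set()
        for q in current:
            # A missing entry means zero arrows: that branch simply dies.
            moved |= DELTA.get((q, symbol), set())
        current = epsilon_closure(moved)
    return bool(current & ACCEPT)


for w in ["101", "11", "100", ""]:
    print(repr(w), nfa_accepts(w))
```

With this assumed table, the machine accepts exactly the strings that contain `101` or `11` as a substring.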

Uses of Nondeterministic Finite Automata

Nondeterministic finite automata are useful in several respects. As we will
show, every NFA can be converted into an equivalent DFA, and constructing
NFAs is sometimes easier than directly constructing DFAs. An NFA may be much
smaller than its deterministic counterpart, or its functioning may be easier to
understand. Nondeterminism in finite automata is also a good introduction
to nondeterminism in more powerful computational models because finite automata are especially easy to understand.

DFAs and NFAs Are Equivalent

Deterministic and nondeterministic finite automata recognize the same class of
languages. Such equivalence is both surprising and useful. It is surprising because NFAs appear to have more power than DFAs, so we might expect that NFAs
recognize more languages. It is useful because describing an NFA for a given
language sometimes is much easier than describing a DFA for that language.
Say that two machines are equivalent if they recognize the same language.
Every nondeterministic finite automaton has an equivalent deterministic finite
automaton.
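One standard way to obtain the equivalent DFA is the subset construction, in which each DFA state is a set of NFA states. The sketch below names that technique explicitly because the passage above does not spell out a construction. The NFA table and every identifier are assumptions for illustration. The script then compares the two machines on all strings up to length 6 as a sanity check; that is a check, not a proof.

```python
from itertools import product

EPSILON = ""
ALPHABET = ("0", "1")

# Assumed example NFA: accepts strings containing 101 or 11.
NFA_DELTA = {
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"}, ("q2", EPSILON): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}
NFA_START = "q1"
NFA_ACCEPT = {"q4"}


def epsilon_closure(states) -> frozenset:
    """States reachable through ε arrows, as a hashable frozenset."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in NFA_DELTA.get((q, EPSILON), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def step(states, symbol) -> frozenset:
    """Read one symbol from every state in the set, then take ε arrows."""
    moved = set()
    for q in states:
        moved |= NFA_DELTA.get((q, symbol), set())
    return epsilon_closure(moved)


def nfa_accepts(w: str) -> bool:
    current = epsilon_closure({NFA_START})
    for symbol in w:
        current = step(current, symbol)
    return bool(current & NFA_ACCEPT)


def subset_construction():
    """Build a DFA whose states are the reachable sets of NFA states."""
    start = epsilon_closure({NFA_START})
    delta, seen, worklist = {}, {start}, [start]
    while worklist:
        s = worklist.pop()
        for a in ALPHABET:
            t = step(s, a)
            delta[(s, a)] = t
            if t not in seen:
                seen.add(t)
                worklist.append(t)
    # A subset is accepting if it contains at least one NFA accept state.
    accept = {s for s in seen if s & NFA_ACCEPT}
    return seen, delta, start, accept


def dfa_accepts(dfa, w: str) -> bool:
    _, delta, state, accept = dfa
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accept


dfa = subset_construction()
words = ["".join(p) for n in range(7) for p in product(ALPHABET, repeat=n)]
print(all(nfa_accepts(w) == dfa_accepts(dfa, w) for w in words))
```

Only subsets reachable from the start state are built, so the resulting DFA is often far smaller than the worst case of 2^|Q| states.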