Formal Languages and Automata 03: Finite Automata

Hurry_11

本文链接：https://blog.csdn.net/sajasdf/article/details/127943602

Finite Automata

What is a Finite Automaton?

A formal system
Remembers only a finite amount of information
Information represented by its state
State changes in response to inputs
Rules that tell how the state changes in response to inputs are called transitions

Tennis

Acceptance of Inputs

Given a sequence of inputs, start in the start state and follow the transition from each symbol in turn
Input is accepted if you wind up in a final state after all inputs have been read

Language of an Automaton

The set of strings accepted by an automaton A is the language of A.
Denoted L(A).
Different sets of final states -> different languages.
Example: As designed, L(Tennis) = strings that determine the winner.

Deterministic Finite Automata

Alphabets, Strings, and Languages
Transition Graphs and Tables
Some Proof Techniques

Alphabets

An alphabet is any finite set of symbols

Strings

A string over an alphabet $\Sigma$ is a list, each element of which is a member of $\Sigma$
$\Sigma^*$ = set of all strings over alphabet $\Sigma$
The length of a string is its number of positions
$\epsilon$ stands for the empty string (string of length 0).

Languages

A language is a subset of $\Sigma^*$ for some alphabet $\Sigma$

Deterministic Finite Automata

A formalism for defining languages, consisting of:
1. A finite set of states ($Q$, typically)
2. An input alphabet ($\Sigma$, typically)
3. A transition function ($\delta$, typically)
4. A start state ($q_0$ in $Q$, typically)
5. A set of final states ($F \subseteq Q$, typically)

The Transition Function

Takes two arguments: a state and an input symbol
$\delta(q,a)$ = the state that the DFA goes to when it is in state $q$ and input $a$ is received.
Note: $\delta$ is a total function: there is always a next state. If a state has no transition on some symbol, add a dead state and send the missing transition there.
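
The dead-state trick can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `make_total`, the state name `"dead"`, and the tiny partial automaton are all hypothetical and do not come from the lecture.

```python
def make_total(states, alphabet, delta, dead="dead"):
    """Return (states, delta) where delta has an entry for every (state, symbol).

    Missing transitions are redirected to a non-accepting dead state,
    which loops to itself on every symbol.
    """
    total = dict(delta)
    all_states = set(states)
    missing = [(q, a) for q in states for a in alphabet if (q, a) not in delta]
    if missing:
        all_states.add(dead)
        for q, a in missing:
            total[(q, a)] = dead
        for a in alphabet:
            total[(dead, a)] = dead
    return all_states, total


# Hypothetical partial automaton: any number of a's followed by exactly one b.
partial = {("s", "a"): "s", ("s", "b"): "t"}
states, delta = make_total({"s", "t"}, {"a", "b"}, partial)
```

After the call, every pair (state, symbol) has a next state, so `delta` is a total function as the definition requires.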

Graph Representation of DFA’s

Nodes = states
Arc represents transition function
- Arc from state p to state q labeled by all those input symbols that have transitions from p to q
Arrow labeled “Start” to the start state.
Final states indicated by double circles.

Example: Recognizing Strings Ending in “ing”

Alternative Representation: Transition Table

Convention: Strings and Symbols

… w,x,y,z are strings.
a,b,c,… are single input symbols

Extended Transition Function

We describe the effect of a string of inputs on a DFA by extending $\delta$ to a state and a string.
Intuition: Extended $\delta$ is computed for state q and inputs $a_1a_2...a_n$ by following a path in the transition graph, starting at q and selecting the arcs with labels $a_1,a_2,...,a_n$ in turn.

Inductive Definition of Extended $ \delta $

Induction on length of string.
Basis: $\delta(q,\epsilon) \: = q$
Induction: $\delta(q,wa) = \delta(\delta(q,w),a)$
- Remember: w is a string; a is an input symbol, by convention.
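
The inductive definition translates almost literally into a recursive function. The sketch below assumes a hypothetical three-state DFA over {0,1} (states A, B, C), stored as a Python dictionary; the table `DELTA` is an assumption made for illustration.

```python
# Hypothetical DFA transition table: (state, symbol) -> state.
DELTA = {
    ("A", "0"): "A", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "C",
    ("C", "0"): "C", ("C", "1"): "C",
}


def delta_hat(q, w):
    """Extended transition function, following the induction on |w|."""
    if w == "":
        return q                      # basis: delta(q, epsilon) = q
    x, a = w[:-1], w[-1]              # w = xa, a is the last symbol
    return DELTA[(delta_hat(q, x), a)]  # induction: delta(delta(q, x), a)
```

For example, `delta_hat("A", "0101")` follows the path A, A, B, A, B and returns `"B"`.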

Delta-hat

We don’t distinguish between the given delta and the extended delta or delta-hat.
The reason:
$\hat{\delta}(q,a) = \delta(\hat{\delta}(q,\epsilon),a) = \delta(q,a)$

Language of a DFA

Automata of all kinds define languages.
If A is an automaton, L(A) is its language.
For a DFA A, L(A) is the set of strings labeling paths from the start state to a final state.
Formally: L(A) = the set of strings w such that $\delta(q_0,w)$ is in F.
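
A minimal sketch of the formal definition: a string is in L(A) exactly when running the DFA from the start state ends in a final state. The DFA used here (states A, B, C with final states A and B) is an assumed example, and `language_up_to` is a hypothetical helper that lists the accepted strings up to a given length.

```python
from itertools import product

# Assumed example DFA over {0, 1}.
DELTA = {
    ("A", "0"): "A", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "C",
    ("C", "0"): "C", ("C", "1"): "C",
}
START, FINALS = "A", {"A", "B"}


def accepts(w):
    """True if delta(q0, w) is in F."""
    q = START
    for a in w:
        q = DELTA[(q, a)]
    return q in FINALS


def language_up_to(n, alphabet="01"):
    """All accepted strings of length at most n, shortest first."""
    return [
        "".join(t)
        for k in range(n + 1)
        for t in product(alphabet, repeat=k)
        if accepts("".join(t))
    ]
```

With this DFA, `language_up_to(2)` lists every string of length at most 2 except `"11"`.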

Proofs of Set Equivalence

Often, we need to prove that two descriptions of sets are in fact the same set.
Here, one set is “the language of this DFA,” and the other is “the set of strings of 0’s and 1’s with no consecutive 1’s.”
In general, to prove S = T, we need to prove two parts: $S \subseteq T$ and $T \subseteq S$. That is:
1. If w is in S, then w is in T.
2. If w is in T, then w is in S.
Here, S = the language of our running DFA, and T = “no consecutive 1’s.”

Part 1: $S\subseteq T$

To prove: if w is accepted by the DFA, then w has no consecutive 1’s.
Proof is an induction on length of w.
Important trick: Expand the inductive hypothesis to be more detailed than the statement you are trying to prove.

The Inductive Hypothesis

1. If $\delta(A, w) = A$, then w has no consecutive 1’s and does not end in 1.
2. If $\delta(A, w) = B$, then w has no consecutive 1’s and ends in a single 1.
Basis: |w| = 0; i.e., w = $\epsilon$.
(1) holds since $\epsilon$ has no 1’s at all.
(2) holds vacuously, since $\delta(A, \epsilon)$ is not B. (When the “if” part is false, the implication is automatically true.)

Inductive Step

Assume (1) and (2) are true for strings shorter than w, where |w| is at least 1
Because w is not empty, we can write w = xa, where a is the last symbol of w, and x is the string that precedes a
IH is true for x
Need to prove (1) and (2) for w = xa
(1) for w is: If $\delta(A,w) = A$, then w has no consecutive 1’s and does not end in 1
Since $\delta(A,w) = A$, $\delta(A,x)$ must be A or B, and a must be 0
By the IH, x has no 11’s
Thus, w has no 11’s and does not end in 1
Now, prove (2) for w = xa: If $\delta(A,w) = B$, then w has no 11’s and ends in 1
Since $\delta(A,w) = B$, $\delta(A,x)$ must be A, and a must be 1
By the IH, x has no 11’s and does not end in 1
Thus, w has no 11’s and ends in 1

Part 2: $T\subseteq S$

Now, we must prove: if w has no 11’s, then w is accepted by the DFA
Contrapositive: If w is not accepted by the DFA, then w has 11

[Image not available: https://s2.loli.net/2022/09/28/ak7rSsdQ4tmyjiP.png]

Using the Contrapositive

The only way w is not accepted is if it gets to C

The only way to get to C is if w = x1y, where x gets to B, the 1 takes B to C, and y is the tail of w
If $\delta(A,x) = B$, then surely x = z1 for some z
Thus, w = z11y and has 11
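
Both directions of the proof, plus the strengthened inductive hypothesis, can be sanity-checked by brute force on short strings. The sketch below assumes the running DFA is the three-state machine that the proof text implies (A and B accepting, C a dead state reached from B on 1); since the figure is not reproduced here, that transition table is a reconstruction, not a quotation. An exhaustive check up to a fixed length is evidence, not a proof.

```python
from itertools import product

# Reconstructed from the proof text (assumption): A is the start state.
DELTA = {
    ("A", "0"): "A", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "C",
    ("C", "0"): "C", ("C", "1"): "C",
}
FINALS = {"A", "B"}


def run(w, q="A"):
    for a in w:
        q = DELTA[(q, a)]
    return q


def first_counterexample(max_len):
    """Return a string violating S = T or the inductive hypothesis, else None."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            w = "".join(t)
            state = run(w)
            in_s = state in FINALS        # w is accepted by the DFA
            in_t = "11" not in w          # w has no consecutive 1's
            if in_s != in_t:
                return w
            if state == "A" and w.endswith("1"):
                return w                  # violates hypothesis (1)
            if state == "B" and not w.endswith("1"):
                return w                  # violates hypothesis (2)
    return None
```

Calling `first_counterexample(10)` searches all 2047 strings of length at most 10.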

Regular Languages

Language L is regular if it is the language accepted by some DFA
- Note: the DFA must accept only the strings in L, no others
Some languages are not regular
- Intuitively, regular languages “cannot count” to arbitrarily high integers

Example: A Nonregular Language

$L_1 = \{0^n 1^n | n \ge 1\}$

Note: $a^i$ is conventional for i a’s
Read: “The set of strings consisting of n 0’s followed by n 1’s, such that n is at least 1”
Thus, $L_1 = \{ 01,0011,000111,...\}$
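Proof sketch:
Suppose there is a DFA with m states that accepts $L_1$
Consider its run on $0^m 1^m$, written as (state, remaining input) pairs:
$S_0\, 0^m 1^m \rightarrow S_1\, 0^{m-1} 1^m \rightarrow \cdots \rightarrow S_{2m-1}\, 1 \rightarrow S_{2m}$
The first m moves visit the m+1 states $S_0, \ldots, S_m$
By the pigeonhole principle (PHP), at least one state occurs more than once among them
Suppose that state is q
$S_i = S_j = q$ for some $0 \le i < j \le m$
$S_0\, 0^m 1^m \rightarrow \cdots \rightarrow q\, 0^{m-i}1^m \rightarrow \cdots \rightarrow q\, 0^{m-j}1^m \rightarrow \cdots \rightarrow S_{2m}$
Now consider the input $0^{m-j+i} 1^m$: skipping the loop from $S_i$ to $S_j$, the DFA again ends in $S_{2m}$ and accepts, yet this string has fewer 0’s than 1’s and is not in $L_1$, a contradiction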
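
The pigeonhole step can be watched in action on any concrete DFA. The sketch below uses a hypothetical 3-state DFA (a counter modulo 3 on the symbol 0, which ignores 1); the machine and the helper names are assumptions for illustration. It finds indices i < j ≤ m with the same state after $0^i$ and $0^j$, then confirms that $0^m1^m$ and the shorter $0^{m-j+i}1^m$ end in the same state, so the DFA cannot tell them apart.

```python
def run(delta, q, w):
    for a in w:
        q = delta[(q, a)]
    return q


def find_repeat(delta, start, m):
    """Return (i, j), i < j <= m, such that the states after 0^i and 0^j coincide."""
    seen = {}
    q = start
    for k in range(m + 1):          # m + 1 visited states, only m distinct ones exist
        if q in seen:
            return seen[q], k
        seen[q] = k
        q = delta[(q, "0")]
    return None


M = 3                                # number of states of the hypothetical DFA
DELTA = {(q, "0"): (q + 1) % M for q in range(M)}
DELTA.update({(q, "1"): q for q in range(M)})

i, j = find_repeat(DELTA, 0, M)
long_input = "0" * M + "1" * M
short_input = "0" * (M - (j - i)) + "1" * M
same_state = run(DELTA, 0, long_input) == run(DELTA, 0, short_input)
```

Whatever the DFA, `same_state` comes out true, which is exactly why no DFA can accept $0^m1^m$ while rejecting every string with unequal counts.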

Example: A Regular Language

$L_3 = \{ w \mid w \in \{0,1\}^*$ and $w$, viewed as a binary integer, is divisible by 23$\}$

The DFA:
- 23 states, named 0, 1,…, 22
- Correspond to the 23 remainders of an integer divided by 23
- Start and only final state is 0

Transitions of the DFA for $L_3$

If string w represents integer i, then assume $\delta (0,w) = i \% 23$
Then w0 represents integer 2i, so we want $\delta(i \% 23,0) = (2i) \% 23$
Similarly: w1 represents 2i+1, so we want $\delta(i\% 23,1) = (2i + 1)\% 23$
Example: $\delta(15,0) = 30 \% 23 = 7; \delta(11,1) = 23\% 23 = 0$
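
The transition rule is short enough to write out directly. In this sketch the state is the remainder so far, and the empty string is treated as the integer 0 (an assumption, since the notes do not say how $\epsilon$ is interpreted).

```python
def delta(r, bit):
    """Transition of the 23-state DFA: appending a bit doubles the value and adds the bit."""
    return (2 * r + int(bit)) % 23


def accepts(w):
    """Run the DFA from state 0; accept if the final remainder is 0."""
    r = 0
    for bit in w:
        r = delta(r, bit)
    return r == 0
```

For instance, `accepts("101110")` is true because 101110 is 46 in binary, and 46 = 2 × 23.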

Another Example

$L_4 = \{ w \mid w \in \{0,1\}^*$ and $w$, viewed as the reverse of a binary integer, is divisible by 23$\}$

Example: 01110100 is in $L_4$, because its reverse, 00101110, is 46 in binary, and 46/23 = 2
Hard to construct the DFA
But there is a theorem that says the reverse of a regular language is also regular
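
Membership in $L_4$ is easy to test directly even though the DFA is hard to draw. The function below is a plain membership check, not a DFA: it reverses the string and reuses ordinary integer arithmetic. Treating the empty string as 0 is an assumption.

```python
def in_l4(w):
    """True if the reverse of w, read as a binary integer, is divisible by 23."""
    reversed_bits = w[::-1] or "0"   # empty string treated as 0 (assumption)
    return int(reversed_bits, 2) % 23 == 0
```

`in_l4("01110100")` reverses the input to `00101110`, which is 46, and returns true.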

Nondeterministic Finite Automata

Nondeterminism

A nondeterministic finite automaton has the ability to be in several states at once
Transitions from a state on an input symbol can be to any set of states
Start in one start state
Accept if any sequence of choices leads to a final state
Intuitively: the NFA always “guesses right”

Example: Moves on a Chessboard

States = squares
Inputs = r (move to an adjacent red square) and b (move to an adjacent black square)
Start state, final state are in opposite corners

Formal NFA

A finite set of states, typically Q
An input alphabet, typically $\Sigma$
A transition function, typically $\delta$
A start state in Q, typically $q_0$
A set of final states $\subseteq Q$

Transition Function of NFA

$\delta(q,a) $ is a set of states
Extend to strings as follows
Basis: $\delta(q,\epsilon) = \{ q\}$
Induction: $\delta(q,wa)$ = the union over all states p in $\delta(q,w)$ of $\delta(p,a)$

Language of an NFA

A string w is accepted by an NFA if $\delta{(q_0,w)}$ contains at least one final state
The language of the NFA is the set of strings it accepts

Example: Language of an NFA

For our chessboard NFA we saw rbb is accepted
If the input consists of only b’s, the set of accessible states alternates between {5} and {1,3,7,9}, so only even-length, nonempty strings of b’s are accepted
What about strings with at least one r?
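
The chessboard NFA is small enough to simulate, which makes it easy to experiment with strings containing r. The board figure is not reproduced in these notes, so the layout below is an assumption: a 3×3 board numbered 1 to 9 row by row, odd squares black and even squares red, adjacency including diagonals, start state 1 and final state 9. This layout is consistent with the sets {5} and {1,3,7,9} mentioned above.

```python
def color(square):
    """Assumed coloring: odd squares are black ('b'), even squares are red ('r')."""
    return "b" if square % 2 == 1 else "r"


def neighbors(square):
    """Squares adjacent to `square` on the 3x3 board, diagonals included."""
    row, col = divmod(square - 1, 3)
    return {
        3 * r + c + 1
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
        if 0 <= r < 3 and 0 <= c < 3 and (r, c) != (row, col)
    }


def delta(q, a):
    """NFA transition: all adjacent squares of the requested color."""
    return {n for n in neighbors(q) if color(n) == a}


def delta_hat(q, w):
    """Extended transition: union of delta(p, a) over all current states p."""
    states = {q}
    for a in w:
        states = set().union(*(delta(p, a) for p in states))
    return states


def accepts(w, start=1, final=9):
    return final in delta_hat(start, w)
```

With these assumptions, `accepts("rbb")` is true, and strings such as `"rb"` or `"rr"` can be tried to explore the question about inputs containing r.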

Equivalence of DFA’s, NFA’s

Part 1

A DFA can be turned into an NFA that accepts the same language
If $\delta_D (q,a)$ = p, let the NFA have $\delta_N (q,a)$ = {p}
Then the NFA is always in a set containing exactly one state - the state the DFA is in after reading the same input

Part 2

Surprisingly, for any NFA there is a DFA that accepts the same language
Proof is the subset construction
The number of states of the DFA can be exponential in the number of states of the NFA
Thus, NFA’s accept exactly the regular languages

Subset Construction

Given an NFA with states Q, inputs $\Sigma$ , transition function $\delta _{N}$ , start state $q_0$ , and final states F, construct equivalent DFA with:
- States $2^Q$ (Set of subsets of Q)
- Inputs $\Sigma$
- Start state $\{ q_0 \}$
- Final states = all those with a member of F

Critical Point

The DFA states have names that are sets of NFA states
But as a DFA state, an expression like $\{ p,q \}$ must be understood to be a single symbol, not as a set
Analogy: a class of object whose values are sets of objects of another class
The transition function $\delta_D$ is defined by:

$\delta_D (\{ q_1, \ldots, q_k\}, a)$ is the union over all $i = 1, \ldots, k$ of $\delta_N (q_i,a)$
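
A minimal sketch of the subset construction follows. One deliberate deviation, stated up front: instead of creating all $2^Q$ subsets, it only builds the subsets reachable from $\{q_0\}$, which accepts the same language. The three-state NFA used as input (strings ending in 01) is a hypothetical example, and `frozenset` plays the role of "a set used as a single symbol".

```python
def subset_construction(alphabet, delta_n, start, finals):
    """Build the reachable part of the DFA equivalent to an NFA."""
    start_d = frozenset({start})
    delta_d = {}
    seen, todo = {start_d}, [start_d]
    while todo:
        S = todo.pop()
        for a in alphabet:
            # delta_D(S, a) = union of delta_N(q, a) over all q in S
            T = frozenset(p for q in S for p in delta_n.get((q, a), ()))
            delta_d[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    finals_d = {S for S in seen if S & finals}   # subsets with a member of F
    return seen, delta_d, start_d, finals_d


def dfa_accepts(delta_d, start_d, finals_d, w):
    S = start_d
    for a in w:
        S = delta_d[(S, a)]
    return S in finals_d


# Hypothetical NFA accepting strings over {0, 1} that end in 01.
NFA = {("p", "0"): {"p", "q"}, ("p", "1"): {"p"}, ("q", "1"): {"r"}}
states_d, delta_d, start_d, finals_d = subset_construction("01", NFA, "p", {"r"})
```

For this NFA only three subsets are reachable: {p}, {p,q} and {p,r}.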

Example

Chessboard

Proof of Equivalence

Basis

The proof is almost a pun
Show by induction on |w| that $\delta_N (q_0, w) = \delta_D(\{q_0\},w)$
Basis: w = $\epsilon$: $\delta_N (q_0, \epsilon) = \delta_D (\{q_0\},\epsilon) = \{ q_0\}$

Induction

Assume IH for strings shorter than w
Let w = xa; IH holds for x
Let $\delta_N (q_0,x) = \delta_D(\{q_0\},x)$ = S
Let T = the union over all states p in S of $\delta_N(p,a)$
Then $\delta_N(q_0,w) = \delta_D(\{q_0\},w)$ = T

But

The subset construction may lead to a bad case: exponential growth in the number of DFA states
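
One classic family that exhibits the blow-up, not taken from these notes, is the language "the n-th symbol from the end is 1". An NFA needs only n+1 states for it, while the subset construction reaches $2^n$ subsets. The sketch below builds that NFA and counts the reachable subsets; the function names are hypothetical.

```python
def nth_from_end_nfa(n):
    """NFA with states 0..n for: the n-th symbol from the end is 1 (final state n)."""
    delta = {(0, "0"): {0}, (0, "1"): {0, 1}}   # state 0 guesses where the 1 is
    for i in range(1, n):
        delta[(i, "0")] = {i + 1}               # then count the remaining symbols
        delta[(i, "1")] = {i + 1}
    return delta


def count_reachable_subsets(delta, start, alphabet="01"):
    """Number of DFA states the subset construction actually reaches."""
    start_set = frozenset({start})
    seen, todo = {start_set}, [start_set]
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(p for q in S for p in delta.get((q, a), ()))
            if T not in seen:
                seen.add(T)
                todo.append(T)
    return len(seen)


sizes = [count_reachable_subsets(nth_from_end_nfa(n), 0) for n in range(1, 6)]
```

Each extra NFA state doubles the number of reachable DFA states in this family.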

NFA’s With $\epsilon$-Transitions

We can allow state-to-state transitions on $\epsilon$ input
These transitions are done spontaneously, without looking at the input string
A convenience at times, but still only regular languages are accepted

$\epsilon$-NFA

Closure of States

CL(q) = set of states you can reach from state q following only arcs labeled $\epsilon$
CL(A) = {A}
CL(E) = {B,C,D,E}
Closure of a set of states = union of the closure of each state
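
Closure is a plain graph search along $\epsilon$ arcs. The example automaton's figure is not reproduced here, so the $\epsilon$ arcs below (E to B, E to C, B to D) are an assumption chosen to agree with the two closures listed above.

```python
# Assumed epsilon arcs: state -> set of states reachable by one epsilon move.
EPS = {"E": {"B", "C"}, "B": {"D"}}


def closure(eps, q):
    """CL(q): all states reachable from q using only epsilon arcs (q included)."""
    seen, stack = {q}, [q]
    while stack:
        p = stack.pop()
        for r in eps.get(p, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def closure_of_set(eps, states):
    """Closure of a set = union of the closures of its members."""
    return set().union(*(closure(eps, q) for q in states))
```

Note that a state always belongs to its own closure, which is why `closure(EPS, "A")` is `{"A"}` even though A has no $\epsilon$ arcs.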

Extended Delta

Intuition: $\hat{\delta} (q,w)$ is the set of states you can reach from q following a path labeled w
Basis: $\hat{\delta} (q,\epsilon) = CL(q)$
Induction: $\hat{\delta}(q,xa)$ is computed by:
- Start with $\hat\delta(q,x) $ = S
- Take the union of CL( $\delta(p,a)$ ) for all p in S
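
The two-step induction (move on a, then close) can be sketched as follows. The six-state $\epsilon$-NFA used here is an assumed example; its symbol arcs and $\epsilon$ arcs are hypothetical, picked so that CL(E) = {B,C,D,E}.

```python
# Assumed epsilon-NFA: symbol arcs and epsilon arcs are kept in separate tables.
DELTA = {
    ("A", "1"): {"B"}, ("A", "0"): {"E"},
    ("B", "1"): {"C"}, ("C", "1"): {"D"},
    ("E", "0"): {"F"}, ("F", "0"): {"D"},
}
EPS = {"E": {"B", "C"}, "B": {"D"}}


def closure(states):
    """Epsilon-closure of a set of states."""
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in EPS.get(p, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def delta_hat(q, w):
    """Extended delta for the epsilon-NFA."""
    S = closure({q})                       # basis: CL(q)
    for a in w:
        step = set()
        for p in S:
            step |= DELTA.get((p, a), set())
        S = closure(step)                  # union of CL(delta(p, a)) for p in S
    return S
```

For example, reading 0 from A lands in E, whose closure is {B,C,D,E}; reading 1 next moves B to C and C to D, giving {C,D}.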

Equivalence of NFA, $\epsilon$-NFA

Every NFA is an $\epsilon$-NFA
- It just has no transitions on $\epsilon$
Converse requires us to take an $\epsilon$-NFA and construct an NFA that accepts the same language
We do so by combining $\epsilon$-transitions with the next transition on a real input
Start with an $\epsilon$-NFA with states Q, inputs $\Sigma$, start state $q_0$, final states F, and transition function $\delta_E$
Construct an “ordinary” NFA with states Q, inputs $\Sigma$, start state $q_0$, final states F’, and transition function $\delta_N$
Compute $\delta_N(q,a) $ as follows:
1. Let S =CL(q)
2. $\delta_N(q,a)$ is the union over all p in S of $\delta_E(p,a)$
F’ = the set of states q such that CL(q) contains a state of F
Prove by induction on |w| that CL($\delta_N(q_0,w)$) = $\hat{\delta}_E (q_0,w)$
Thus, the $\epsilon$-NFA accepts w if and only if the “ordinary” NFA does
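
The construction and the claimed invariant can be checked on a small machine. The $\epsilon$-NFA below (states A to F, start A, final state D) is an assumed example with hypothetical arcs; the code builds $\delta_N$ and F’ exactly as described and then compares both automata on all short strings. A bounded check like this illustrates the induction but does not replace it.

```python
from itertools import product

STATES, ALPHABET = "ABCDEF", "01"
DELTA_E = {
    ("A", "1"): {"B"}, ("A", "0"): {"E"},
    ("B", "1"): {"C"}, ("C", "1"): {"D"},
    ("E", "0"): {"F"}, ("F", "0"): {"D"},
}
EPS = {"E": {"B", "C"}, "B": {"D"}}
START, FINALS = "A", {"D"}


def closure(states):
    seen, stack = set(states), list(states)
    while stack:
        p = stack.pop()
        for r in EPS.get(p, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def delta_hat_e(q, w):
    """Extended delta of the epsilon-NFA."""
    S = closure({q})
    for a in w:
        S = closure(set().union(*(DELTA_E.get((p, a), set()) for p in S)))
    return S


# delta_N(q, a) = union over p in CL(q) of delta_E(p, a)
DELTA_N = {
    (q, a): set().union(*(DELTA_E.get((p, a), set()) for p in closure({q})))
    for q in STATES
    for a in ALPHABET
}
# F' = states whose closure contains a state of F
FINALS_N = {q for q in STATES if closure({q}) & FINALS}


def delta_hat_n(q, w):
    """Extended delta of the ordinary NFA (no epsilon moves)."""
    S = {q}
    for a in w:
        S = set().union(*(DELTA_N[(p, a)] for p in S))
    return S


def agree(max_len):
    """Check CL(delta_N(q0, w)) == delta_hat_E(q0, w) and equal acceptance."""
    for n in range(max_len + 1):
        for t in product(ALPHABET, repeat=n):
            w = "".join(t)
            s_n, s_e = delta_hat_n(START, w), delta_hat_e(START, w)
            if closure(s_n) != s_e:
                return False
            if bool(s_n & FINALS_N) != bool(s_e & FINALS):
                return False
    return True
```

In this example F’ grows from {D} to {B, D, E}, because B and E can reach D on $\epsilon$ moves alone.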