INT201-Decision, Computation and Language(1)


1. DFA

A deterministic finite automaton (DFA) is a finite-state machine that accepts or rejects a given string of symbols by running through a state sequence uniquely determined by the string.

A DFA is defined as a 5-tuple $M=(Q,\Sigma,\delta,q,F)$, where:

  1. $Q$ is a finite set of states.
  2. $\Sigma$ is a finite set of symbols, called the alphabet of the automaton.
  3. $\delta:Q\times\Sigma\rightarrow Q$ is a function, called the transition function.
  4. $q\in Q$ is called the initial state.
  5. $F\subseteq Q$ is a set of accepting / terminal states.

We can extend the definition of the transition function $\delta$ so that it tells us which state we reach after a word $w\in\Sigma^*$ (not just a single letter) has been scanned. Extend the map $\delta:Q\times\Sigma\rightarrow Q$ to $\delta^*:Q\times\Sigma^*\rightarrow Q$ by defining:
$$\begin{aligned}\delta^*(q,\epsilon)&=q&&\text{for all }q\in Q\\\delta^*(q,wa)&=\delta(\delta^*(q,w),a)&&\text{for all }q\in Q,\ w\in\Sigma^*,\ a\in\Sigma\\\delta^*(q,vw)&=\delta^*(\delta^*(q,v),w)&&\text{for all }q\in Q,\ v,w\in\Sigma^*\end{aligned}$$

1.1 Language defined by DFA

Suppose we have a DFA $M$. A word $w\in\Sigma^*$ is said to be accepted or recognized by $M$ if $\delta^*(q_0,w)\in F$, where $q_0$ denotes the initial state of $M$; otherwise it is said to be rejected. The set of all words accepted by $M$ is called the language accepted by $M$ and is denoted by $L(M)$:
$$L(M)=\{w\in\Sigma^*:\delta^*(q_0,w)\in F\}$$
Any finite language is accepted by some DFA.
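The extended transition function and the acceptance condition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the dictionary encoding of $\delta$, the function names `delta_star` and `accepts`, and the example automaton `EVEN_ONES` are all assumptions made for the sketch.

```python
from typing import Dict, Set, Tuple

State = str
Symbol = str
Delta = Dict[Tuple[State, Symbol], State]


def delta_star(delta: Delta, q: State, w: str) -> State:
    """Extended transition function: apply delta once per symbol of w."""
    for a in w:
        q = delta[(q, a)]
    return q


def accepts(delta: Delta, start: State, accepting: Set[State], w: str) -> bool:
    """w is in L(M) exactly when delta_star(start, w) lands in an accepting state."""
    return delta_star(delta, start, w) in accepting


# Hypothetical example DFA over {0, 1}: accepts words with an even number of 1s.
EVEN_ONES: Delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}

print(accepts(EVEN_ONES, "even", {"even"}, "1001"))
print(accepts(EVEN_ONES, "even", {"even"}, "1011"))
```

The loop in `delta_star` is the recursive definition $\delta^*(q,wa)=\delta(\delta^*(q,w),a)$ unrolled from left to right.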


1.2 Regular operations on languages

A language $A$ is called regular if there exists a finite automaton $M$ such that $A=L(M)$.

Let $A$ and $B$ be two languages over the same alphabet.
The union of $A$ and $B$ is defined as:
$$A\cup B=\{w:w\in A\text{ or }w\in B\}$$
The concatenation of $A$ and $B$ is defined as:
$$AB=\{ww^\prime:w\in A\text{ and }w^\prime\in B\}$$
The Kleene star of $A$ is defined as:
$$A^*=\bigcup_{i\in\mathbb{N}}A_i=A_0\cup A_1\cup A_2\cup\cdots$$
where
$$\begin{aligned}A_0&=\{\epsilon\}\\A_1&=A\\A_{i+1}&=\{wv:w\in A_i,v\in A\}\end{aligned}$$
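For finite languages, the three operations can be tried out directly on Python sets of strings. The sketch below is only an illustration; the function names are made up here, and since $A^*$ is infinite in general, `star_up_to` computes just the finite approximation $A_0\cup A_1\cup\cdots\cup A_n$.

```python
from typing import Set


def union(A: Set[str], B: Set[str]) -> Set[str]:
    """A ∪ B."""
    return A | B


def concat(A: Set[str], B: Set[str]) -> Set[str]:
    """AB = {ww' : w in A and w' in B}."""
    return {w + v for w in A for v in B}


def star_up_to(A: Set[str], n: int) -> Set[str]:
    """A_0 ∪ A_1 ∪ ... ∪ A_n, a finite approximation of A*."""
    layer = {""}  # A_0 = {ε}; the empty string stands for ε
    result = set(layer)
    for _ in range(n):
        layer = concat(layer, A)  # A_{i+1} = {wv : w in A_i, v in A}
        result |= layer
    return result


print(sorted(star_up_to({"ab"}, 2)))
```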



2. NFA

A finite automaton is deterministic if the next state the machine goes to on any given symbol is uniquely determined.

A DFA has exactly one transition leaving each state for each symbol.

A finite automaton is nondeterministic if the machine allows several choices, or no choice at all, for the next state on a given symbol. For a state $q$ and a symbol $s\in\Sigma$, an NFA can have:

  • Multiple edges leaving $q$ labelled with the same symbol $s$
  • No edge leaving $q$ labelled with symbol $s$
  • Edges leaving $q$ labelled with $\epsilon$ (taken without reading any symbol)

The machine splits into multiple copies of itself (threads):

  • Each copy proceeds with computation independently of others
  • NFA may be in a set of states, instead of a single state.
  • NFA follows all possible computation paths in parallel.
  • If a copy is in a state and the next input symbol does not appear on any outgoing edge from that state, then the copy dies or crashes.
  • The NFA accepts the input string if at least one copy ends in an accept state after reading the entire string.

For any alphabet $\Sigma$, we define $\Sigma_\epsilon=\Sigma\cup\{\epsilon\}$.
An NFA is a 5-tuple $M=(Q,\Sigma,\delta,q,F)$, where:

  1. $Q$ is a finite set of states
  2. $\Sigma$ is a finite set of symbols, called the alphabet of the automaton
  3. $\delta:Q\times\Sigma_\epsilon\rightarrow P(Q)$ is a function, called the transition function
  4. $q\in Q$ is called the initial / start state
  5. $F\subseteq Q$ is a set of accepting / terminal states

Let $M=(Q,\Sigma,\delta,q,F)$ be an NFA, and let $w\in\Sigma^*$. Since $\delta^*(q_0,w)$ is a set of states, we say that $M$ accepts $w$ if $\delta^*(q_0,w)\cap F\ne\varnothing$, where $q_0$ denotes the initial state of $M$.

For an NFA without $\epsilon$-transitions, extend the map $\delta$ to a map $\delta^*:Q\times\Sigma^*\rightarrow P(Q)$ by defining:
$$\begin{aligned}\delta^*(q,\epsilon)&=\{q\}&&\text{for all }q\in Q\\\delta^*(q,wa)&=\bigcup_{p\in\delta^*(q,w)}\delta(p,a)&&\text{for all }q\in Q,\ w\in\Sigma^*,\ a\in\Sigma\end{aligned}$$
When $\epsilon$-transitions are present, each step additionally takes the $\epsilon$-closure defined in Section 2.4.

Suppose, in a DFA, we can get from state $p$ to state $q$ via transitions labelled by the letters of a word $w$. Then we say that the states $p$ and $q$ are connected by a path with label $w$.

In an NFA, if $\delta(p,a)=\{q,r\}$ we could write: $\{p\}\stackrel{a}{\longrightarrow}\{q,r\}$

2.1 Language accepted by NFA

Let $M=(Q,\Sigma,\delta,q,F)$ be an NFA. The language $L(M)$ accepted by $M$ is defined as:
$$L(M)=\{w\in\Sigma^*:M\text{ accepts }w\}$$
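One way to make "follows all possible computation paths in parallel" concrete is to track the set of states the NFA may currently be in. The following is a minimal sketch under assumptions of this example only: $\epsilon$ is encoded as the empty string `EPS`, missing dictionary entries mean $\varnothing$, and the automaton `SECOND_LAST_IS_1` is a hypothetical illustration. The helper `eps_closure` anticipates the $\epsilon$-closure of Section 2.4.

```python
from typing import Dict, Set, Tuple

EPS = ""  # assumption of this sketch: the empty string encodes ε
Delta = Dict[Tuple[str, str], Set[str]]


def eps_closure(delta: Delta, states: Set[str]) -> Set[str]:
    """All states reachable from `states` over zero or more ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_accepts(delta: Delta, start: str, accepting: Set[str], w: str) -> bool:
    """Track every state some copy of the machine could be in."""
    current = eps_closure(delta, {start})
    for a in w:
        moved: Set[str] = set()
        for p in current:
            moved |= delta.get((p, a), set())  # a copy with no edge simply dies
        current = eps_closure(delta, moved)
    return bool(current & accepting)  # accept if at least one copy is in F


# Hypothetical NFA over {0, 1}: accepts words whose second-to-last symbol is 1.
SECOND_LAST_IS_1: Delta = {
    ("s", "0"): {"s"}, ("s", "1"): {"s", "t"},
    ("t", "0"): {"u"}, ("t", "1"): {"u"},
}

print(nfa_accepts(SECOND_LAST_IS_1, "s", {"u"}, "0110"))
```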

2.2 Equivalence of DFAs and NFAs

Two machines are equivalent if they recognize the same language.

DFA is a restricted form of NFA:

  • Every NFA has an equivalent DFA
  • We can convert an arbitrary NFA to a DFA that accepts the same language
  • DFA has the same power as NFA

2.3 DFA to NFA

The formal conversion of a DFA to an NFA is done as follows. Let $M=(Q,\Sigma,\delta,q,F)$ be a DFA, where $\delta$ is a function $\delta:Q\times\Sigma\rightarrow Q$. We define the function $\delta^\prime:Q\times\Sigma_\epsilon\rightarrow P(Q)$ as follows. For any $r\in Q$ and for any $a\in\Sigma_\epsilon$:
$$\delta^\prime(r,a)=\begin{cases}\{\delta(r,a)\}&\text{if }a\ne\epsilon\\\varnothing&\text{if }a=\epsilon\end{cases}$$
Then $N=(Q,\Sigma,\delta^\prime,q,F)$ is an NFA whose behavior is exactly the same as that of the DFA $M$. The easiest way to see this is by observing that the state diagrams of $M$ and $N$ are equal. Therefore, we have $L(M)=L(N)$.


2.4 NFA to DFA

Definition 1
The $\epsilon$-closure of a set of states $R\subseteq Q$ is:
$$E(R)=\{q\mid q\text{ can be reached from }R\text{ by travelling over zero or more }\epsilon\text{-transitions}\}$$

Definition 2
Given a set of states $R$ and a symbol $a\in\Sigma$, we define $R_a=\epsilon\text{-closure}(J)=E(J)$, where $J$ is the set of states that can be reached from some state in $R$ by travelling over a single transition labelled $a$.
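The two definitions are the building blocks of the standard subset construction, in which each DFA state is a set $R$ of NFA states and reading $a$ moves from $R$ to $R_a$. The notes do not spell the construction out, so the sketch below follows the common textbook version; the encoding ($\epsilon$ as the empty string, dictionaries for $\delta$) and all function names are assumptions of this example.

```python
from typing import Dict, FrozenSet, Set, Tuple

EPS = ""  # assumption of this sketch: the empty string encodes ε
NfaDelta = Dict[Tuple[str, str], Set[str]]
DfaDelta = Dict[Tuple[FrozenSet[str], str], FrozenSet[str]]


def eps_closure(delta: NfaDelta, states: Set[str]) -> Set[str]:
    """E(R): states reachable from R over zero or more ε-transitions."""
    closure = set(states)
    stack = list(states)
    while stack:
        p = stack.pop()
        for r in delta.get((p, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def nfa_to_dfa(delta: NfaDelta, alphabet: str, start: str, accepting: Set[str]):
    """Subset construction, exploring only subsets reachable from E({start})."""
    start_set = frozenset(eps_closure(delta, {start}))
    dfa_delta: DfaDelta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        R = todo.pop()
        for a in alphabet:
            J: Set[str] = set()
            for p in R:
                J |= delta.get((p, a), set())
            R_a = frozenset(eps_closure(delta, J))  # Definition 2: R_a = E(J)
            dfa_delta[(R, a)] = R_a
            if R_a not in seen:
                seen.add(R_a)
                todo.append(R_a)
    # A subset is accepting when it contains at least one accepting NFA state.
    dfa_accepting = {R for R in seen if R & accepting}
    return dfa_delta, start_set, dfa_accepting


# Hypothetical NFA for the language a b*: p -a-> q, q -ε-> r, r -b-> r.
NFA: NfaDelta = {("p", "a"): {"q"}, ("q", EPS): {"r"}, ("r", "b"): {"r"}}
DFA_DELTA, DFA_START, DFA_ACCEPTING = nfa_to_dfa(NFA, "ab", "p", {"r"})
print(len({R for (R, _) in DFA_DELTA}))
```

Only subsets reachable from the start set are generated, so the resulting DFA can be much smaller than the full power set $P(Q)$; the empty set acts as the dead state.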



3. Regular Language

Definition
Previous: A language is regular if it is recognized by some DFA
Now: A language is regular if and only if some NFA recognizes it
Some operations on languages: Union, Concatenation and Kleene star


3.1 Closed under operation

A collection $S$ of objects is closed under operation $f$ if applying $f$ to members of $S$ always returns an object still in $S$.

Regular languages are indeed closed under the regular operations (Union, Concatenation, Kleene star).

3.1.1 Regular Languages Closed Under Union

Proof
Let $A$ and $B$ be regular languages over the same alphabet $\Sigma$. Then there are DFAs $M_1=(Q_1,\Sigma,\delta_1,q_1,F_1)$ and $M_2=(Q_2,\Sigma,\delta_2,q_2,F_2)$ that accept $A$ and $B$, respectively.

We can define $M=(Q,\Sigma,\delta,q,F)$ where:

  • $Q=Q_1\times Q_2=\{(r_1,r_2):r_1\in Q_1\text{ and }r_2\in Q_2\}$
  • $q=(q_1,q_2)$
  • $F=\{(r_1,r_2):r_1\in F_1\text{ or }r_2\in F_2\}$
  • $\delta((r_1,r_2),a)=(\delta_1(r_1,a),\delta_2(r_2,a))$ for all $(r_1,r_2)\in Q$ and $a\in\Sigma$

Then:
$$M\text{ accepts }w\iff\delta^*((q_1,q_2),w)\in F\iff\delta_1^*(q_1,w)\in F_1\text{ or }\delta_2^*(q_2,w)\in F_2$$

So $L(M)=L(M_1)\cup L(M_2)$.
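The product construction translates almost literally into code. The sketch below is illustrative only: a DFA is represented as a tuple `(Q, delta, start, F)`, and the two example automata (`EVEN_ONES` and `ENDS_IN_0`) are hypothetical choices made for this example.

```python
from itertools import product


def product_union(dfa1, dfa2, alphabet):
    """Build the product DFA whose language is L(M1) ∪ L(M2)."""
    Q1, d1, s1, F1 = dfa1
    Q2, d2, s2, F2 = dfa2
    Q = set(product(Q1, Q2))
    # Both components advance in lockstep on the same input symbol.
    delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])
             for (r1, r2) in Q for a in alphabet}
    # Accept when at least one component is in an accepting state.
    F = {(r1, r2) for (r1, r2) in Q if r1 in F1 or r2 in F2}
    return Q, delta, (s1, s2), F


def accepts(dfa, w):
    _, delta, q, F = dfa
    for a in w:
        q = delta[(q, a)]
    return q in F


# Hypothetical M1: even number of 1s. Hypothetical M2: word ends in 0.
EVEN_ONES = ({"e", "o"},
             {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"},
             "e", {"e"})
ENDS_IN_0 = ({"n", "z"},
             {("n", "0"): "z", ("n", "1"): "n", ("z", "0"): "z", ("z", "1"): "n"},
             "n", {"z"})
UNION = product_union(EVEN_ONES, ENDS_IN_0, "01")

print(accepts(UNION, "10"), accepts(UNION, "1"))
```

Replacing `or` with `and` in the definition of `F` yields a DFA for the intersection instead.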

Proof from the perspective of NFA
Now let $M_1$ and $M_2$ be NFAs, and assume that $Q_1\cap Q_2=\varnothing$.
We can define $M=(Q,\Sigma,\delta,q_0,F)$ where:

  • $Q=\{q_0\}\cup Q_1\cup Q_2$, where $q_0$ is a new state
  • $q_0$ is the start state of $M$
  • $F=F_1\cup F_2$

Then, for $r\in Q$ and $a\in\Sigma_\epsilon$:
$$\delta(r,a)=\begin{cases}\delta_1(r,a)&\text{if }r\in Q_1\\\delta_2(r,a)&\text{if }r\in Q_2\\\{q_1,q_2\}&\text{if }r=q_0\text{ and }a=\epsilon\\\varnothing&\text{if }r=q_0\text{ and }a\ne\epsilon\end{cases}$$

In graph-theoretic terms, this amounts to adding a virtual node $q_0$ with $\epsilon$-edges to the start states of the two NFAs.

3.1.2 Regular Languages Closed Under Concatenation

The concatenation of $A_1$ and $A_2$ is defined as:
$$A_1A_2=\{ww^\prime:w\in A_1\text{ and }w^\prime\in A_2\}$$

Proof
Let $M_1=(Q_1,\Sigma,\delta_1,q_1,F_1)$ and $M_2=(Q_2,\Sigma,\delta_2,q_2,F_2)$ be NFAs that accept $A_1$ and $A_2$, respectively, with $Q_1\cap Q_2=\varnothing$. We can define $M=(Q,\Sigma,\delta,q,F)$ where:

  • $Q=Q_1\cup Q_2$
  • $q=q_1$
  • $F=F_2$

Then, for $r\in Q$ and $a\in\Sigma_\epsilon$:
$$\delta(r,a)=\begin{cases}\delta_1(r,a)&\text{if }r\in Q_1\text{ and }r\notin F_1\\\delta_1(r,a)&\text{if }r\in F_1\text{ and }a\ne\epsilon\\\delta_1(r,a)\cup\{q_2\}&\text{if }r\in F_1\text{ and }a=\epsilon\\\delta_2(r,a)&\text{if }r\in Q_2\end{cases}$$

In effect, every state in $F_1$ is connected to $q_2$ by an $\epsilon$-transition, so the two automata are joined end to end.

3.1.3 Regular Languages Closed Under Kleene star

Proof
Let $M_1=(Q_1,\Sigma,\delta_1,q_1,F_1)$ be an NFA that accepts $A$. We can define an NFA $M=(Q,\Sigma,\delta,q_0,F)$ that accepts $A^*$, where:

  • $Q=\{q_0\}\cup Q_1$, where $q_0$ is a new state
  • $q_0$ is the start state of $M$
  • $F=\{q_0\}\cup F_1$

Then, for $r\in Q$ and $a\in\Sigma_\epsilon$:
$$\delta(r,a)=\begin{cases}\delta_1(r,a)&\text{if }r\in Q_1\text{ and }r\notin F_1\\\delta_1(r,a)&\text{if }r\in F_1\text{ and }a\ne\epsilon\\\delta_1(r,a)\cup\{q_1\}&\text{if }r\in F_1\text{ and }a=\epsilon\\\{q_1\}&\text{if }r=q_0\text{ and }a=\epsilon\\\varnothing&\text{if }r=q_0\text{ and }a\ne\epsilon\end{cases}$$

Every state in $F_1$ is connected back to $q_1$ by an $\epsilon$-transition. Put plainly, this is recursion: after accepting one word of $A$, the machine can start over.

3.1.4 Regular Languages Closed Under Complement and Intersection

If $A$ is a regular language over the alphabet $\Sigma$, then the complement
$$\bar A=\{w\in\Sigma^*:w\notin A\}$$
is also a regular language.

If $A_1$ and $A_2$ are regular languages over the same alphabet $\Sigma$, then the intersection
$$A_1\cap A_2=\{w\in\Sigma^*:w\in A_1\text{ and }w\in A_2\}$$
is also a regular language.


3.2 Regular Expressions

Regular expressions are means to describe certain languages.

Let $\Sigma$ be a non-empty alphabet.

  1. $\epsilon$ is a regular expression
  2. $\varnothing$ is a regular expression
  3. For each $a\in\Sigma$, $a$ is a regular expression
  4. If $R_1$ and $R_2$ are regular expressions, then $R_1\cup R_2$, $R_1R_2$, and $R_1^*$ are regular expressions

If $R$ is a regular expression, then $L(R)$ is the language generated / described / defined by $R$.

  1. $\epsilon$ describes the language $\{\epsilon\}$
  2. $\varnothing$ describes the language $\varnothing$
  3. For each $a\in\Sigma$, the regular expression $a$ describes the language $\{a\}$
  4. If $R_1,R_2$ are regular expressions and $L_1,L_2$ are the languages described by them, respectively, then $R_1\cup R_2$ describes the language $L_1\cup L_2$, $R_1R_2$ describes $L_1L_2$, and $R_1^*$ describes $L_1^*$
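The inductive definition can be mirrored by a small recursive membership test, with one case per rule. This is a naive sketch for illustration (exponential in the worst case, and not how practical regex engines work); the tuple encoding of expressions and the name `matches` are assumptions of this example.

```python
def matches(r, w: str) -> bool:
    """Decide whether w is in L(r) by following the inductive definition."""
    kind = r[0]
    if kind == "eps":      # ε describes {ε}
        return w == ""
    if kind == "empty":    # ∅ describes the empty language
        return False
    if kind == "sym":      # a describes {a}
        return w == r[1]
    if kind == "union":    # L1 ∪ L2
        return matches(r[1], w) or matches(r[2], w)
    if kind == "concat":   # L1 L2: try every split point of w
        return any(matches(r[1], w[:i]) and matches(r[2], w[i:])
                   for i in range(len(w) + 1))
    if kind == "star":     # L1*: ε, or a non-empty prefix in L1 followed by L1*
        if w == "":
            return True
        return any(matches(r[1], w[:i]) and matches(r, w[i:])
                   for i in range(1, len(w) + 1))
    raise ValueError(f"unknown regular expression node: {kind}")


# Hypothetical example: (a ∪ b)* a b, i.e. words over {a, b} ending in "ab".
A, B = ("sym", "a"), ("sym", "b")
ENDS_IN_AB = ("concat", ("star", ("union", A, B)), ("concat", A, B))

print(matches(ENDS_IN_AB, "bbab"), matches(ENDS_IN_AB, "ba"))
```

In the star case the prefix is required to be non-empty, which guarantees that the recursion terminates.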

3.3 Kleene’s Theorem

Let $L$ be a language. Then $L$ is regular if and only if there exists a regular expression that describes $L$.

  • If a language is described by a regular expression, then it is regular.
  • If a language is regular, then it has a regular expression.

3.4 GNFA

A generalized nondeterministic finite automaton (GNFA) can be defined as a 5-tuple $(Q,\Sigma,\delta,\{s\},\{t\})$, where:

  • $Q$ is a finite set of states
  • $\Sigma$ is a finite alphabet
  • $\delta:(Q\setminus\{t\})\times(Q\setminus\{s\})\rightarrow R$ is the transition function, where $R$ here denotes the set of all regular expressions over $\Sigma$
  • $s\in Q$ is the start state
  • $t\in Q$ is the accept state

3.4.1 DFA to GNFA

Convert a DFA into a regular expression.

[Figure: converting a DFA to a GNFA]


3.5 Pumping Lemma for Regular Languages

A tool that can be used to prove that certain languages are not regular. This theorem states that all regular languages have a special property.

This property states that all strings in the language can be “pumped” if they are at least as long as a certain special value, called the pumping length. That means each such string contains a section that can be repeated any number of times, with the resulting string remaining in the language.

  • If a language $L$ is regular, it always satisfies the pumping lemma. Conversely, if some sufficiently long string in $L$ cannot be pumped, meaning that every admissible way of splitting it produces a pumped string that is not in $L$, then $L$ is not regular.
  • If the pumping lemma holds for $L$, it does not follow that $L$ is regular.
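For reference, here is a sketch of the standard formal statement of the pumping lemma as found in common textbooks, followed by a worked example. The example language $\{0^n1^n:n\ge0\}$ is an illustrative choice and is not taken from the notes above.

```latex
% Standard statement of the pumping lemma for regular languages.
If $L$ is regular, then there is an integer $p \ge 1$ (the pumping length)
such that every $s \in L$ with $|s| \ge p$ can be written as $s = xyz$ with
\begin{enumerate}
  \item $|y| \ge 1$,
  \item $|xy| \le p$,
  \item $xy^iz \in L$ for all $i \ge 0$.
\end{enumerate}

% Worked example (illustrative).
Claim: $L = \{0^n1^n : n \ge 0\}$ is not regular.

Suppose $L$ is regular with pumping length $p$, and take $s = 0^p1^p \in L$.
For any split $s = xyz$ with $|xy| \le p$ and $|y| \ge 1$, the block $xy$
lies inside the leading $0$s, so $y = 0^k$ for some $k \ge 1$. Then
\[
  xy^2z = 0^{p+k}1^p \notin L ,
\]
which contradicts condition 3. Hence $L$ is not regular.
```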


4. Context-Free Languages

4.1 CFG

Context-Free Grammar

A context-free grammar is a 4-tuple $G=(V,\Sigma,R,S)$, where

  • $V$ is a finite set, whose elements are called variables
  • $\Sigma$ is a finite set, whose elements are called terminals
  • $V\cap\Sigma=\varnothing$
  • $S$ is an element of $V$; it is called the start variable
  • $R$ is a finite set, whose elements are called rules. Each rule has the form $A\rightarrow w$, where $A\in V$ and $w\in(V\cup\Sigma)^*$

Definition 1: yield $\Rightarrow$
Let $G=(V,\Sigma,R,S)$ be a context-free grammar with

  • $A\in V$
  • $u,v,w\in(V\cup\Sigma)^*$
  • $A\rightarrow w$ is a rule of the grammar

The string $uwv$ can be derived in one step from the string $uAv$, written as $uAv\Rightarrow uwv$.

Definition 2: derive $\stackrel{*}{\Rightarrow}$
Let $G=(V,\Sigma,R,S)$ be a context-free grammar with

  • $u,v\in(V\cup\Sigma)^*$

The string $v$ can be derived from the string $u$, written as $u\stackrel{*}{\Rightarrow}v$, if one of the following conditions holds:

  • $u=v$
  • there exist an integer $k\geq2$ and a sequence $u_1,u_2,\cdots,u_k$ of strings in $(V\cup\Sigma)^*$, such that
    • $u=u_1$
    • $v=u_k$
    • $u_1\Rightarrow u_2\Rightarrow u_3\Rightarrow\cdots\Rightarrow u_k$

4.1.1 Language of CFG

The language of a CFG $G=(V,\Sigma,R,S)$ is
$$L(G)=\{w\in\Sigma^*\mid S\stackrel{*}{\Rightarrow}w\}$$
Such a language is called context-free, and satisfies $L(G)\subseteq\Sigma^*$.
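To see derivations in action, the sketch below enumerates the short words of $L(G)$ by breadth-first search over leftmost derivations. It is an illustration under assumptions of this example: every symbol is a single character, variables are the dictionary keys, and `max_form_len` is a heuristic cap on sentential-form length that guarantees termination but may miss words for grammars needing longer intermediate forms.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def generate(rules: Dict[str, List[str]], start: str, max_len: int,
             max_form_len: Optional[int] = None) -> Set[str]:
    """Collect terminal words of length <= max_len found by leftmost derivations."""
    variables = set(rules)
    if max_form_len is None:
        max_form_len = 2 * max_len + 2  # heuristic cap, an assumption of this sketch
    words: Set[str] = set()
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        # Position of the leftmost variable in the sentential form, if any.
        pos = next((i for i, c in enumerate(u) if c in variables), None)
        if pos is None:
            words.add(u)  # no variables left: u is a word of L(G)
            continue
        for rhs in rules[u[pos]]:
            v = u[:pos] + rhs + u[pos + 1:]  # one derivation step u => v
            terminals = sum(1 for c in v if c not in variables)
            # Terminals never disappear, so forms with too many can be pruned.
            if terminals <= max_len and len(v) <= max_form_len and v not in seen:
                seen.add(v)
                queue.append(v)
    return words


# Hypothetical grammar S -> 0S1 | ε (the empty string encodes ε).
print(sorted(generate({"S": ["0S1", ""]}, "S", 4)))
```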

Theorem
Let $\Sigma$ be an alphabet and let $L\subseteq\Sigma^*$ be a regular language. Then $L$ is a context-free language (every regular language is context-free).


4.2 CNF

Chomsky Normal Form

A context-free grammar $G=(V,\Sigma,R,S)$ is said to be in Chomsky normal form, if every rule in $R$ has one of the following three forms:

  • $A\rightarrow BC$, where $A,B,C$ are elements of $V$, $B\ne S$ and $C\ne S$
  • $A\rightarrow a$, where $A$ is an element of $V$ and $a$ is an element of $\Sigma$
  • $S\rightarrow\epsilon$, where $S$ is the start variable

Grammars in CNF are far easier to analyze.
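Because only three rule shapes are allowed, checking whether a grammar is in CNF is a simple case analysis. The sketch below is illustrative: rules are stored as a dictionary from each variable to a list of right-hand sides (tuples of symbols), any symbol that is not a key is treated as a terminal, and the example grammar is a hypothetical CNF grammar for $\{0^n1^n:n\ge0\}$.

```python
from typing import Dict, List, Tuple

Rules = Dict[str, List[Tuple[str, ...]]]


def is_cnf(rules: Rules, start: str) -> bool:
    """Check that every rule has the form A -> BC, A -> a, or S -> ε."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                if head != start:          # only the start variable may derive ε
                    return False
            elif len(body) == 1:
                if body[0] in variables:   # unit rule A -> B is not allowed
                    return False
            elif len(body) == 2:
                B, C = body
                if B not in variables or C not in variables:
                    return False           # both symbols must be variables
                if B == start or C == start:
                    return False           # S may not appear on a right-hand side
            else:
                return False               # more than two symbols
    return True


# Hypothetical CNF grammar for {0^n 1^n : n >= 0}.
CNF_GRAMMAR: Rules = {
    "S": [(), ("A", "B"), ("A", "T")],
    "T": [("X", "B")],
    "X": [("A", "B"), ("A", "T")],
    "A": [("0",)],
    "B": [("1",)],
}

print(is_cnf(CNF_GRAMMAR, "S"))
print(is_cnf({"S": [("0", "S", "1"), ()]}, "S"))
```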

Theorem
Let $\Sigma$ be an alphabet and let $L\subseteq\Sigma^*$ be a CFL. There exists a CFG in CNF whose language is $L$. That is, every CFL can be described by a CFG in CNF.

4.2.1 Converting CFG into CNF

  1. Eliminate the start variable from the right-hand side of the rules.
    • New start variable $S_0$
    • New rule $S_0\rightarrow S$
  2. Remove $\epsilon$-rules $A\rightarrow\epsilon$, where $A\in V-\{S\}$. When removing $A\rightarrow\epsilon$ rules, insert all new replacements
    • Before: $B\rightarrow AbA$ and $A\rightarrow\epsilon\mid\cdots$
    • After: $B\rightarrow AbA\mid bA\mid Ab\mid b$ and $A\rightarrow\cdots$
  3. Remove unit rules $A\rightarrow B$, where $A,B\in V$
    • Before: $A\rightarrow B$ and $B\rightarrow xCy$
    • After: $A\rightarrow xCy$ and $B\rightarrow xCy$
  4. Eliminate all rules having more than two symbols on the right-hand side.
    • Before: $A\rightarrow B_1B_2B_3$
    • After: $A\rightarrow B_1A_1$, $A_1\rightarrow B_2B_3$
  5. Eliminate all rules of the form $A\rightarrow ab$, where $a$ and $b$ are not both variables.
    • Before: $A\rightarrow ab$
    • After: $A\rightarrow B_1B_2$, $B_1\rightarrow a$, $B_2\rightarrow b$

4.3 PDA

An NFA is a PDA without a stack.

Pushdown Automata
The class of languages that can be accepted by pushdown automata (PDAs) is exactly the class of context-free languages, just as finite automata accept exactly the regular languages.

  • The input for a pushdown automaton is a string $w$ in $\Sigma^*$
  • Unlike finite automata, PDAs have a single stack.
  • The stack has 2 operations:
    • push: adds an item to the top of the stack
    • pop: removes an item from the top of the stack


  • Tape: divided into cells that store symbols belonging to $\Sigma_\epsilon=\Sigma\cup\{\epsilon\}$
  • Tape head: moves along the tape, one cell to the right per move.
  • Stack: contains symbols from a finite set $\Gamma$, called the stack alphabet. This set contains a special symbol $ (often used to mark the bottom of the stack).
  • State control: can be in any one of a finite number of states. The set of states is denoted by $Q$. The set $Q$ contains one special state $q$, called the start state.

PDA Transition
If the PDA

  • is in state $q_i$
  • reads $a\in\Sigma_\epsilon$
  • pops $b\in\Gamma_\epsilon$ off the stack

If $a=\epsilon$, then no input symbol is read.
If $b=\epsilon$, then nothing is popped off the stack.

Then the PDA

  • moves to state $q_j$
  • pushes $c\in\Gamma_\epsilon$ onto the top of the stack

If $c=\epsilon$, then nothing is pushed onto the stack.
If $c=u_1u_2\cdots u_k$ with $k\geq1$ and $u_1,u_2,\cdots,u_k\in\Gamma$, then $b$ is replaced by $c$, and $u_k$ becomes the new top symbol of the stack.

A pushdown automaton is a 6-tuple $M=(Q,\Sigma,\Gamma,\delta,q,F)$, where

  • $Q$ is a finite set of states
  • $\Sigma$ is the (finite) input (tape) alphabet
  • $\Gamma$ is the (finite) stack alphabet
  • $\delta$ is the transition function: $Q\times\Sigma_\epsilon\times\Gamma\rightarrow Q\times\{N,R\}\times\Gamma_\epsilon^*$
  • $q\in Q$ is the start state
  • $F\subseteq Q$ is the set of accept states

A transition has the form $\delta(r,a,b)=(r^\prime,\sigma,c)$, with $r^\prime\in Q$, $\sigma\in\{N,R\}$, and $c\in\Gamma^*$: in state $r$, reading $a$ with $b$ on top of the stack, the PDA moves to state $r^\prime$ and replaces $b$ by $c$.

The tape head moves according to $\sigma$:

  • If $\sigma=R$, it moves one cell to the right
  • If $\sigma=N$, it does not move

4.3.1 Nondeterministic PDA

The PDA transition function allows for nondeterminism:
$$\delta:Q\times\Sigma_\epsilon\times\Gamma_\epsilon\rightarrow P(Q\times\Gamma_\epsilon)$$

4.3.2 Language accepted by PDA

The set of all input strings that are accepted by a PDA $M$ is the language recognized by $M$ and is denoted by $L(M)$.
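A nondeterministic PDA with $\delta:Q\times\Sigma_\epsilon\times\Gamma_\epsilon\rightarrow P(Q\times\Gamma_\epsilon)$ can be simulated by searching over configurations (state, input position, stack). The sketch below is an illustration under assumptions of this example: acceptance means reaching an accept state after the whole input has been read, $\epsilon$ is encoded as the empty string, `max_configs` is a safeguard against PDAs that push forever on $\epsilon$-moves (hitting it returns `False` without proof), and the transition table is a hypothetical PDA for $\{0^n1^n:n\ge0\}$.

```python
from collections import deque

EPS = ""  # assumption of this sketch: the empty string encodes ε


def pda_accepts(delta, start, accepting, w, max_configs=10_000):
    """Breadth-first search over configurations (state, input position, stack)."""
    start_cfg = (start, 0, ())
    seen = {start_cfg}
    queue = deque([start_cfg])
    while queue and len(seen) <= max_configs:
        state, i, stack = queue.popleft()
        if i == len(w) and state in accepting:
            return True
        reads = [EPS] + ([w[i]] if i < len(w) else [])   # read nothing or one symbol
        pops = [EPS] + ([stack[-1]] if stack else [])    # pop nothing or the top
        for a in reads:
            for b in pops:
                for (r, c) in delta.get((state, a, b), ()):
                    new_stack = stack[:-1] if b != EPS else stack
                    if c != EPS:
                        new_stack = new_stack + (c,)     # push c on top
                    cfg = (r, i + len(a), new_stack)
                    if cfg not in seen:
                        seen.add(cfg)
                        queue.append(cfg)
    return False


# Hypothetical PDA for {0^n 1^n : n >= 0}; "$" marks the bottom of the stack.
PDA = {
    ("q1", EPS, EPS): {("q2", "$")},   # push the bottom marker
    ("q2", "0", EPS): {("q2", "0")},   # push one 0 per input 0
    ("q2", "1", "0"): {("q3", EPS)},   # first 1: start popping 0s
    ("q3", "1", "0"): {("q3", EPS)},   # pop one 0 per input 1
    ("q3", EPS, "$"): {("q4", EPS)},   # stack back to the marker: accept
}

print(pda_accepts(PDA, "q1", {"q1", "q4"}, "0011"))
print(pda_accepts(PDA, "q1", {"q1", "q4"}, "001"))
```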


4.4 Equivalence of PDA and context-free languages

Let $\Sigma$ be an alphabet and let $A\subseteq\Sigma^*$ be a language. Then $A$ is context-free if and only if there exists a pushdown automaton that accepts $A$.

  • If $A=L(G)$ for some CFG $G$, then $A=L(M)$ for some PDA $M$.
  • If $A=L(M)$ for some PDA $M$, then $A=L(G)$ for some CFG $G$.

Proof: If $A=L(G)$ for some CFG $G$, then $A=L(M)$ for some PDA $M$.

Basic idea: Given a CFG $G$, convert it into a PDA $M$ with $L(M)=L(G)$ by building a PDA that simulates a leftmost derivation.

However, with the transition function $\delta:Q\times\Sigma_\epsilon\times\Gamma_\epsilon\rightarrow P(Q\times\Gamma_\epsilon)$, a PDA can push at most one symbol per transition, not a whole string. How can we solve this problem? The usual workaround is to add intermediate states that push the string one symbol at a time.

4.4.1 CFLs and regular languages

If $A$ is a regular language, then $A$ is also a CFL.


4.5 The pumping lemma for context-free languages

4.5.1 Pumping Lemma for CFLs

Let $L$ be a context-free language. Then there exists an integer $p\ge1$, called the pumping length, such that the following holds: every string $s$ in $L$ with $|s|\ge p$ can be written as $s=uvxyz$, such that

  • $|vy|\ge1$
  • $|vxy|\le p$
  • $uv^ixy^iz\in L$, for all $i\ge0$.
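As with regular languages, the lemma is typically used in a proof by contradiction. The following worked example is a sketch of the standard argument; the language $\{a^nb^nc^n:n\ge0\}$ is an illustrative choice and does not come from the notes above.

```latex
Claim: $L = \{a^nb^nc^n : n \ge 0\}$ is not context-free.

Suppose $L$ is context-free with pumping length $p$, and take
$s = a^pb^pc^p \in L$. Let $s = uvxyz$ be any split with $|vy| \ge 1$ and
$|vxy| \le p$.

Since $|vxy| \le p$ and the $a$s and $c$s are separated by $p$ copies of $b$,
the substring $vxy$ cannot contain both an $a$ and a $c$. So $v$ and $y$
together contain at least one letter, but miss at least one of $a$, $b$, $c$
entirely.

Pumping up with $i = 2$ gives $uv^2xy^2z$, in which the count of some letter
has increased while the count of a missing letter is still $p$. Hence
\[
  uv^2xy^2z \notin L ,
\]
contradicting the third condition. Therefore $L$ is not context-free.
```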

Split String Using Parse Tree

  • More generally, consider a “long” string $s\in A$.
  • The parse tree is “tall”, so there exists a repeated variable $R$ on some path from the root $S$ to a leaf.
  • Split the string $s=uvxyz$ into 5 pieces based on the repeated variable $R$:
    • $u$ is before the $R$-$R$ subtree (in depth-first order)
    • $v$ is before the second $R$ subtree within the $R$-$R$ subtree
    • $x$ is what the second $R$ eventually becomes
    • $y$ is after the second $R$ within the $R$-$R$ subtree
    • $z$ is after the $R$-$R$ subtree