Introduction
Given a model $\lambda = (A, B, \pi)$, computing the probability $P(O \mid \lambda)$ of an observation sequence $O = (o_1, o_2, \cdots, o_T)$ is one of the basic problems that the hidden Markov model (HMM) can solve.
1. Direct Computation
The most intuitive way to solve this problem is to compute $P(O \mid \lambda)$ directly from the probability formulas: first enumerate every possible state sequence $S = (s_1, s_2, \cdots, s_T)$, then compute the joint probability $P(O, S \mid \lambda)$ that each candidate state sequence occurs together with the observation sequence, and finally sum these joint probabilities over all state sequences.
For a sequence of length $T$, the state at each step can take any of $N$ values, so there are $\prod_{1}^{T} C_N^1 = N^T$ possible state sequences. By the observation independence assumption, the probability that a given state sequence produces the observations is $P(O \mid S, \lambda) = \prod_{t=1}^{T} P(o_t \mid s_t)$, and each joint probability $P(O, S \mid \lambda)$ takes $O(T)$ multiplications, so summing over all $N^T$ sequences to obtain $P(O \mid \lambda)$ costs $O(TN^T)$ time, which is unacceptable in engineering practice.
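The brute-force enumeration described above can be sketched in a few lines. This is a minimal illustration, not part of the original text; the 2-state, 2-symbol parameter values below are hypothetical toy numbers chosen only to make the sketch runnable:

```python
import itertools

import numpy as np

# A hypothetical 2-state, 2-symbol HMM; all parameter values are made-up
# toy numbers, used only for illustration.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])   # A[i, j] = P(s_{t+1} = q_j | s_t = q_i)
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # B[i, k] = b_i(o = k)
pi = np.array([0.6, 0.4])    # initial state distribution

def brute_force_likelihood(O, A, B, pi):
    """Sum P(O, S | lambda) over all N^T state sequences S."""
    N, T = A.shape[0], len(O)
    total = 0.0
    for S in itertools.product(range(N), repeat=T):  # N^T sequences
        p = pi[S[0]] * B[S[0], O[0]]                 # pi_{s_1} * b_{s_1}(o_1)
        for t in range(1, T):
            p *= A[S[t - 1], S[t]] * B[S[t], O[t]]   # a_{s_{t-1} s_t} * b_{s_t}(o_t)
        total += p
    return total

print(brute_force_likelihood([0, 1, 0], A, B, pi))
```

The inner loop makes the $O(TN^T)$ cost visible: the loop body runs $T$ times for each of the $N^T$ sequences.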
This motivated the improved schemes based on dynamic programming: the forward algorithm and the backward algorithm.
2. Forward Algorithm
Given an HMM $\lambda$, define the forward probability as the probability of being in state $q_i$ at time $t$ and having observed the partial sequence $(o_1, \cdots, o_t)$ up to time $t$, written:

$$\alpha_t(q_i) = P(s_t = q_i, o_1, \cdots, o_t \mid \lambda) \tag{2.1}$$
By the conditional probability formula, $P(s_t, o_1, \cdots, o_t \mid \lambda) = P(s_t, o_1, \cdots, o_t, \lambda)/P(\lambda)$. Since the model parameters are given and fixed, $\lambda$ can be treated as a certain event with $P(\lambda) = 1$, so Eq. (2.1) may be abbreviated as:

$$\alpha_t(q_i) = P(s_t = q_i, o_1, \cdots, o_t) \tag{2.2}$$
The core of dynamic programming is the state-transition equation. For the forward algorithm, this equation must express the probability at time $t$ as the probability at time $t-1$ multiplied by some factor, which we write tentatively as:
$$P(s_t, o_1, \cdots, o_t) = \mathrm{Fun}(\cdot)\, P(s_{t-1}, o_1, \cdots, o_{t-1}) \tag{2.3}$$
Comparing the two sides of Eq. (2.3), the probabilities at adjacent times differ only in the state variable ($s_t$ versus $s_{t-1}$) and in the extra known observation $o_t$. Using the marginalization property, we can rewrite the probability at time $t$, $P(s_t, o_1, \cdots, o_t)$, as:
$$P(s_t, o_1, \cdots, o_t) = \sum_{j=1}^{N} P(s_{t-1} = q_j, s_t, o_1, \cdots, o_{t-1}, o_t) \tag{2.4}$$
By the chain rule of probability (the generalization of the conditional probability formula), a joint probability can be decomposed step by step into a product of conditional probabilities, so the joint probability in Eq. (2.4) factors as:
$$P(s_{t-1} = q_j, s_t, o_1, \cdots, o_{t-1}, o_t) = P(s_{t-1} = q_j, o_1, \cdots, o_{t-1})\, P(s_t \mid s_{t-1} = q_j, o_1, \cdots, o_{t-1})\, P(o_t \mid s_{t-1} = q_j, s_t, o_1, \cdots, o_{t-1})$$
By the homogeneous Markov assumption of the HMM, $P(s_t \mid s_{t-1} = q_j, o_1, \cdots, o_{t-1}) = P(s_t \mid s_{t-1} = q_j)$; and by the observation independence assumption, $P(o_t \mid s_{t-1} = q_j, s_t, o_1, \cdots, o_{t-1}) = P(o_t \mid s_t)$. Therefore Eq. (2.4) simplifies to:
$$P(s_t, o_1, \cdots, o_t) = \sum_{j=1}^{N} P(s_{t-1} = q_j, o_1, \cdots, o_{t-1})\, P(s_t \mid s_{t-1} = q_j)\, P(o_t \mid s_t) \tag{2.5}$$
Substituting the model parameters, the forward probability can be expressed as:
$$\alpha_t(q_i) = \left[ \sum_{j=1}^{N} \alpha_{t-1}(q_j)\, a_{ji} \right] b_i(o_t), \qquad t = 2, 3, \cdots, T \tag{2.6}$$
Therefore, given the HMM $\lambda$, the probability of the observation sequence is:

$$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(q_i), \qquad \alpha_1(q_i) = \pi_i\, b_i(o_1) \tag{2.7}$$
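Equations (2.6) and (2.7) translate directly into a short NumPy sketch. The 2-state, 2-symbol parameters below are hypothetical toy values, used only to make the example runnable:

```python
import numpy as np

# Hypothetical toy parameters (2 states, 2 observation symbols).
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])   # A[i, j] = a_{ij}
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # B[i, k] = b_i(o = k)
pi = np.array([0.6, 0.4])

def forward(O, A, B, pi):
    """Return the (T, N) table alpha with alpha[t, i] = P(s_t = q_i, o_1..o_t)."""
    T, N = len(O), A.shape[0]
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                    # alpha_1(q_i) = pi_i b_i(o_1), Eq. (2.7)
    for t in range(1, T):
        # alpha_t(q_i) = [sum_j alpha_{t-1}(q_j) a_{ji}] b_i(o_t), Eq. (2.6)
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
    return alpha

O = [0, 1, 0]
alpha = forward(O, A, B, pi)
print(alpha[-1].sum())   # P(O | lambda) = sum_i alpha_T(q_i)
```

Each time step costs $O(N^2)$, so the whole table costs $O(TN^2)$ rather than the $O(TN^T)$ of direct enumeration.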
3. Backward Algorithm
Given an HMM $\lambda$, define the backward probability as the probability of the partial observation sequence $(o_{t+1}, \cdots, o_T)$ from time $t+1$ to $T$, conditioned on being in state $q_i$ at time $t$:

$$\beta_t(q_i) = P(o_{t+1}, \cdots, o_T \mid s_t = q_i, \lambda) \tag{3.1}$$
By the same reasoning as in the forward algorithm, we first drop the conditioning on $\lambda$:

$$\beta_t(q_i) = P(o_{t+1}, \cdots, o_T \mid s_t = q_i) \tag{3.2}$$

The state-transition equation should relate adjacent times, so we write it tentatively as:

$$P(o_{t+1}, \cdots, o_T \mid s_t = q_i) = \mathrm{Fun}(\cdot)\, P(o_{t+2}, \cdots, o_T \mid s_{t+1} = q_j) \tag{3.3}$$

Marginalizing over the state at time $t+1$:

$$P(o_{t+1}, \cdots, o_T \mid s_t = q_i) = \sum_{j=1}^{N} P(s_{t+1} = q_j, o_{t+1}, o_{t+2}, \cdots, o_T \mid s_t = q_i) \tag{3.4}$$

Applying the chain rule, then the homogeneous Markov and observation independence assumptions:

$$\begin{cases} P(s_{t+1}, o_{t+1}, o_{t+2}, \cdots, o_T \mid s_t) = P(o_{t+2}, \cdots, o_T \mid s_t, s_{t+1}, o_{t+1})\, P(o_{t+1} \mid s_t, s_{t+1})\, P(s_{t+1} \mid s_t) \\ P(o_{t+2}, \cdots, o_T \mid s_t, s_{t+1}, o_{t+1}) = P(o_{t+2}, \cdots, o_T \mid s_{t+1}) \\ P(o_{t+1} \mid s_t, s_{t+1}) = P(o_{t+1} \mid s_{t+1}) \end{cases} \tag{3.5}$$

Substituting (3.5) into (3.4):

$$\beta_t(q_i) = \sum_{j=1}^{N} P(o_{t+2}, \cdots, o_T \mid s_{t+1} = q_j)\, P(o_{t+1} \mid s_{t+1} = q_j)\, P(s_{t+1} = q_j \mid s_t = q_i) \tag{3.6}$$

Written with the model parameters:

$$\beta_t(q_i) = \sum_{j=1}^{N} \beta_{t+1}(q_j)\, b_j(o_{t+1})\, a_{ij} \tag{3.7}$$

Finally, the probability of the observation sequence is:

$$P(O \mid \lambda) = \sum_{i=1}^{N} \pi_i\, b_i(o_1)\, \beta_1(q_i), \qquad \beta_T(q_i) = 1 \tag{3.8}$$
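The backward recursion (3.7) with termination (3.8) can be sketched the same way. The parameters are hypothetical toy values, for illustration only:

```python
import numpy as np

# Hypothetical 2-state, 2-symbol toy parameters, for illustration only.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = np.array([0.6, 0.4])

def backward(O, A, B):
    """Return the (T, N) table beta with beta[t, i] = P(o_{t+1}..o_T | s_t = q_i)."""
    T, N = len(O), A.shape[0]
    beta = np.ones((T, N))                         # beta_T(q_i) = 1, Eq. (3.8)
    for t in range(T - 2, -1, -1):
        # beta_t(q_i) = sum_j a_{ij} b_j(o_{t+1}) beta_{t+1}(q_j), Eq. (3.7)
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])
    return beta

O = [0, 1, 0]
beta = backward(O, A, B)
print((pi * B[:, O[0]] * beta[0]).sum())   # P(O | lambda), Eq. (3.8)
```

The forward and backward recursions must yield the same likelihood $P(O \mid \lambda)$, which makes a convenient consistency check for implementations.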
4. Computing State Probabilities at Intermediate Times
From the forward and backward algorithms we can derive $\gamma_t(q_i)$, the probability that the sequence is in state $q_i$ at some intermediate time $t$, and $\xi_t(q_i, q_j)$, the probability that two adjacent intermediate times are in states $q_i$ and $q_j$ respectively; this is known as the forward/backward (F/B) algorithm.
4.1 Probability of the State at a Single Time
Let $\gamma_t(q_i)$ denote the probability of being in state $q_i$ at time $t$ given the full observation sequence:

$$\gamma_t(q_i) = P(s_t = q_i \mid O, \lambda) \tag{4.1.1}$$
$$\gamma_t(q_i) = \frac{P(s_t = q_i, O, \lambda)}{P(O \mid \lambda)} \tag{4.1.2}$$

$$\gamma_t(q_i) = \frac{P(o_{t+1}, \cdots, o_T \mid s_t = q_i, o_1, \cdots, o_t)\, P(s_t = q_i, o_1, \cdots, o_t)}{\sum_{i=1}^{N} P(s_t = q_i, O)} \tag{4.1.3}$$

$$\because\; P(o_{t+1}, \cdots, o_T \mid s_t = q_i, o_1, \cdots, o_t) = P(o_{t+1}, \cdots, o_T \mid s_t = q_i) \tag{4.1.4}$$

$$\therefore\; \gamma_t(q_i) = \frac{P(o_{t+1}, \cdots, o_T \mid s_t = q_i)\, P(s_t = q_i, o_1, \cdots, o_t)}{\sum_{i=1}^{N} P(s_t = q_i, O)} \tag{4.1.5}$$

$$\gamma_t(q_i) = \frac{P(o_{t+1}, \cdots, o_T \mid s_t = q_i)\, P(s_t = q_i, o_1, \cdots, o_t)}{\sum_{i=1}^{N} P(o_{t+1}, \cdots, o_T \mid s_t = q_i)\, P(s_t = q_i, o_1, \cdots, o_t)} \tag{4.1.6}$$

$$\gamma_t(q_i) = \frac{\alpha_t(q_i)\, \beta_t(q_i)}{\sum_{i=1}^{N} \alpha_t(q_i)\, \beta_t(q_i)} \tag{4.1.7}$$
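Combining the two tables per Eq. (4.1.7) gives the posterior state probabilities. A minimal self-contained sketch with hypothetical toy parameters (the alpha/beta recursions are restated inline):

```python
import numpy as np

# Hypothetical toy parameters; alpha and beta are recomputed here so the
# sketch is self-contained.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = np.array([0.6, 0.4])
O = [0, 1, 0]
T, N = len(O), A.shape[0]

alpha = np.zeros((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]     # forward recursion, Eq. (2.6)

beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])   # backward recursion, Eq. (3.7)

# gamma_t(q_i) = alpha_t(q_i) beta_t(q_i) / sum_i alpha_t(q_i) beta_t(q_i), Eq. (4.1.7)
gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)
print(gamma)
```

Each row of `gamma` is a distribution over the $N$ states at that time, so every row sums to 1.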
4.2 Probability of the States at Two Adjacent Times
Let $\xi_t(q_i, q_j)$ denote the probability of being in state $q_i$ at time $t$ and state $q_j$ at time $t+1$ given the full observation sequence:

$$\xi_t(q_i, q_j) = P(s_t = q_i, s_{t+1} = q_j \mid O, \lambda) \tag{4.2.1}$$
$$\xi_t(q_i, q_j) = \frac{P(s_t = q_i, s_{t+1} = q_j, O, \lambda)}{P(O \mid \lambda)} \tag{4.2.2}$$

$$\xi_t(q_i, q_j) = \frac{P(s_t = q_i, s_{t+1} = q_j, O)}{\sum_{i=1}^{N} \sum_{j=1}^{N} P(s_t = q_i, s_{t+1} = q_j, O)} \tag{4.2.3}$$
$$\because\; \begin{cases} P(s_t = q_i, s_{t+1} = q_j, O) = P(s_t = q_i, o_1, \cdots, o_t)\, P(s_{t+1} = q_j \mid s_t = q_i)\, P(o_{t+1} \mid s_{t+1} = q_j)\, P(o_{t+2}, \cdots, o_T \mid s_{t+1} = q_j) \\ P(s_t = q_i, o_1, \cdots, o_t) = \alpha_t(q_i) \\ P(s_{t+1} = q_j \mid s_t = q_i) = a_{ij} \\ P(o_{t+1} \mid s_{t+1} = q_j) = b_j(o_{t+1}) \\ P(o_{t+2}, \cdots, o_T \mid s_{t+1} = q_j) = \beta_{t+1}(q_j) \end{cases} \tag{4.2.4}$$
$$\therefore\; \xi_t(q_i, q_j) = \frac{\alpha_t(q_i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(q_j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(q_i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(q_j)} \tag{4.2.5}$$
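Eq. (4.2.5) can likewise be sketched in NumPy. This is a self-contained illustration with hypothetical toy parameters; the alpha/beta recursions are recomputed inline:

```python
import numpy as np

# Hypothetical toy parameters; alpha and beta are recomputed so the sketch
# is self-contained.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = np.array([0.6, 0.4])
O = [0, 1, 0]
T, N = len(O), A.shape[0]

alpha = np.zeros((T, N))
alpha[0] = pi * B[:, O[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
beta = np.ones((T, N))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])

# Numerator of Eq. (4.2.5): alpha_t(q_i) a_{ij} b_j(o_{t+1}) beta_{t+1}(q_j)
xi = np.zeros((T - 1, N, N))
for t in range(T - 1):
    num = alpha[t][:, None] * A * (B[:, O[t + 1]] * beta[t + 1])[None, :]
    xi[t] = num / num.sum()
print(xi)
```

Summing $\xi_t(q_i, q_j)$ over $q_j$ recovers $\gamma_t(q_i)$, which is a useful sanity check and is exactly the relation exploited by Baum-Welch re-estimation.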