Preface
In the previous post, HMM高斯混合模型参数估计, we derived the Learning problem mathematically. But because of the inherent complexity of hidden Markov models, even with the iterative update formulas in hand, how to actually compute the probabilities on the right-hand side of those equations is still an open question.
Mathematical background: 【概率论与数理统计知识复习-哔哩哔哩】 (a probability and statistics review on Bilibili)
Mathematical derivation
Evaluation:
This problem asks us to compute $P(X|\theta)$.
When we solved the Learning problem earlier, we already derived

$$P(X|\theta)=\sum_{Z}\pi_{z_1}\prod_{i=1}^{T-1}a_{(z_i,z_{i+1})}\prod_{j=1}^{T}b_{(z_j,x_j)}$$
The sum ranges over all $n^T$ possible hidden-state sequences, so this direct solution has complexity exponential in $T$. We therefore introduce two algorithms with much lower complexity: the forward algorithm and the backward algorithm.
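To make the exponential cost concrete, the naive sum can be evaluated by enumerating every hidden path. A minimal sketch with made-up toy parameters (`pi`, `A`, `B`, and the observation sequence are hypothetical, not from this post):

```python
import numpy as np
from itertools import product

# Hypothetical toy HMM: n = 2 hidden states, 2 observation symbols.
pi = np.array([0.6, 0.4])          # initial distribution pi_i
A = np.array([[0.7, 0.3],          # A[i, j] = transition probability q_i -> q_j
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],          # B[i, k] = emission probability b_(q_i, v_k)
              [0.2, 0.8]])

def likelihood_bruteforce(pi, A, B, x):
    """P(X|theta) by summing over all n^T hidden paths -- exponential in T."""
    n, T = len(pi), len(x)
    total = 0.0
    for z in product(range(n), repeat=T):
        p = pi[z[0]] * B[z[0], x[0]]
        for t in range(1, T):
            p *= A[z[t - 1], z[t]] * B[z[t], x[t]]
        total += p
    return total

print(likelihood_bruteforce(pi, A, B, [0, 1, 0]))
```

Even this tiny example visits $2^3$ paths; a realistic $n$ and $T$ make the loop hopeless, which is exactly why the recurrences below matter.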
Forward algorithm (a recurrence-based algorithm)
$$\begin{aligned} P(X|\theta)&=P(x_1,x_2,\cdots,x_T|\theta) \\&=\sum_{z_T}P(x_1,x_2,\cdots,x_T,z_T|\theta) \\&=\sum_{i=1}^{n}P(x_1,x_2,\cdots,x_T,z_T=q_i|\theta) \end{aligned}$$
Let $\alpha_{(z_t=q_i)}=P(x_1,x_2,\cdots,x_t,z_t=q_i|\theta)$. Note that this is exactly the quantity that appeared in the Learning problem above.
To find its recurrence, consider $\alpha_{(z_{t+1}=q_i)}$:
$$\begin{aligned} \alpha_{(z_{t+1}=q_i)}&=P(x_1,x_2,\cdots,x_t,x_{t+1},z_{t+1}=q_i|\theta) \\&=P(x_{t+1}|x_1,\cdots,x_t,z_{t+1}=q_i,\theta)\,P(x_1,\cdots,x_t,z_{t+1}=q_i|\theta) \\&=P(x_{t+1}|z_{t+1}=q_i)\,P(x_1,\cdots,x_t,z_{t+1}=q_i|\theta) \\&=b_{(z_{t+1}=q_i,x_{t+1})}\sum_{z_t}P(x_1,\cdots,x_t,z_t,z_{t+1}=q_i|\theta) \\&=b_{(z_{t+1}=q_i,x_{t+1})}\sum_{j=1}^{n}P(x_1,\cdots,x_t,z_t=q_j,z_{t+1}=q_i|\theta) \\&=b_{(z_{t+1}=q_i,x_{t+1})}\sum_{j=1}^{n}P(z_{t+1}=q_i|x_1,\cdots,x_t,z_t=q_j,\theta)\,P(x_1,\cdots,x_t,z_t=q_j|\theta) \\&=\sum_{j=1}^{n}b_{(z_{t+1}=q_i,x_{t+1})}\,a_{(z_t=q_j,z_{t+1}=q_i)}\,\alpha_{(z_t=q_j)} \end{aligned}$$

The third line uses the observation independence assumption ($x_{t+1}$ depends only on $z_{t+1}$), and the sixth uses the homogeneous Markov assumption ($z_{t+1}$ depends only on $z_t$).
Be careful to distinguish $\alpha$ from $a$ here.
We have thus found the relation between $\alpha_{(z_t=q_i)}$ and $\alpha_{(z_{t+1}=q_i)}$, so $P(X|\theta)$ can be computed by running the recurrence forward in $t$ and summing at the end:

$$P(X|\theta)=\sum_{i=1}^{n}\alpha_{(z_T=q_i)}$$
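The recurrence translates directly into code. A minimal sketch, where the names are assumptions: `pi` holds $\pi_i$, `A[j, i]` holds $a_{(z_t=q_j,z_{t+1}=q_i)}$, `B[i, k]` holds $b_{(q_i,v_k)}$, and `x` is a list of observation-symbol indices:

```python
import numpy as np

def forward(pi, A, B, x):
    """alpha[t, i] = P(x_1, ..., x_t, z_t = q_i | theta)."""
    n, T = len(pi), len(x)
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, x[0]]                    # base case at t = 1
    for t in range(T - 1):
        # alpha_(z_{t+1}=q_i) = b_(q_i, x_{t+1}) * sum_j a_(q_j, q_i) * alpha_(z_t=q_j)
        alpha[t + 1] = B[:, x[t + 1]] * (alpha[t] @ A)
    return alpha

def likelihood_forward(pi, A, B, x):
    """P(X|theta) = sum_i alpha_(z_T = q_i)."""
    return forward(pi, A, B, x)[-1].sum()
```

Each step of the recurrence costs $O(n^2)$, so the whole evaluation is $O(Tn^2)$ rather than $O(n^T)$.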
Backward algorithm (a recurrence-based algorithm)
$$\begin{aligned} P(X|\theta)&=P(x_1,x_2,\cdots,x_T|\theta) \\&=\sum_{z_1}P(x_1,x_2,\cdots,x_T,z_1|\theta) \\&=\sum_{z_1}P(x_1|x_2,\cdots,x_T,z_1,\theta)\,P(x_2,\cdots,x_T,z_1|\theta) \\&=\sum_{z_1}b_{(z_1,x_1)}P(x_2,\cdots,x_T,z_1|\theta) \\&=\sum_{z_1}b_{(z_1,x_1)}P(x_2,\cdots,x_T|z_1,\theta)\,P(z_1|\theta) \\&=\sum_{i=1}^{n}b_{(z_1=q_i,x_1)}P(x_2,\cdots,x_T|z_1=q_i,\theta)\,\pi_i \end{aligned}$$
Let $\beta_{(z_t=q_i)}=P(x_{t+1},\cdots,x_T|z_t=q_i,\theta)$.
To find its recurrence, consider $\beta_{(z_{t-1}=q_i)}$:
$$\begin{aligned} \beta_{(z_{t-1}=q_i)}&=P(x_t,\cdots,x_T|z_{t-1}=q_i,\theta) \\&=\sum_{z_t}P(x_t,\cdots,x_T,z_t|z_{t-1}=q_i,\theta) \\&=\sum_{z_t}P(x_t,\cdots,x_T|z_t,z_{t-1}=q_i,\theta)\,P(z_t|z_{t-1}=q_i,\theta) \\&=\sum_{z_t}P(x_t,\cdots,x_T|z_t,z_{t-1}=q_i,\theta)\,a_{(z_{t-1}=q_i,z_t)} \\&=\sum_{z_t}P(x_t,\cdots,x_T|z_t,\theta)\,a_{(z_{t-1}=q_i,z_t)} \\&=\sum_{z_t}P(x_t|x_{t+1},\cdots,x_T,z_t,\theta)\,P(x_{t+1},\cdots,x_T|z_t,\theta)\,a_{(z_{t-1}=q_i,z_t)} \\&=\sum_{z_t}P(x_t|z_t)\,P(x_{t+1},\cdots,x_T|z_t,\theta)\,a_{(z_{t-1}=q_i,z_t)} \\&=\sum_{j=1}^{n}b_{(z_t=q_j,x_t)}\,\beta_{(z_t=q_j)}\,a_{(z_{t-1}=q_i,z_t=q_j)} \end{aligned}$$
This is again a recurrence, in the same spirit as the forward algorithm, except that it runs backwards in time from $t=T$ to $t=1$, with the base case $\beta_{(z_T=q_i)}=1$. Therefore

$$P(X|\theta)=\sum_{i=1}^{n}b_{(z_1=q_i,x_1)}\,\beta_{(z_1=q_i)}\,\pi_i$$
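The backward recurrence can be sketched the same way, again under assumed array conventions (`pi` for $\pi_i$, `A[i, j]` for the transition probability $q_i\to q_j$, `B[i, k]` for $b_{(q_i,v_k)}$, `x` a list of symbol indices):

```python
import numpy as np

def backward(pi, A, B, x):
    """beta[t, i] = P(x_{t+1}, ..., x_T | z_t = q_i, theta)."""
    n, T = len(pi), len(x)
    beta = np.zeros((T, n))
    beta[-1] = 1.0                                 # base case: beta_(z_T=q_i) = 1
    for t in range(T - 2, -1, -1):
        # beta_(z_t=q_i) = sum_j b_(q_j, x_{t+1}) * beta_(z_{t+1}=q_j) * a_(q_i, q_j)
        beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1])
    return beta

def likelihood_backward(pi, A, B, x):
    """P(X|theta) = sum_i b_(q_i, x_1) * beta_(z_1=q_i) * pi_i."""
    return np.sum(B[:, x[0]] * backward(pi, A, B, x)[0] * pi)
```

It reaches the same $P(X|\theta)$ as the forward pass, also in $O(Tn^2)$ time; computing both tables is what the Learning updates below require.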
Next, let us work out how to evaluate the probabilities that appear in the Learning problem.
$$\begin{aligned} P(z_t=q_i,X|\theta^t)&=P(z_t=q_i,x_1,x_2,\cdots,x_T|\theta^t) \\&=P(x_{t+1},\cdots,x_T|z_t=q_i,x_1,\cdots,x_t,\theta^t)\,P(z_t=q_i,x_1,\cdots,x_t|\theta^t) \\&=P(x_{t+1},\cdots,x_T|z_t=q_i,\theta^t)\,P(z_t=q_i,x_1,\cdots,x_t|\theta^t) \\&=\beta_{(z_t=q_i)}\,\alpha_{(z_t=q_i)} \end{aligned}$$
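So the single-state posterior needed in the E step is just the normalized product of the two tables. A small sketch, assuming `alpha[t, i]` $=\alpha_{(z_t=q_i)}$ and `beta[t, i]` $=\beta_{(z_t=q_i)}$ have already been computed as `(T, n)` arrays (hypothetical names):

```python
import numpy as np

def posterior(alpha, beta):
    """gamma[t, i] = P(z_t = q_i | X, theta) = alpha[t,i] * beta[t,i] / P(X|theta)."""
    joint = alpha * beta                           # joint[t, i] = P(z_t=q_i, X | theta)
    return joint / joint.sum(axis=1, keepdims=True)
```

A useful consistency check: `joint.sum(axis=1)` should equal the same value $P(X|\theta)$ at every $t$ when `alpha` and `beta` come from the same model.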
From this we obtain, in particular, $P(z_1=q_i,X|\theta^t)$. What remains is

$$\sum_{t=1}^{T-1}P(z_t=q_i,z_{t+1}=q_j,X|\theta^t)$$

For a single term of this sum:
$$\begin{aligned} &P(z_t=q_i,z_{t+1}=q_j,X|\theta^t) \\=&P(x_1,x_2,\cdots,x_T,z_t=q_i,z_{t+1}=q_j|\theta^t) \\=&P(z_{t+1}=q_j,x_{t+1},\cdots,x_T|x_1,\cdots,x_t,z_t=q_i,\theta^t)\,P(x_1,\cdots,x_t,z_t=q_i|\theta^t) \\=&P(z_{t+1}=q_j,x_{t+1},\cdots,x_T|x_1,\cdots,x_t,z_t=q_i,\theta^t)\,\alpha_{(z_t=q_i)} \\=&P(x_{t+2},\cdots,x_T|z_{t+1}=q_j,x_1,\cdots,x_{t+1},z_t=q_i,\theta^t)\,P(x_{t+1},z_{t+1}=q_j|x_1,\cdots,x_t,z_t=q_i,\theta^t)\,\alpha_{(z_t=q_i)} \\=&P(x_{t+2},\cdots,x_T|z_{t+1}=q_j,\theta^t)\,P(x_{t+1},z_{t+1}=q_j|x_1,\cdots,x_t,z_t=q_i,\theta^t)\,\alpha_{(z_t=q_i)} \\=&\beta_{(z_{t+1}=q_j)}\,P(x_{t+1}|z_{t+1}=q_j,x_1,\cdots,x_t,z_t=q_i,\theta^t)\,P(z_{t+1}=q_j|x_1,\cdots,x_t,z_t=q_i,\theta^t)\,\alpha_{(z_t=q_i)} \\=&\beta_{(z_{t+1}=q_j)}\,b_{(z_{t+1}=q_j,x_{t+1})}\,a_{(z_t=q_i,z_{t+1}=q_j)}\,\alpha_{(z_t=q_i)} \end{aligned}$$

Summing this expression over $t=1,\cdots,T-1$ gives the remaining quantity.
So the final update formulas obtained in the Learning problem should be
$$\pi_i=\frac{\beta_{(z_1=q_i)}\,\alpha_{(z_1=q_i)}}{P(X|\theta^t)}$$

$$a_{(z=q_i,z=q_j)}=\frac{\sum\limits_{t=1}^{T-1}\beta_{(z_{t+1}=q_j)}\,b_{(z_{t+1}=q_j,x_{t+1})}\,a_{(z_t=q_i,z_{t+1}=q_j)}\,\alpha_{(z_t=q_i)}}{\sum\limits_{t=1}^{T-1}\beta_{(z_t=q_i)}\,\alpha_{(z_t=q_i)}}$$

$$b_{(z=q_i,x=v_j)}=\frac{\sum\limits_{t=1}^{T}\beta_{(z_t=q_i)}\,\alpha_{(z_t=q_i)}\,I(x_t=v_j)}{\sum\limits_{t=1}^{T}\beta_{(z_t=q_i)}\,\alpha_{(z_t=q_i)}}$$

(In the updates for $a$ and $b$, the factor $P(X|\theta^t)$ cancels between numerator and denominator.)
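Putting the three updates together gives one EM (Baum-Welch) iteration. A sketch under assumed array conventions (all names hypothetical: `pi` is $\pi$, `A[i, j]` $=a_{(z_t=q_i,z_{t+1}=q_j)}$, `B[i, k]` $=b_{(q_i,v_k)}$, `x` a list of symbol indices):

```python
import numpy as np

def baum_welch_step(pi, A, B, x):
    """One EM update of (pi, A, B) from a single observation sequence x."""
    n, m, T = len(pi), B.shape[1], len(x)
    # Forward and backward tables, via the recurrences derived above.
    alpha = np.zeros((T, n)); beta = np.ones((T, n))
    alpha[0] = pi * B[:, x[0]]
    for t in range(T - 1):
        alpha[t + 1] = B[:, x[t + 1]] * (alpha[t] @ A)
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, x[t + 1]] * beta[t + 1])
    px = alpha[-1].sum()                           # P(X|theta^t)
    gamma = alpha * beta / px                      # P(z_t=q_i | X, theta^t)
    # xi[t, i, j] = P(z_t=q_i, z_{t+1}=q_j | X, theta^t), from the derivation above.
    xi = np.zeros((T - 1, n, n))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, x[t + 1]] * beta[t + 1] / px
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros((n, m))
    for k in range(m):
        mask = np.array([xt == k for xt in x])     # indicator I(x_t = v_k)
        new_B[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B
```

Iterating this step is the EM loop from the Learning post; each iteration should never decrease $P(X|\theta)$.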
Decoding:
Once the model has been trained, we predict the hidden sequence by maximizing $P(Z|X,\theta)$,
i.e.

$$\hat Z=\arg\max_{Z}P(Z|X,\theta)$$
A brute-force search over all hidden sequences has extremely high complexity, so we introduce the Viterbi algorithm.
Viterbi algorithm (a recurrence-based algorithm)
The so-called Viterbi algorithm is really just an application of dynamic programming. Starting from the original expression:
$$\begin{aligned} \hat Z&=\arg\max_{Z}P(Z|X,\theta) \\&=\arg\max_{Z}\frac{P(Z,X|\theta)}{P(X|\theta)} \\&=\arg\max_{Z}P(Z,X|\theta) \end{aligned}$$

where the last step uses the fact that $P(X|\theta)$ does not depend on $Z$.
Define

$$\xi_{z_t}(q_i)=\max_{z_1,z_2,\cdots,z_{t-1}} P(z_1,z_2,\cdots,z_t=q_i,x_1,x_2,\cdots,x_t|\theta)$$
Here $\xi_{z_t}(q_i)$ is the highest probability achievable by any partial hidden path that ends in state $q_i$ at step $t$, jointly with the first $t$ observations.
To find its recurrence, consider $\xi_{z_{t+1}}(q_i)$:
$$\xi_{z_{t+1}}(q_i)=\max_{j\in\{1,2,\cdots,n\}}\xi_{z_t}(q_j)\,a_{(z_t=q_j,z_{t+1}=q_i)}\,b_{(z_{t+1}=q_i,x_{t+1})}$$
That is, for each state at step $t+1$ we pick the predecessor state that maximizes the probability of the extended path. With this recurrence, we only need to record, for every state at time $t+1$, which time-$t$ state $q_j$ attains the maximum; backtracking through those records from $t=T$ down to $t=1$ recovers the optimal sequence.
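The recurrence plus the backtracking pointers can be sketched as follows, under the same assumed conventions as before (`A[j, i]` is the transition probability $q_j\to q_i$, `B[i, k]` the emission probability, `x` a list of symbol indices):

```python
import numpy as np

def viterbi(pi, A, B, x):
    """Return the most probable hidden state sequence (as state indices)."""
    n, T = len(pi), len(x)
    xi_table = np.zeros((T, n))                    # xi_table[t, i] = xi_{z_t}(q_i)
    ptr = np.zeros((T, n), dtype=int)              # argmax predecessor for each state
    xi_table[0] = pi * B[:, x[0]]
    for t in range(T - 1):
        # scores[j, i] = xi_{z_t}(q_j) * a_(q_j, q_i) * b_(q_i, x_{t+1})
        scores = xi_table[t][:, None] * A * B[:, x[t + 1]]
        ptr[t + 1] = scores.argmax(axis=0)         # best time-t state for each q_i
        xi_table[t + 1] = scores.max(axis=0)
    # Backtrack from the best final state through the recorded pointers.
    path = [int(xi_table[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(ptr[t, path[-1]]))
    return path[::-1]
```

Like the forward pass, this runs in $O(Tn^2)$, whereas enumerating all $n^T$ sequences is exponential.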
Conclusion
That concludes the mathematical derivation behind hidden Markov models. The derivation is not fully rigorous; if you spot any problems, please point them out. For an application of hidden Markov models, see the post 隐马尔可夫模型中文分词 on Chinese word segmentation. Thanks for reading.