三硬币模型
已知有A,B,C三枚硬币,他们正面朝上的概率分别是 π , p , q 。现在给出规则:第一次扔A硬币,如果A正面朝上,则抛B硬币;如果A正面朝下,则抛C硬币。
给出观测结果:1,1,0,1,0,1 (1代表第二次所扔硬币正面朝上,否则正面朝下)
来估计三枚硬币正面朝上的概率 π,𝑝,𝑞。
P ( y ∣ θ ) = ∑ z P ( y , z ∣ θ ) = ∑ z P ( z ∣ θ ) P ( y ∣ z , θ ) = π p y ( 1 − p ) 1 − y + ( 1 − π ) q y ( 1 − q ) 1 − y \begin{align} P(y|\theta) &= \sum _z P(y,z|\theta)=\sum _z P(z|\theta)P(y|z,\theta) \\ &= \pi p^y(1-p)^{1-y}+(1-\pi) q^y(1-q)^{1-y} \end{align} P(y∣θ)=z∑P(y,z∣θ)=z∑P(z∣θ)P(y∣z,θ)=πpy(1−p)1−y+(1−π)qy(1−q)1−y
其中,y是观测变量,z表示A的结果(不可观测)。
θ
=
(
π
,
p
,
q
)
是参数。
\theta = (π , p , q)是参数 。
θ=(π,p,q)是参数。
其中Y=(Y1,…,Yn)观测数据的似然函数为:
P
(
Y
∣
θ
)
=
∏
j
=
i
n
π
p
y
j
(
1
−
p
)
1
−
y
j
+
(
1
−
π
)
q
j
y
(
1
−
q
)
1
−
y
j
P(Y|\theta) = \prod_{j=i}^n\pi p^{y_j}(1-p)^{1-y_j}+(1-\pi) q^y_j(1-q)^{1-y_j}
P(Y∣θ)=j=i∏nπpyj(1−p)1−yj+(1−π)qjy(1−q)1−yj
参数的极大似然估计为:
θ
^
=
a
r
g
m
a
x
θ
l
o
g
P
(
Y
∣
θ
)
\hat \theta = arg \underset {\theta}{max} log P(Y|\theta)
θ^=argθmaxlogP(Y∣θ)
EM算法
-
- 计算硬币B的概率
μ j i + 1 = π i ( p i ) y ( 1 − p i ) 1 − y π i ( p i ) y ( 1 − p i ) 1 − y + ( 1 − π i ) ( q i ) y ( 1 − q i ) 1 − y \mu_j^{i+1} = \frac{\pi ^i (p^i)^y(1-p^i )^{1-y}}{\pi ^i (p^i )^y(1-p^i )^{1-y}+(1-\pi ^i ) (q^i )^y(1-q^i )^{1-y}} μji+1=πi(pi)y(1−pi)1−y+(1−πi)(qi)y(1−qi)1−yπi(pi)y(1−pi)1−y
- 计算硬币B的概率
-
- 更新参数估计值
π i + 1 = a v g ( μ i + 1 ) \pi ^{i+1} = avg(\mu^{i+1}) πi+1=avg(μi+1)
- 更新参数估计值
p i + 1 = a v g ( μ i + 1 y ) a v g ( μ i + 1 ) p ^{i+1} = \frac{avg(\mu^{i+1}y)}{avg(\mu^{i+1})} pi+1=avg(μi+1)avg(μi+1y)
q i + 1 = ( a v g ( 1 − μ i + 1 ) y ) a v g ( 1 − μ i + 1 ) q ^{i+1} = \frac{(avg(1-\mu^{i+1})y)}{avg(1-\mu^{i+1})} qi+1=avg(1−μi+1)(avg(1−μi+1)y)
- 3.重复1,2直至参数收敛
class Em3Coins:
def __init__(self, pi, p, q):
self.pi = pi
self.p = p
self.q = q
def update(self, y):
u_i_new = self.pi*self.p**y*(1-self.p)**(1-y)/(
self.pi*self.p**y*(1-self.p)**(1-y) +
(1-self.pi)*self.q**y*(1-self.q)**(1-y))
self.pi = np.mean(u_i_new)
self.p = u_i_new.dot(y)/np.sum(u_i_new)
self.q = (1-u_i_new).dot(y)/np.sum(u_i_new)
def train(self, y, epochs=20):
print("*"*10, "Training", "*"*10)
for epoch in range(epochs):
self.update(y)
print("="*10, "epoch: ", epoch+1, "="*10)
print("π = %.2f"%self.pi,
"p =%.2f"%self.p,
"q =%.2f"%self.q)
print("*"*10, "Train Done", "*"*10)
print("π = %.2f"%self.pi,
"p =%.2f"%self.p,
"q =%.2f"%self.q)
运行
import numpy as np
y = np.array([1,1,0,1,0,0,1,0,1,1])
# 初始化概率
em = Em3Coins(0.4,0.6,0.7)
em.train(y)
'''
输出
********** Train Done **********
π = 0.50 p =0.34 q =0.86
'''
ref
[1]李航. 统计学习方法[M]. 清华大学出版社, 2012.