Gaussian Distribution
Input data (N samples, one per row):
$$X=(x_1,x_2,\dots,x_N)^T=\begin{pmatrix} x_1^T\\ x_2^T\\ \vdots \\ x_N^T \end{pmatrix}$$
$$x_i\in \mathbb{R}^p,\quad x_i \overset{iid}{\sim} N(\mu,\Sigma),\quad \theta=(\mu,\Sigma)$$
Here iid stands for independent and identically distributed.
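Review:
1. The expectation measures the central location of a random variable: every possible outcome is weighted by its probability (density), and the weighted outcomes are summed (integrated).
$$E(x)=\int x f(x)\,dx$$
The variance measures the variability of a random variable, that is, how widely it spreads around its mean μ.
$$Var(x)=E[(x-\mu)^2]=\int (x-\mu)^2 f(x)\,dx$$
2. Independence: the occurrence of one event does not depend on the other, so the probability that both occur is P(AB) = P(A)·P(B).
Independent and identically distributed: the random variables are mutually independent and all follow the same probability distribution.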
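As a quick numerical illustration of the two definitions, the sketch below approximates E(x) = ∫ x f(x) dx and the variance ∫ (x − μ)² f(x) dx with a midpoint rule. It is only a sketch under stated assumptions: the density f is taken to be a one-dimensional Gaussian with illustrative values μ = 1.5 and σ = 2.0, and the helper names `gaussian_pdf` and `integrate` are hypothetical, introduced here for the example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a one-dimensional Gaussian N(mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def integrate(g, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

# Illustrative parameters (assumptions, not taken from the text).
mu, sigma = 1.5, 2.0

# Truncate the real line to mu +/- 10 sigma; the tails beyond are negligible.
lo, hi = mu - 10 * sigma, mu + 10 * sigma

# E(x) = integral of x * f(x)
mean = integrate(lambda x: x * gaussian_pdf(x, mu, sigma), lo, hi)
# Var(x) = integral of (x - E(x))^2 * f(x)
var = integrate(lambda x: (x - mean) ** 2 * gaussian_pdf(x, mu, sigma), lo, hi)

print(mean, var)  # should be close to mu and sigma**2
```

The two printed numbers should agree with μ and σ² up to the discretization error of the midpoint rule.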
$$MLE:\quad \theta_{MLE}=\arg\max_{\theta}P(X|\theta)$$
Let $p=1$, so that $\theta=(\mu,\sigma^2)$.
$$
\begin{array}{lcl}
P(x) &=& \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \quad (p=1)\\
P(x) &=& \frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right) \quad (\text{general } p)\\\\
\log P(X|\theta) &=& \log\prod_{i=1}^{N}p(x_i|\theta)\\
&=& \sum_{i=1}^{N}\log p(x_i|\theta)\\
&=& \sum_{i=1}^{N}\log\left[\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)\right]\\
&=& \sum_{i=1}^{N}\left[\log\frac{1}{\sqrt{2\pi}}+\log\frac{1}{\sigma}-\frac{(x_i-\mu)^2}{2\sigma^2}\right]\\\\
\mu_{MLE} &=& \arg\max_{\mu}\log P(X|\theta)\\
&=& \arg\max_{\mu}\sum_{i=1}^{N}-\frac{(x_i-\mu)^2}{2\sigma^2}\\
&=& \arg\min_{\mu}\sum_{i=1}^{N}(x_i-\mu)^2\\\\
\frac{\partial}{\partial\mu}\sum_{i=1}^{N}(x_i-\mu)^2 &=& \sum_{i=1}^{N}2(x_i-\mu)(-1)=0\\
\sum_{i=1}^{N}(x_i-\mu) &=& 0
\end{array}
$$
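The one-dimensional log-likelihood can also be evaluated numerically. The following is a minimal sketch, assuming synthetic data drawn with `random.gauss` from illustrative parameters (mean 3.0, standard deviation 2.0) and with σ held fixed; the function name `log_likelihood` is hypothetical. It checks that the value of μ solving ∑(x_i − μ) = 0, which is the arithmetic mean of the data, gives a larger log-likelihood than nearby candidates.

```python
import math
import random

def log_likelihood(xs, mu, sigma):
    """Sum over i of [log(1/sqrt(2*pi)) + log(1/sigma) - (x_i - mu)^2 / (2*sigma^2)]."""
    n = len(xs)
    return (-n * math.log(math.sqrt(2 * math.pi))
            - n * math.log(sigma)
            - sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2))

# Synthetic data; the true parameters are assumptions chosen for illustration.
random.seed(0)
xs = [random.gauss(3.0, 2.0) for _ in range(1000)]
sigma = 2.0  # held fixed while mu varies

# Candidate that satisfies sum(x_i - mu) = 0.
mu_hat = sum(xs) / len(xs)
residual = sum(x - mu_hat for x in xs)

# Compare against shifted candidates on either side.
best = log_likelihood(xs, mu_hat, sigma)
shifted = [log_likelihood(xs, mu_hat + d, sigma) for d in (-0.5, -0.1, 0.1, 0.5)]

print(residual)                       # numerically zero
print(all(v < best for v in shifted))
```

Because the log-likelihood is a concave quadratic in μ for fixed σ, the stationary point is the unique maximum, which is what the comparison reflects.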
$$
\begin{array}{lcl}
\mu_{MLE} &=& \frac{1}{N}\sum_{i=1}^{N}x_i \quad \text{(unbiased estimate)}\\
E[\mu_{MLE}] &=& \frac{1}{N}\sum_{i=1}^{N}E[x_i]\\
&=& \frac{1}{N}\sum_{i=1}^{N}\mu\\
&=& \frac{1}{N}N\mu\\
&=& \mu
\end{array}
$$
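The unbiasedness of the sample mean can be checked empirically. This is a rough Monte Carlo sketch, not a proof: the true parameters, the sample size N = 5, and the number of repetitions are all assumptions chosen for illustration.

```python
import random

random.seed(1)
mu, sigma = 3.0, 2.0    # assumed true parameters (illustrative)
N, trials = 5, 20_000   # small samples, many repetitions

estimates = []
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(N)]
    estimates.append(sum(xs) / N)  # mu_MLE for this sample

# Averaging the estimator over many samples approximates E[mu_MLE].
avg_estimate = sum(estimates) / trials
print(avg_estimate)  # should be close to mu
```

Individual estimates scatter around μ, but their average settles near μ even though each sample contains only five points.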
For the variance, write $\mathcal{L}(\sigma)$ for the part of the log-likelihood that depends on σ:
$$
\begin{array}{lcl}
\sigma_{MLE}^{2} &=& \arg\max_{\sigma}\log P(X|\theta)\\
&=& \arg\max_{\sigma}\sum_{i=1}^{N}\left[-\log\sigma-\frac{(x_i-\mu)^2}{2\sigma^2}\right] = \arg\max_{\sigma}\mathcal{L}(\sigma)\\\\
\frac{\partial\mathcal{L}}{\partial\sigma} &=& \sum_{i=1}^{N}\left[-\frac{1}{\sigma}+\frac{1}{2}(x_i-\mu)^2(+2)\sigma^{-3}\right]=0\\
\sum_{i=1}^{N}\left[-\frac{1}{\sigma}+(x_i-\mu)^2\sigma^{-3}\right] &=& 0\\
-\sum_{i=1}^{N}\sigma^2+\sum_{i=1}^{N}(x_i-\mu)^2 &=& 0\\
\sum_{i=1}^{N}\sigma^2 &=& \sum_{i=1}^{N}(x_i-\mu)^2\\
\sigma_{MLE}^{2} &=& \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu_{MLE})^2 \quad \text{(biased estimate)}
\end{array}
$$
In the last line μ has been replaced by its own estimate $\mu_{MLE}$, and this substitution is what introduces the bias. Using $\frac{1}{N}\sum_{i}(x_i-\mu_{MLE})^2=\frac{1}{N}\sum_{i}x_i^2-\mu_{MLE}^2$, $E[x_i^2]=\sigma^2+\mu^2$ and $E[\mu_{MLE}^2]=\frac{\sigma^2}{N}+\mu^2$:
$$
\begin{array}{lcl}
E[\sigma_{MLE}^{2}] &=& (\sigma^2+\mu^2)-\left(\frac{\sigma^2}{N}+\mu^2\right)=\frac{N-1}{N}\sigma^2\\
\hat\sigma^2 &=& \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\mu_{MLE})^2 \quad \text{(unbiased estimate)}
\end{array}
$$
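The gap between the 1/N and 1/(N−1) estimators is easy to see in simulation. The sketch below is a Monte Carlo illustration under assumed settings (true mean 3.0, true standard deviation 2.0, N = 5, and 50,000 repetitions, all chosen only for the example); it averages both estimators over many samples and compares them with (N−1)/N·σ² and σ².

```python
import random

random.seed(2)
mu, sigma = 3.0, 2.0    # assumed true parameters (illustrative)
N, trials = 5, 50_000   # a small N makes the bias clearly visible

biased_sum = 0.0
unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(N)]
    mu_mle = sum(xs) / N
    ss = sum((x - mu_mle) ** 2 for x in xs)  # sum of squared deviations
    biased_sum += ss / N          # sigma^2_MLE, divides by N
    unbiased_sum += ss / (N - 1)  # corrected estimator, divides by N - 1

avg_biased = biased_sum / trials
avg_unbiased = unbiased_sum / trials

print(avg_biased, (N - 1) / N * sigma ** 2)  # both near (N-1)/N * sigma^2
print(avg_unbiased, sigma ** 2)              # both near sigma^2
```

With N = 5 the 1/N estimator underestimates the variance by about 20 percent on average; the relative bias shrinks as N grows, since (N−1)/N approaches 1.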