Multivariate Random Variables
(omitted)
Multivariate Normal Distribution
Definition 1:
$$X=AU+\mu \sim N_p(\mu,AA')$$
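where $U=(U_1,\dots,U_p)'$ is a $p\times1$ random vector whose components $U_i\sim N(0,1)$ are mutually independent, $A$ is a $p\times p$ constant matrix, and $\mu$ is a $p\times1$ constant vector.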
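Definition 1 is constructive, so it can be tried out numerically. The sketch below assumes NumPy is available; the function name `sample_mvn_by_definition` and the particular `A` and `mu` are illustrative choices, not part of the original notes. It draws $U$ with independent standard normal components, forms $X=AU+\mu$, and compares the sample covariance with $AA'$.

```python
import numpy as np

def sample_mvn_by_definition(A, mu, n_samples, seed=0):
    """Draw samples X = A U + mu with U having i.i.d. N(0, 1) components."""
    rng = np.random.default_rng(seed)
    p = A.shape[1]
    U = rng.standard_normal((n_samples, p))   # each row is one draw of U'
    return U @ A.T + mu                        # each row is one draw of X'

# Illustrative (hypothetical) parameters.
A = np.array([[2.0, 0.0],
              [1.0, 1.0]])
mu = np.array([1.0, -1.0])
X = sample_mvn_by_definition(A, mu, n_samples=200_000)

sample_mean = X.mean(axis=0)              # should be close to mu
sample_cov = np.cov(X, rowvar=False)      # should be close to A A'
```

With this many draws the sample moments should agree with $\mu$ and $AA'$ up to Monte Carlo error.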
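Properties:
- Characteristic function: $\varphi_X(t)=E(e^{it'X})=\exp\left(it'\mu-\frac{1}{2}t'AA't\right)$
- If $X \sim N_p(\mu,\Sigma)$ and $Z=BX+d$, where $B$ is an $s\times p$ constant matrix and $d$ is an $s\times1$ constant vector, then $Z \sim N_s(B\mu +d,B\Sigma B')$. That is, linear transformations of a normal vector (including its marginal distributions) are again normal.
- $\mu$ and $\Sigma$ are the mean vector and the covariance matrix of the distribution, respectively.
- $X\sim N_p(\mu,\Sigma) \leftrightarrow \xi_{1\times 1}=a_{1\times p}X_{p\times 1} \sim N_1(a\mu,a\Sigma a')$ for every constant row vector $a$.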
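The linear-transformation property can be illustrated with a few matrix products. This is a minimal sketch assuming NumPy; `linear_transform_params` and all numeric values are hypothetical and chosen only for illustration. It shows that a marginal distribution is obtained by picking coordinates with a selection matrix $B$, and that a single linear combination $aX$ has a scalar mean and variance.

```python
import numpy as np

def linear_transform_params(mu, Sigma, B, d):
    """Return the mean and covariance of Z = B X + d when X ~ N_p(mu, Sigma)."""
    mean_z = B @ mu + d
    cov_z = B @ Sigma @ B.T
    return mean_z, cov_z

# Illustrative (hypothetical) parameters.
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Selecting the first two coordinates is a linear map, so it yields the marginal.
B_marginal = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
mean_12, cov_12 = linear_transform_params(mu, Sigma, B_marginal, np.zeros(2))

# A single linear combination a X is univariate normal.
a = np.array([[1.0, -1.0, 2.0]])
mean_xi, var_xi = linear_transform_params(mu, Sigma, a, np.zeros(1))
```

Here `cov_12` is simply the upper-left $2\times2$ block of `Sigma`, which is the usual rule for reading marginal covariances off $\Sigma$.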
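Definition 2:
If every linear combination of the components of a $p$-dimensional random vector $X$ follows a univariate normal distribution, then $X$ is called a $p$-dimensional normal random vector.
Properties: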
- Let $X \sim N_p(\mu,\Sigma)$ with $\Sigma$ positive definite. Then $X$ has the joint density function
$$f(X)=\frac{1}{(2\pi)^{\frac{p}{2}}\left| \Sigma \right|^{\frac{1}{2}}}\exp\left[-\frac{1}{2}(X-\mu)'\Sigma^{-1}(X-\mu)\right]$$
Proof:
Write $X=AU+\mu$ with $\Sigma=AA'$. Since $\Sigma$ is positive definite, $A$ is invertible and $U=A^{-1}(X-\mu)$. The density of $U$ is
$$f_U(U)=\frac{1}{(2\pi)^{\frac{p}{2}}}\exp\left[-\frac{1}{2}U'U\right]$$
and by the change-of-variables formula
$$f_X(X)=f_U\left(A^{-1}(X-\mu)\right)\left|J(U\rightarrow X)\right|$$
Here $U'U=(X-\mu)'(AA')^{-1}(X-\mu)=(X-\mu)'\Sigma^{-1}(X-\mu)$, and
$$\left|J(U\rightarrow X)\right|=\frac{1}{\left|J(X\rightarrow U)\right|}=\left|AA'\right|^{-\frac{1}{2}}=\left| \Sigma\right|^{-\frac{1}{2}}$$
which gives the stated density.
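The density formula translates directly into code. The following is a sketch assuming NumPy; `mvn_pdf` is a hypothetical helper name. It evaluates the quadratic form with a linear solve and the determinant through `slogdet`, which is a numerical convenience rather than anything the derivation requires.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of N_p(mu, Sigma) at x, intended for positive definite Sigma."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    p = mu.shape[0]
    diff = x - mu
    # Solve Sigma z = diff instead of forming the explicit inverse.
    quad = diff @ np.linalg.solve(Sigma, diff)
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        raise ValueError("Sigma must have a positive determinant")
    log_pdf = -0.5 * (p * np.log(2.0 * np.pi) + logdet + quad)
    return float(np.exp(log_pdf))

# With independent standard normal components the density factorizes,
# so the value at the origin is (1 / sqrt(2 pi))^2.
value_at_origin = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```

For $p=1$ the function reduces to the familiar univariate normal density, which is a convenient sanity check.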
Definition 3:
If the joint density function of $X$ has the form given in the property above, then $X$ is a $p$-dimensional normal random vector.
Bivariate normal:
Let $X=\begin{bmatrix}X_1\\X_2\end{bmatrix}$ with $X\sim N_2(\mu,\Sigma)$.
Statistical meaning of $\rho$:
Write
$$\mu=\begin{bmatrix}\mu_1\\\mu_2\end{bmatrix},\quad\Sigma=\begin{bmatrix}\sigma_{11}&\sigma_{12}\\\sigma_{21}&\sigma_{22}\end{bmatrix}=\begin{bmatrix}\sigma_1^2&\rho\sigma_1\sigma_2\\\rho\sigma_1\sigma_2&\sigma_2^2\end{bmatrix}$$
Then
$$X_1 \sim N(\mu_1,\sigma_1^2),\quad X_2 \sim N(\mu_2,\sigma_2^2)$$
and
$$\rho(X_1,X_2)=\frac{Cov(X_1,X_2)}{\sqrt{Var(X_1)}\sqrt{Var(X_2)}}=\frac{\rho\sigma_1\sigma_2}{\sigma_1\sigma_2}=\rho$$
so $\rho$ is exactly the correlation coefficient of $X_1$ and $X_2$.
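- When $\rho=0$, $\Sigma$ is diagonal and the joint density factorizes into the two marginal densities, so $X_1$ and $X_2$ are independent.
- When $|\rho|=1$, $|\Sigma|=\sigma_1^2\sigma_2^2(1-\rho^2)=0$, so $\Sigma x=0$ has a nonzero solution $x$. Then $Var(x'X)=x'\Sigma x=0$, which means $X_1$ and $X_2$ are linearly related with probability 1.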
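The two boundary cases can be checked on concrete numbers. The sketch below assumes NumPy; the helper names and the values $\sigma_1=2$, $\sigma_2=3$ are hypothetical. It builds $\Sigma$ from $(\sigma_1,\sigma_2,\rho)$, recovers $\rho$, and exhibits a null vector of $\Sigma$ when $\rho=1$.

```python
import numpy as np

def bivariate_cov(sigma1, sigma2, rho):
    """Covariance matrix of a bivariate normal from (sigma1, sigma2, rho)."""
    off = rho * sigma1 * sigma2
    return np.array([[sigma1 ** 2, off],
                     [off, sigma2 ** 2]])

def correlation_from_cov(Sigma):
    """Recover rho = Cov(X1, X2) / sqrt(Var(X1) Var(X2))."""
    return Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])

Sigma = bivariate_cov(sigma1=2.0, sigma2=3.0, rho=0.5)
rho_back = correlation_from_cov(Sigma)
det_regular = np.linalg.det(Sigma)          # sigma1^2 sigma2^2 (1 - rho^2)

# rho = 1 makes Sigma singular; (sigma2, -sigma1) lies in its null space.
Sigma_deg = bivariate_cov(sigma1=2.0, sigma2=3.0, rho=1.0)
null_vec = np.array([3.0, -2.0])
residual = Sigma_deg @ null_vec
```

The null vector says that $3X_1-2X_2$ has zero variance in the degenerate case, which is the linear relation mentioned above.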
Matrix normal distribution:
Definition:
Let
$$X_{(i)}=\begin{bmatrix}X_{i1}\\X_{i2}\\\vdots\\X_{ip}\end{bmatrix},\quad i=1,\dots,n$$
be a random sample of $n$ observations from $N_p(\mu,\Sigma)$, arranged as the rows of the data matrix $X_{n\times p}$. Stacking $X$ row by row gives the $np\times1$ vector $Vec(X')$. If
$$Vec(X') \sim N_{np}(\mathbf{1}_n\otimes \mu,I_n\otimes \Sigma)$$
then $X$ is said to follow a matrix normal distribution, usually written
$$X\sim N_{n\times p}(M,I_n\otimes \Sigma)$$
where $Vec(M')=\mathbf{1}_n\otimes \mu$, that is, $M=\mathbf{1}_n\mu'$.
Here $\otimes$ denotes the Kronecker product.
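To make the Kronecker-product notation concrete, the following sketch (assuming NumPy, with hypothetical values $n=3$, $p=2$) builds the mean vector and covariance matrix of $Vec(X')$ and checks that $\mathbf{1}_n\otimes\mu$ is the row-by-row stacking of $M=\mathbf{1}_n\mu'$.

```python
import numpy as np

n, p = 3, 2
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Mean and covariance of Vec(X'), the rows of X stacked into one long vector.
mean_vec = np.kron(np.ones(n), mu)           # 1_n (x) mu, length n * p
cov_vec = np.kron(np.eye(n), Sigma)          # I_n (x) Sigma, shape (n*p, n*p)

# The same mean written as an n x p matrix M = 1_n mu'.
M = np.outer(np.ones(n), mu)

# Row-major flattening of M is exactly Vec(M').
vec_M_transpose = M.reshape(-1)
```

The block-diagonal structure of `cov_vec` encodes that different observations are uncorrelated while each observation has covariance $\Sigma$.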
Properties of the matrix normal distribution:
Let $X\sim N_{n\times p}(M,I_n\otimes \Sigma)$, and let $A$ be a $k\times n$ constant matrix, $B$ a $q\times p$ constant matrix, and $D$ a $k\times q$ constant matrix. Put $Z=AXB'+D$. Then
$$Z\sim N_{k\times q}(AMB'+D,(AA')\otimes(B\Sigma B'))$$
In other words, a linear transformation of a multivariate normal sample matrix again follows a matrix normal distribution, so the transformed population is still normal.
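The covariance in this property follows from the identity $Vec(Z')=(A\otimes B)Vec(X')+Vec(D')$ together with the mixed-product rule for Kronecker products. The sketch below, assuming NumPy and randomly generated (hypothetical) matrices, checks both steps numerically.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k, q = 4, 3, 2, 2

A = rng.standard_normal((k, n))
B = rng.standard_normal((q, p))
L = rng.standard_normal((p, p))
Sigma = L @ L.T + p * np.eye(p)              # an arbitrary positive definite matrix

# Z = A X B' + D implies Vec(Z') = (A (x) B) Vec(X') + Vec(D').
T = np.kron(A, B)
cov_from_vec = T @ np.kron(np.eye(n), Sigma) @ T.T

# Covariance stated in the property.
cov_stated = np.kron(A @ A.T, B @ Sigma @ B.T)

# The vec identity itself, checked on one arbitrary matrix X (taking D = 0).
X = rng.standard_normal((n, p))
lhs = (A @ X @ B.T).reshape(-1)              # Vec((A X B')')
rhs = T @ X.reshape(-1)                      # (A (x) B) Vec(X')
```

Both comparisons should agree up to floating-point rounding.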
Conditional distribution and independence:
Partition of $X_p$:
Let
$$X_p=\begin{bmatrix}X^{(1)}\\ X^{(2)} \end{bmatrix}\sim N_p\left(\begin{bmatrix}\mu^{(1)}\\\mu^{(2)}\end{bmatrix}, \begin{bmatrix}\Sigma_{11}&\Sigma_{12}\\\Sigma_{21}&\Sigma_{22}\end{bmatrix}\right)$$
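Independence of the blocks:
$X^{(1)}$ and $X^{(2)}$ are mutually independent $\leftrightarrow \Sigma_{12}=O$, or equivalently $\Sigma_{21}=\Sigma_{12}'=O$ (analogous to the bivariate normal case).
Corollary: for a $p$-dimensional normal vector partitioned into $k$ sub-vectors, the sub-vectors are mutually independent $\leftrightarrow\Sigma$ is block diagonal.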
Conditional distribution:
Definition:
$$f(X^{(1)}|X^{(2)})=\frac{f(X^{(1)},X^{(2)})}{f(X^{(2)})}$$
Inverse of the covariance matrix:
$$\Sigma^{-1}=\begin{bmatrix}\Sigma_{11\cdot2}^{-1} & -\Sigma_{11\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\\-\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11\cdot2}^{-1} & \Sigma_{22}^{-1}+\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11\cdot2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\end{bmatrix}$$
where $\Sigma_{11\cdot2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$ is the Schur complement of $\Sigma_{22}$ in $\Sigma$.
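The partitioned inverse is easy to get wrong by a sign or a transposed block, so a numerical check is useful. The sketch below assumes NumPy; `block_inverse` and the $3\times3$ matrix are hypothetical. It assembles $\Sigma^{-1}$ from the four blocks and compares the result with a direct inversion.

```python
import numpy as np

def block_inverse(Sigma, r):
    """Invert Sigma via the partitioned formula, with Sigma_11 of size r x r."""
    S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
    S21, S22 = Sigma[r:, :r], Sigma[r:, r:]
    S22_inv = np.linalg.inv(S22)
    S11_2 = S11 - S12 @ S22_inv @ S21            # Sigma_{11.2}
    S11_2_inv = np.linalg.inv(S11_2)
    top_right = -S11_2_inv @ S12 @ S22_inv
    bottom_left = -S22_inv @ S21 @ S11_2_inv
    bottom_right = S22_inv + S22_inv @ S21 @ S11_2_inv @ S12 @ S22_inv
    return np.block([[S11_2_inv, top_right],
                     [bottom_left, bottom_right]])

# An illustrative positive definite covariance matrix.
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])
Sigma_inv = block_inverse(Sigma, r=1)
```

In practice one would call `np.linalg.inv` or a solver directly; the block form matters here because its upper-left block is what produces the conditional covariance.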
Therefore, let
$$X=\begin{bmatrix}X^{(1)}_r\\X^{(2)}_{p-r}\end{bmatrix}\sim N_p(\mu,\Sigma)$$
with $\Sigma$ positive definite. Then
$$(X^{(1)}|X^{(2)})\sim N_r(\mu_{1\cdot2},\Sigma_{11\cdot2})$$
where
$$\mu_{1\cdot2}=\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(X^{(2)}-\mu^{(2)})$$
$$\Sigma_{11\cdot2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$$
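The conditional mean and covariance can be computed in a few lines. This is a sketch assuming NumPy; `conditional_params` is a hypothetical helper and the bivariate numbers are illustrative. In the bivariate case the result should reduce to the familiar $\mu_1+\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2)$ and $\sigma_1^2(1-\rho^2)$.

```python
import numpy as np

def conditional_params(mu, Sigma, r, x2):
    """Mean and covariance of X1 | X2 = x2, where X1 is the first r coordinates."""
    mu1, mu2 = mu[:r], mu[r:]
    S11, S12 = Sigma[:r, :r], Sigma[:r, r:]
    S21, S22 = Sigma[r:, :r], Sigma[r:, r:]
    # Regression coefficient B = Sigma_12 Sigma_22^{-1}, computed with a solve.
    B = np.linalg.solve(S22.T, S12.T).T
    cond_mean = mu1 + B @ (x2 - mu2)
    cond_cov = S11 - B @ S21
    return cond_mean, cond_cov

# Bivariate case: sigma1 = 2, sigma2 = 1, rho = 0.5.
mu = np.array([1.0, 0.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 1.0]])
cond_mean, cond_cov = conditional_params(mu, Sigma, r=1, x2=np.array([2.0]))
```

Note that the conditional covariance does not depend on the observed value `x2`; only the conditional mean does.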
Corollaries:
- $X^{(2)}$ and $X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}$ are mutually independent; $B=\Sigma_{12}\Sigma_{22}^{-1}$ is called the regression coefficient.
- $X^{(1)}$ and $X^{(2)}-\Sigma_{21}\Sigma_{11}^{-1}X^{(1)}$ are mutually independent.
Proof: each pair is jointly normal, so it suffices to use the block-matrix properties to show that their cross-covariance matrix is $O$.
- Let $Z=\begin{bmatrix}X_p\\Y_1\end{bmatrix}\sim N_{p+1}\left(\begin{bmatrix}\mu_x\\\mu_y\end{bmatrix},\begin{bmatrix}\Sigma_{xx}&\Sigma_{xy}\\\Sigma_{yx}&\Sigma_{yy}\end{bmatrix}\right)$ and write $g(X)=E(Y|X)$. Then for any function $\varphi(\cdot)$, $E[(Y-g(X))^2]\leq E[(Y-\varphi(X))^2]$. That is, under the minimum mean squared error criterion, the conditional expectation is the best predictor of $Y$.
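As a worked version of the proof idea for the first corollary, the cross-covariance can be written out explicitly. The derivation below only uses bilinearity of the covariance and the block notation already introduced; it is a sketch of one way to carry out the computation.

```latex
\begin{aligned}
Cov\left(X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)},\,X^{(2)}\right)
&=Cov\left(X^{(1)},X^{(2)}\right)-\Sigma_{12}\Sigma_{22}^{-1}Cov\left(X^{(2)},X^{(2)}\right)\\
&=\Sigma_{12}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{22}\\
&=O
\end{aligned}
```

Because both vectors are linear functions of the same normal vector $X$, they are jointly normal, and a zero cross-covariance is then equivalent to independence. The second corollary follows in the same way with the roles of the two blocks exchanged.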
Parameter estimation:
Random sample matrix:
$$X=\begin{bmatrix}X_{11}&\cdots& X_{1p}\\\vdots&&\vdots\\X_{n1}&\cdots&X_{np}\end{bmatrix}=\begin{bmatrix}X_{(1)}&\cdots&X_{(n)}\end{bmatrix}'$$
where $X_{(1)},\dots,X_{(n)}$ form a simple random sample, so each row of $X$ is one observation.
Sample statistics:
- Mean: $\overline{X}=\frac{1}{n}\sum_{i=1}^n X_{(i)}=\begin{bmatrix}\overline{X}_1&\cdots&\overline{X}_p\end{bmatrix}'=\frac{1}{n}X'\mathbf{1}_n$
- Deviation matrix: $A=\sum_{\alpha=1}^{n}(X_{(\alpha)}-\overline{X})(X_{(\alpha)}-\overline{X})'=X'X-n\overline{X}\,\overline{X}'$. Write $A=(a_{ij})_{p\times p}$, where $a_{ij}=\sum_{\alpha=1}^n(X_{\alpha i}-\overline{X}_i)(X_{\alpha j}-\overline{X}_j)$ and $\overline{X}_i$ is the mean of the $i$-th variable.
- Covariance matrix: $S=\frac{1}{n-1}A=(s_{ij})_{p\times p}$, $S^*=\frac{1}{n}A$
- Sample correlation matrix: $R=(r_{ij})_{p\times p}$, $r_{ij}=\frac{s_{ij}}{\sqrt{s_{ii}}\sqrt{s_{jj}}}=\frac{a_{ij}}{\sqrt{a_{ii}}\sqrt{a_{jj}}}$
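The matrix forms of these statistics map directly onto array operations. The sketch below assumes NumPy; `sample_statistics` and the small data matrix are hypothetical. It uses the formulas $\overline{X}=\frac{1}{n}X'\mathbf{1}_n$ and $A=X'X-n\overline{X}\,\overline{X}'$ exactly as written above.

```python
import numpy as np

def sample_statistics(X):
    """Return (mean, A, S, S_star, R) for an n x p data matrix X (rows = observations)."""
    n = X.shape[0]
    x_bar = X.T @ np.ones(n) / n                 # (1/n) X' 1_n
    A = X.T @ X - n * np.outer(x_bar, x_bar)     # X'X - n x_bar x_bar'
    S = A / (n - 1)
    S_star = A / n
    scale = np.sqrt(np.diag(A))
    R = A / np.outer(scale, scale)               # r_ij = a_ij / sqrt(a_ii a_jj)
    return x_bar, A, S, S_star, R

# A small illustrative data set: 4 observations of 2 variables.
X = np.array([[1.0, 2.0],
              [2.0, 4.5],
              [3.0, 5.5],
              [4.0, 8.0]])
x_bar, A, S, S_star, R = sample_statistics(X)
```

`S` matches what `np.cov(X, rowvar=False)` returns by default, while `S_star` corresponds to `bias=True`.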
Maximum likelihood estimation:
Likelihood function:
$$L(\mu,\Sigma)=\prod_{i=1}^n\frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp\left[-\frac{1}{2}(X_{(i)}-\mu)'\Sigma^{-1}(X_{(i)}-\mu)\right]$$
which can be rewritten as
$$\frac{1}{(2\pi)^{\frac{np}{2}}|\Sigma|^{\frac{n}{2}}}\exp\left[tr\left(-\frac{1}{2}\Sigma^{-1}\sum_{i=1}^n(X_{(i)}-\mu)(X_{(i)}-\mu)'\right)\right]$$
where
$$\begin{aligned} &\sum_{i=1}^n(X_{(i)}-\mu)(X_{(i)}-\mu)'\\ =&\sum_{i=1}^n[(X_{(i)}-\overline{X}+\overline{X}-\mu)(X_{(i)}-\overline{X}+\overline{X}-\mu)']\\ =&\sum_{i=1}^n(X_{(i)}-\overline{X})(X_{(i)}-\overline{X})'+n(\overline{X}-\mu)(\overline{X}-\mu)'\\ =&A+n(\overline{X}-\mu)(\overline{X}-\mu)' \end{aligned}$$
The cross terms vanish because $\sum_{i=1}^n(X_{(i)}-\overline{X})=0$, so
$$\sum_{i=1}^n(X_{(i)}-\overline{X})(\overline{X}-\mu)'=O,\quad\sum_{i=1}^n(\overline{X}-\mu)(X_{(i)}-\overline{X})'=O$$
The trace form of the likelihood then follows because each quadratic form is a scalar, $(X_{(i)}-\mu)'\Sigma^{-1}(X_{(i)}-\mu)=tr[\Sigma^{-1}(X_{(i)}-\mu)(X_{(i)}-\mu)']$, and the trace is linear.
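Both the decomposition of the scatter matrix around $\mu$ and the trace rewriting of a quadratic form can be confirmed on random data. The sketch below assumes NumPy; the data, the candidate `mu`, and the stand-in for $\Sigma^{-1}$ are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 3
X = rng.standard_normal((n, p))              # rows play the role of X_(i)'
mu = np.array([0.5, -1.0, 2.0])              # an arbitrary candidate mean

x_bar = X.mean(axis=0)
centered = X - x_bar
A = centered.T @ centered                    # sample deviation matrix

# Left side: sum_i (X_(i) - mu)(X_(i) - mu)'
shifted = X - mu
lhs = shifted.T @ shifted

# Right side: A + n (x_bar - mu)(x_bar - mu)'
rhs = A + n * np.outer(x_bar - mu, x_bar - mu)

# Trace identity for one quadratic form; I + 11' is an arbitrary
# positive definite matrix standing in for Sigma^{-1}.
Sigma_inv = np.eye(p) + np.ones((p, p))
v = shifted[0]
quad = v @ Sigma_inv @ v
trace_form = np.trace(Sigma_inv @ np.outer(v, v))
```

The agreement of `lhs` and `rhs` holds for any choice of `mu`, which is what makes the decomposition useful for maximizing over $\mu$.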
Log-likelihood function:
$$\begin{aligned} \ln L(\mu,\Sigma)&=-\ln\left[(2\pi)^{\frac{np}{2}}|\Sigma|^{\frac{n}{2}}\right]-\frac{1}{2}tr\left(\Sigma^{-1}\sum_{i=1}^n(X_{(i)}-\mu)(X_{(i)}-\mu)'\right)\\ &=-\ln\left[(2\pi)^{\frac{np}{2}}|\Sigma|^{\frac{n}{2}}\right]-\frac{1}{2}tr\left(\Sigma^{-1}\left[A+n(\overline{X}-\mu)(\overline{X}-\mu)'\right]\right) \end{aligned}$$
Maximum likelihood estimate of $\mu$:
As a function of $\mu$,
$$\ln L(\mu,\Sigma)=C-\frac{1}{2}tr\left(\Sigma^{-1}\left[A+n(\overline{X}-\mu)(\overline{X}-\mu)'\right]\right)$$
where $C$ does not depend on $\mu$. The likelihood is therefore largest when
$$\frac{n}{2}tr[\Sigma^{-1}(\overline{X}-\mu)(\overline{X}-\mu)']=\frac{n}{2}(\overline{X}-\mu)'\Sigma^{-1}(\overline{X}-\mu)$$
is smallest. Since $\Sigma$, and hence $\Sigma^{-1}$, is positive definite, the minimum value is $0$, attained only at $\mu=\overline{X}$.
Maximum likelihood estimate of $\Sigma$:
Lemma: if $B$ is a positive definite matrix of order $p$, then $tr B-\ln|B|\ge p$, with equality if and only if $B=I_p$.
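One way to see why the lemma holds is through the eigenvalues $\lambda_1,\dots,\lambda_p$ of $B$: $tr B-\ln|B|-p=\sum_i(\lambda_i-\ln\lambda_i-1)$, and each term is nonnegative with equality only at $\lambda_i=1$. The sketch below assumes NumPy; `lemma_gap` is a hypothetical helper that evaluates this sum on randomly generated positive definite matrices.

```python
import numpy as np

def lemma_gap(B):
    """Return tr(B) - ln|B| - p for a symmetric positive definite matrix B."""
    eigvals = np.linalg.eigvalsh(B)
    if np.any(eigvals <= 0):
        raise ValueError("B must be positive definite")
    # tr(B) = sum(lambda_i), ln|B| = sum(ln lambda_i), and lambda - ln(lambda) >= 1.
    return float(np.sum(eigvals - np.log(eigvals) - 1.0))

rng = np.random.default_rng(2)
gaps = []
for _ in range(100):
    L = rng.standard_normal((4, 4))
    B = L @ L.T + 0.1 * np.eye(4)            # random positive definite matrix
    gaps.append(lemma_gap(B))

gap_at_identity = lemma_gap(np.eye(4))       # the equality case B = I_p
```

A numerical check like this is not a proof, but it makes the role of the equality case $B=I_p$ easy to see.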
As a function of $\Sigma$ (with $\mu=\overline{X}$), the likelihood is largest when
$$\frac{n}{2}\ln|\Sigma|+\frac{1}{2}tr(\Sigma^{-1}A)$$
is smallest.
$$\begin{aligned} &\frac{n}{2}\ln|\Sigma|+\frac{1}{2}tr(\Sigma^{-1}A)\\ =&\frac{n}{2}\left[\ln|\Sigma|+tr\left(\Sigma^{-1}\frac{A}{n}\right)\right]\\ =&\frac{n}{2}\left[-\ln\left|\Sigma^{-1}\frac{A}{n}\right|+\ln\left|\frac{A}{n}\right|+tr\left(\Sigma^{-1}\frac{A}{n}\right)\right]\\ =&\frac{n}{2}\left[\ln\left|\frac{A}{n}\right|+tr\left(\Sigma^{-1}\frac{A}{n}\right)-\ln\left|\Sigma^{-1}\frac{A}{n}\right|\right]\\ \ge&\frac{n}{2}\left[\ln\left|\frac{A}{n}\right|+p\right] \end{aligned}$$
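The last step applies the lemma to $\Sigma^{-\frac{1}{2}}\frac{A}{n}\Sigma^{-\frac{1}{2}}$, which is positive definite when $A$ is and has the same trace and determinant as $\Sigma^{-1}\frac{A}{n}$. Equality holds when $\Sigma^{-1}\frac{A}{n}=I_p$, that is, when $\Sigma=\frac{A}{n}$. Therefore
$$(\hat{\mu},\hat{\Sigma})=\left(\overline{X},\frac{A}{n}\right)$$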
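The closed-form estimates can be compared against the log-likelihood directly. The sketch below assumes NumPy; `log_likelihood`, `mle`, and the simulated data are hypothetical. It evaluates $\ln L$ at $(\overline{X},\frac{A}{n})$ and at two nearby alternatives, which should not give a larger value.

```python
import numpy as np

def log_likelihood(X, mu, Sigma):
    """ln L(mu, Sigma) for rows of X treated as i.i.d. N_p(mu, Sigma) observations."""
    n, p = X.shape
    diff = X - mu
    scatter = diff.T @ diff                      # sum_i (X_(i) - mu)(X_(i) - mu)'
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        raise ValueError("Sigma must have a positive determinant")
    trace_term = np.trace(np.linalg.solve(Sigma, scatter))   # tr(Sigma^{-1} scatter)
    return -0.5 * (n * p * np.log(2.0 * np.pi) + n * logdet + trace_term)

def mle(X):
    """Closed-form maximum likelihood estimates (x_bar, A / n)."""
    n = X.shape[0]
    x_bar = X.mean(axis=0)
    centered = X - x_bar
    return x_bar, centered.T @ centered / n

rng = np.random.default_rng(3)
X = rng.multivariate_normal([1.0, -1.0], [[2.0, 0.6], [0.6, 1.0]], size=50)
mu_hat, Sigma_hat = mle(X)
ll_at_mle = log_likelihood(X, mu_hat, Sigma_hat)

# Perturbing either estimate should not increase the log-likelihood.
ll_shifted_mean = log_likelihood(X, mu_hat + 0.1, Sigma_hat)
ll_unbiased_cov = log_likelihood(X, mu_hat, np.cov(X, rowvar=False))
```

The comparison with `np.cov` (which divides by $n-1$) illustrates that the unbiased estimator $S$ is not the maximizer of the likelihood.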
Properties of the maximum likelihood estimates:
Key theorem:
Let $\overline{X}$ and $A$ be the sample mean and the sample deviation matrix of a sample of size $n$ from the $p$-variate normal population $N_p(\mu,\Sigma)$. Then:
- $\overline{X}\sim N_p(\mu,\frac{1}{n}\Sigma)$
- $A=\sum_{i=1}^{n-1} Z_iZ_i'$, where $Z_1,\dots,Z_{n-1}$ are independent and identically distributed as $N_p(0,\Sigma)$
- $\overline{X}$ and $A$ are mutually independent
- $P\{A>0\}=1\leftrightarrow n>p$, where $A>0$ means that $A$ is positive definite
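Unbiasedness:
$\overline{X}$ is an unbiased estimator of $\mu$. By item 2 of the theorem, $E(A)$ reduces to a sum of the $D(Z_i)$: $E(A)=\sum_{i=1}^{n-1}E(Z_iZ_i')=\sum_{i=1}^{n-1}D(Z_i)=(n-1)\Sigma$. Hence $\frac{A}{n}$ is a biased estimator of $\Sigma$, while $S=\frac{A}{n-1}$ is unbiased.
Efficiency:
It can be shown that $\overline{X}$ and $S=\frac{A}{n-1}$ are minimum variance unbiased estimators of $\mu$ and $\Sigma$.
Consistency:
By the strong law of large numbers, as $n\rightarrow \infty$, $\overline{X}$ and $\frac{A}{n}$ are strongly consistent estimators of $\mu$ and $\Sigma$.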
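The expectation $E(A)=(n-1)\Sigma$ can be illustrated by simulation. The sketch below assumes NumPy; the sample size, number of replications, and parameters are hypothetical. It draws many samples of size $n$, computes $A$ for each, and averages the results.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 5, 40_000
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
mu = np.array([1.0, -1.0])

# reps independent samples of size n, shape (reps, n, p).
samples = rng.multivariate_normal(mu, Sigma, size=(reps, n))
x_bars = samples.mean(axis=1)                            # shape (reps, p)
centered = samples - x_bars[:, None, :]
A_all = np.einsum('rni,rnj->rij', centered, centered)    # one A per replication

mean_of_A = A_all.mean(axis=0)          # Monte Carlo estimate of E(A)
mean_of_x_bar = x_bars.mean(axis=0)     # Monte Carlo estimate of E(x_bar)
```

Up to Monte Carlo error, `mean_of_A` should be close to $(n-1)\Sigma$ rather than $n\Sigma$, which is the bias of $\frac{A}{n}$ described above.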
Maximum likelihood estimation of functions of the parameters:
Definition:
Let the parameter vector $\theta$ range over $\Theta\subseteq \mathbb{R}^k$, let $L(\theta)$ be the likelihood function, and let $\omega=g(\theta)$ be a Borel measurable mapping from $\Theta$ onto $\Theta^*$, where $\Theta^*\subseteq\mathbb{R}^k$. For any $\omega\in\Theta^*$, define
$$M(\omega)=\sup_{\theta:\,g(\theta)=\omega} L(\theta)$$
Then $M(\omega)$ is called the likelihood function induced by $g(\theta)$.
If $\hat{\omega}$ satisfies $M(\hat{\omega})=\sup_\omega M(\omega)$, then $\hat{\omega}$ is called a maximum likelihood estimate of $g(\theta)$.
This yields the following theorem: if $\hat\theta$ is a maximum likelihood estimate of $\theta$, then $\hat{\omega}=g(\hat\theta)$ is a maximum likelihood estimate of $g(\theta)$.
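As an illustration of this invariance theorem, take $g$ to be the map from a covariance matrix to its correlation matrix. Under the normal model the maximum likelihood estimate of the correlation matrix is then obtained by plugging $\hat\Sigma=\frac{A}{n}$ into $g$. The sketch below assumes NumPy; the function names and the data are hypothetical.

```python
import numpy as np

def mle_mean_cov(X):
    """Maximum likelihood estimates (x_bar, A / n) under a normal model."""
    n = X.shape[0]
    x_bar = X.mean(axis=0)
    centered = X - x_bar
    return x_bar, centered.T @ centered / n

def correlation_matrix(Sigma):
    """The parameter function g(Sigma): covariance matrix -> correlation matrix."""
    scale = np.sqrt(np.diag(Sigma))
    return Sigma / np.outer(scale, scale)

# A small illustrative data set: 5 observations of 2 variables.
X = np.array([[1.0, 2.0],
              [2.0, 4.5],
              [3.0, 5.5],
              [4.0, 8.0],
              [5.0, 9.0]])
_, Sigma_hat = mle_mean_cov(X)

# By the invariance theorem, g(Sigma_hat) is the MLE of the correlation matrix.
R_hat = correlation_matrix(Sigma_hat)
```

Because the factor $\frac{1}{n}$ cancels in the ratio, `R_hat` coincides with the sample correlation matrix $R$ computed from $S$.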