Problem
Let the MA(1) model be $y_t=\theta_0+\theta_1\epsilon_{t-1}+\epsilon_t$, where $\epsilon_t\mathop{\sim}\limits^{i.i.d.} N(0,\sigma_\epsilon^2)$.
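To make the model concrete, here is a minimal simulation sketch using only the Python standard library. The function name `simulate_ma1`, the `seed` argument, and the parameter values in the usage line are illustrative assumptions, not part of the original problem.

```python
import random

def simulate_ma1(T, theta0, theta1, sigma2, seed=None):
    """Draw y_1..y_T from y_t = theta0 + theta1 * eps_{t-1} + eps_t."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    eps_prev = rng.gauss(0.0, sd)  # eps_0, drawn from the same N(0, sigma2)
    y = []
    for _ in range(T):
        eps_t = rng.gauss(0.0, sd)
        y.append(theta0 + theta1 * eps_prev + eps_t)
        eps_prev = eps_t
    return y

# Hypothetical parameter values, chosen only for illustration.
sample = simulate_ma1(T=200, theta0=1.0, theta1=0.5, sigma2=2.0, seed=42)
print(len(sample), sum(sample) / len(sample))
```

With a long enough series, the sample mean should be close to $\theta_0$.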
Maximum likelihood estimation
Suppose the observed data set is $\{y_1,\ldots,y_T\}$. Let $\theta=(\theta_0,\theta_1,\sigma_\epsilon^2)^\prime$ and $y=(y_1,\ldots,y_T)^\prime$. Then we have

$$(y_1,\ldots,y_T)\sim N_T(\mu,\Sigma)$$
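In exact likelihood estimation, we assume that $(y_1,\ldots,y_T)$ follows a multivariate normal distribution. This holds here because each $y_t$ is a linear combination of the Gaussian errors.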
$$\mu=(\theta_0,\ldots,\theta_0)^\prime$$

$$Ey_t=\theta_0+\theta_1E\epsilon_{t-1}+E\epsilon_t=\theta_0$$
$$\Sigma=\left(\begin{matrix} Cov(y_1,y_1)&Cov(y_1,y_2)&Cov(y_1,y_3)&\ldots&Cov(y_1,y_T) \\ Cov(y_2,y_1)&Cov(y_2,y_2)&Cov(y_2,y_3)&\ldots&Cov(y_2,y_T) \\ Cov(y_3,y_1)&Cov(y_3,y_2)&Cov(y_3,y_3)&\ldots&Cov(y_3,y_T) \\ \vdots&\vdots&\vdots&\ddots&\vdots \\ Cov(y_T,y_1)&Cov(y_T,y_2)&Cov(y_T,y_3)&\ldots&Cov(y_T,y_T) \end{matrix}\right)=\left(\begin{matrix} (1+\theta_1^2)\sigma_\epsilon^2&\theta_1\sigma_\epsilon^2&0&\ldots&0 \\ \theta_1\sigma_\epsilon^2&(1+\theta_1^2)\sigma_\epsilon^2&\theta_1\sigma_\epsilon^2&\ldots&0 \\ 0&\theta_1\sigma_\epsilon^2&(1+\theta_1^2)\sigma_\epsilon^2&\ldots&0 \\ \vdots&\vdots&\vdots&\ddots&\vdots \\ 0&0&0&\ldots&(1+\theta_1^2)\sigma_\epsilon^2 \end{matrix}\right)$$

The entries of this tridiagonal matrix follow from:

$$Cov(y_t,y_t)=Var(y_t)=Var(\theta_0+\theta_1\epsilon_{t-1}+\epsilon_t)=\theta_1^2Var(\epsilon_{t-1})+Var(\epsilon_t)=(1+\theta_1^2)\sigma_\epsilon^2$$

$$Cov(y_t,y_{t-1})=Cov(\theta_0+\theta_1\epsilon_{t-1}+\epsilon_t,\theta_0+\theta_1\epsilon_{t-2}+\epsilon_{t-1})=Cov(\theta_1\epsilon_{t-1},\epsilon_{t-1})=\theta_1\sigma_\epsilon^2$$

$$Cov(y_t,y_{t-k})=0\quad(k>1)$$
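The tridiagonal structure of $\Sigma$ can be sketched in a few lines of plain Python. The helper name `ma1_covariance` and the parameter values are assumptions made for illustration; the diagonal and off-diagonal entries are the variance $(1+\theta_1^2)\sigma_\epsilon^2$ and lag-1 covariance $\theta_1\sigma_\epsilon^2$ of the MA(1) process.

```python
def ma1_covariance(theta1, sigma2, T):
    """Return the T x T covariance matrix of an MA(1) process as nested lists."""
    gamma0 = (1.0 + theta1 ** 2) * sigma2  # Var(y_t)
    gamma1 = theta1 * sigma2               # Cov(y_t, y_{t-1})
    cov = [[0.0] * T for _ in range(T)]
    for i in range(T):
        cov[i][i] = gamma0
        if i + 1 < T:
            # Only adjacent observations are correlated.
            cov[i][i + 1] = gamma1
            cov[i + 1][i] = gamma1
    return cov

# Hypothetical parameter values, chosen only for illustration.
sigma = ma1_covariance(theta1=0.5, sigma2=2.0, T=4)
for row in sigma:
    print(row)
```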
The likelihood function then follows:

$$\begin{align*} L(\theta)&=f_\theta(y_1,\ldots,y_T) \\ &=\dfrac{1}{\sqrt{(2\pi)^T|\Sigma|}}\exp\left(-\dfrac{1}{2}(y-\mu)^\prime\Sigma^{-1}(y-\mu)\right) \end{align*}$$
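One way to evaluate the logarithm of this exact likelihood without forming $\Sigma^{-1}$ explicitly is sketched below. It is not a method described in this post: it relies on an $LDL^\prime$ factorization of the symmetric tridiagonal $\Sigma$, which is a substituted technique that yields the same value as the formula. The function name `ma1_exact_loglik` and the sample inputs are assumptions.

```python
import math

def ma1_exact_loglik(y, theta0, theta1, sigma2):
    """Exact Gaussian log-likelihood of an MA(1) sample.

    Uses the LDL' factorization of the tridiagonal covariance matrix:
    log|Sigma| = sum(log d_t) and the quadratic form = sum(z_t^2 / d_t).
    """
    T = len(y)
    a = (1.0 + theta1 ** 2) * sigma2  # diagonal entry of Sigma
    b = theta1 * sigma2               # off-diagonal entry of Sigma
    loglik = -0.5 * T * math.log(2.0 * math.pi)
    d_prev = z_prev = None
    for t in range(T):
        e = y[t] - theta0             # centered observation y_t - mu_t
        if t == 0:
            d, z = a, e
        else:
            l = b / d_prev            # sub-diagonal entry of L
            d = a - l * b             # pivot d_t
            z = e - l * z_prev        # forward substitution for L z = y - mu
        loglik -= 0.5 * (math.log(d) + z * z / d)
        d_prev, z_prev = d, z
    return loglik

# Hypothetical data and parameter values, chosen only for illustration.
print(ma1_exact_loglik([1.0, -0.5], theta0=0.2, theta1=0.5, sigma2=2.0))
```

The cost is linear in $T$, whereas a dense inverse and determinant would be cubic.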
Conditional maximum likelihood estimation
Given $\epsilon_0$, the errors $\epsilon_1,\ldots,\epsilon_{T-1}$ can be computed iteratively from $\epsilon_t=y_t-\theta_0-\theta_1\epsilon_{t-1}$, so:
$$\begin{align*} L(\theta)&=f_\theta(y_1,\ldots,y_T|\epsilon_0) \\ &=f_\theta(y_1|\epsilon_0)\prod\limits_{t=2}^Tf_\theta(y_t|y_{t-1},\ldots,y_1,\epsilon_0) \\ &=f_\theta(y_1|\epsilon_0)\prod\limits_{t=2}^Tf_\theta(y_t|\epsilon_{t-1})=\prod\limits_{t=1}^Tf_\theta(y_t|\epsilon_{t-1}) \end{align*}$$

The third equality holds because, given $\epsilon_0$, the history $y_1,\ldots,y_{t-1}$ determines $\epsilon_{t-1}$, and $\epsilon_t$ is independent of that history. Here,

$$y_t|\epsilon_{t-1}\sim N(\theta_0+\theta_1\epsilon_{t-1},\sigma_\epsilon^2)$$
$$E(y_t|\epsilon_{t-1})=\theta_0+\theta_1\epsilon_{t-1}+E\epsilon_t=\theta_0+\theta_1\epsilon_{t-1}$$

$$Var(y_t|\epsilon_{t-1})=Var(\epsilon_t)=\sigma_\epsilon^2$$
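The iterative computation of the errors from a starting value $\epsilon_0$ can be sketched as follows. The function name `ma1_residuals` is hypothetical, and the default `eps0=0.0` (the unconditional mean of $\epsilon_0$) is an assumed convention; the text above only requires that $\epsilon_0$ be given.

```python
def ma1_residuals(y, theta0, theta1, eps0=0.0):
    """Return [eps_1, ..., eps_T] via eps_t = y_t - theta0 - theta1 * eps_{t-1}."""
    eps = []
    prev = eps0  # assumed starting value for eps_0
    for y_t in y:
        prev = y_t - theta0 - theta1 * prev
        eps.append(prev)
    return eps

# Hypothetical data and parameter values, chosen only for illustration.
print(ma1_residuals([1.0, 2.0, 0.5], theta0=0.5, theta1=0.5))
```

The likelihood only needs $\epsilon_0,\ldots,\epsilon_{T-1}$ as conditioning values; the sketch also returns $\epsilon_T$, which is the residual of the last observation.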
Therefore, the likelihood function can be written as:

$$L(\theta)=\prod\limits_{t=1}^T\dfrac{1}{\sqrt{2\pi\sigma_\epsilon^2}}\exp\left(-\dfrac{(y_t-\theta_0-\theta_1\epsilon_{t-1})^2}{2\sigma_\epsilon^2}\right)$$
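Taking logarithms of this product gives a sum that is straightforward to evaluate. The sketch below is a minimal illustration under assumptions: the name `ma1_conditional_loglik` is hypothetical, and `eps0=0.0` is an assumed choice for the given $\epsilon_0$.

```python
import math

def ma1_conditional_loglik(y, theta0, theta1, sigma2, eps0=0.0):
    """Conditional Gaussian log-likelihood of an MA(1) sample given eps_0."""
    T = len(y)
    sse = 0.0
    eps_prev = eps0  # assumed starting value for eps_0
    for y_t in y:
        # y_t - theta0 - theta1 * eps_{t-1} is both the term inside the
        # exponent and the next error eps_t in the recursion.
        eps_t = y_t - theta0 - theta1 * eps_prev
        sse += eps_t ** 2
        eps_prev = eps_t
    return -0.5 * T * math.log(2.0 * math.pi * sigma2) - sse / (2.0 * sigma2)

# Hypothetical data and parameter values, chosen only for illustration.
print(ma1_conditional_loglik([1.0, 2.0, 0.5], theta0=0.5, theta1=0.5, sigma2=2.0))
```

For fixed $\theta_0$ and $\theta_1$, this expression is maximized over $\sigma_\epsilon^2$ by the mean of the squared residuals, so a numerical search is only needed over $\theta_0$ and $\theta_1$.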