Parameter estimation for $MA(q)$: formula derivation
Let the $MA(q)$ model be $y_t=\theta_0+\theta_1\epsilon_{t-1}+\ldots+\theta_q\epsilon_{t-q}+\epsilon_t$, where $\epsilon_t\overset{i.i.d.}{\sim}N(0,\sigma_\epsilon^2)$.
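To make the model definition concrete, here is a minimal sketch that simulates a sample path from it. The function name `simulate_ma`, its arguments, and the choice to draw the pre-sample shocks $\epsilon_{1-q},\ldots,\epsilon_0$ from the same normal distribution are illustrative assumptions, not part of the derivation.

```python
import random

def simulate_ma(theta0, thetas, sigma, T, seed=0):
    """Draw y_1..y_T from y_t = theta0 + sum_i theta_i * eps_{t-i} + eps_t.

    thetas = [theta_1, ..., theta_q]; sigma is the standard deviation of eps_t.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)
    q = len(thetas)
    # eps[0..q-1] hold the pre-sample shocks eps_{1-q}, ..., eps_0;
    # eps[q..q+T-1] hold eps_1, ..., eps_T.
    eps = [rng.gauss(0.0, sigma) for _ in range(T + q)]
    y = []
    for t in range(q, T + q):
        # thetas[i] is theta_{i+1}, which multiplies the shock i+1 steps back.
        lagged = sum(thetas[i] * eps[t - 1 - i] for i in range(q))
        y.append(theta0 + lagged + eps[t])
    return y

y = simulate_ma(theta0=1.0, thetas=[0.6, -0.2], sigma=1.0, T=200)
```

A series generated this way can be used to sanity-check the estimators derived in the rest of the text.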
Maximum likelihood estimation
Suppose the observed data are $\{y_1,\ldots,y_T\}$. Let $\theta=(\theta_0,\ldots,\theta_q,\sigma_\epsilon^2)^\prime$ and $y=(y_1,\ldots,y_T)^\prime$. The likelihood function is:
$$L(\theta)=f_\theta(y_1,\ldots,y_T)$$
where $(y_1,\ldots,y_T)^\prime\sim N(\mu,\Sigma)$.
Exact likelihood estimation works with this joint distribution directly: $(y_1,\ldots,y_T)$ is multivariate normal because each $y_t$ is a linear combination of the i.i.d. normal shocks $\epsilon_t$. The mean vector is
$$\mu=(Ey_1,\ldots,Ey_T)^\prime=(\theta_0,\ldots,\theta_0)^\prime$$
The covariance matrix has entries $\Sigma_{st}=Cov(y_s,y_t)=\gamma_{|s-t|}$, which depend only on the lag $k=|s-t|$:
$$\gamma_0=\Big(1+\sum_{i=1}^{q}\theta_i^2\Big)\sigma_\epsilon^2,\qquad \gamma_k=\Big(\theta_k+\sum_{i=1}^{q-k}\theta_i\theta_{i+k}\Big)\sigma_\epsilon^2\quad(1\le k\le q),\qquad \gamma_k=0\quad(k>q)$$
In particular, $\gamma_1=\big(\theta_1+\sum_{i=1}^{q-1}\theta_i\theta_{i+1}\big)\sigma_\epsilon^2$, $\gamma_{q-1}=(\theta_{q-1}+\theta_1\theta_q)\sigma_\epsilon^2$, and $\gamma_q=\theta_q\sigma_\epsilon^2$. Assuming $T>q+2$, $\Sigma$ is therefore a symmetric banded Toeplitz matrix (columns shown: $1,2,\ldots,q+1,q+2,\ldots,T-1,T$):
$$\Sigma=\begin{pmatrix}
\gamma_0&\gamma_1&\ldots&\gamma_q&0&\ldots&0&0\\
\gamma_1&\gamma_0&\ldots&\gamma_{q-1}&\gamma_q&\ldots&0&0\\
\vdots&\vdots&&\vdots&\vdots&&\vdots&\vdots\\
\gamma_q&\gamma_{q-1}&\ldots&\gamma_0&\gamma_1&\ldots&\gamma_{T-q-2}&\gamma_{T-q-1}\\
0&\gamma_q&\ldots&\gamma_1&\gamma_0&\ldots&\gamma_{T-q-3}&\gamma_{T-q-2}\\
\vdots&\vdots&&\vdots&\vdots&&\vdots&\vdots\\
0&0&\ldots&\gamma_{T-q-2}&\gamma_{T-q-3}&\ldots&\gamma_0&\gamma_1\\
0&0&\ldots&\gamma_{T-q-1}&\gamma_{T-q-2}&\ldots&\gamma_1&\gamma_0
\end{pmatrix}$$
Entries such as $\gamma_{T-q-1}$ are zero whenever their lag exceeds $q$.
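The autocovariances and the banded structure of $\Sigma$ can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `ma_autocov` and `ma_cov_matrix` are hypothetical, and the list `psi` simply prepends the implicit weight 1 on $\epsilon_t$ to $(\theta_1,\ldots,\theta_q)$.

```python
def ma_autocov(thetas, sigma2, max_lag):
    """Return [gamma_0, ..., gamma_max_lag] for an MA(q) with thetas = [theta_1..theta_q]."""
    psi = [1.0] + list(thetas)  # weight 1 on eps_t, theta_i on eps_{t-i}
    q = len(thetas)
    gammas = []
    for k in range(max_lag + 1):
        if k > q:
            gammas.append(0.0)  # shocks no longer overlap beyond lag q
        else:
            gammas.append(sigma2 * sum(psi[i] * psi[i + k] for i in range(q - k + 1)))
    return gammas

def ma_cov_matrix(thetas, sigma2, T):
    """Build the T x T covariance matrix with entries gamma_{|s-t|}."""
    g = ma_autocov(thetas, sigma2, T - 1)
    return [[g[abs(s - t)] for t in range(T)] for s in range(T)]

# Example: MA(1) with theta_1 = 0.5 and sigma^2 = 2, three observations.
Sigma = ma_cov_matrix([0.5], 2.0, 3)
```

Note that the intercept $\theta_0$ does not enter the covariances; it only shifts the mean.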
The likelihood function then follows from the $T$-dimensional multivariate normal density:
$$L(\theta)=\dfrac{1}{(2\pi)^{T/2}|\Sigma|^{1/2}}\exp\left( -\dfrac{1}{2}(y-\mu)^\prime\Sigma^{-1}(y-\mu) \right)$$
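As an illustration, the exact log-likelihood $\ln L(\theta)=-\tfrac{1}{2}\big(T\ln 2\pi+\ln|\Sigma|+(y-\mu)^\prime\Sigma^{-1}(y-\mu)\big)$ can be evaluated with a Cholesky factorization, which avoids forming $\Sigma^{-1}$ explicitly. This is a sketch under stated assumptions: the function name `exact_loglik` is hypothetical, the code uses only the standard library (in practice a linear-algebra library would be the natural choice), and its $O(T^3)$ cost makes it suitable for short series only.

```python
import math

def exact_loglik(y, theta0, thetas, sigma2):
    """Exact Gaussian log-likelihood of an MA(q); thetas = [theta_1..theta_q]."""
    T, q = len(y), len(thetas)
    psi = [1.0] + list(thetas)
    # Autocovariances gamma_0..gamma_{T-1}; zero beyond lag q.
    gamma = [sigma2 * sum(psi[i] * psi[i + k] for i in range(q - k + 1)) if k <= q else 0.0
             for k in range(T)]
    Sigma = [[gamma[abs(s - t)] for t in range(T)] for s in range(T)]

    # Cholesky factorization Sigma = L L' (L lower triangular).
    L = [[0.0] * T for _ in range(T)]
    for i in range(T):
        for j in range(i + 1):
            s = Sigma[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]

    # Forward substitution: solve L z = y - mu, so that z'z is the quadratic form.
    z = []
    for i in range(T):
        r = y[i] - theta0 - sum(L[i][k] * z[k] for k in range(i))
        z.append(r / L[i][i])

    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(T))  # ln|Sigma|
    quad = sum(v * v for v in z)
    return -0.5 * (T * math.log(2 * math.pi) + logdet + quad)

ll = exact_loglik([1.0, -1.0], theta0=0.0, thetas=[0.5], sigma2=1.0)
```

Maximizing this function over $\theta$ with a numerical optimizer gives the exact maximum likelihood estimate.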
Conditional maximum likelihood estimation
Given the initial values $\{\epsilon_{1-q},\ldots,\epsilon_0\}$, we can compute $\{\epsilon_1,\ldots,\epsilon_{T-1}\}$ iteratively from the observations using the recursion $\epsilon_t=y_t-\theta_0-\sum\limits_{i=1}^q\theta_i\epsilon_{t-i}$.
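The iterative computation of the shocks can be sketched as follows. The function name `ma_residuals` is hypothetical, and defaulting the initial values $\epsilon_{1-q},\ldots,\epsilon_0$ to zero (their unconditional mean) is an assumption commonly made in practice, not something the derivation itself prescribes.

```python
def ma_residuals(y, theta0, thetas, eps_init=None):
    """Recover eps_1..eps_T via eps_t = y_t - theta0 - sum_i theta_i * eps_{t-i}.

    thetas = [theta_1..theta_q]; eps_init = [eps_{1-q}, ..., eps_0].
    """
    q = len(thetas)
    if eps_init is None:
        eps_init = [0.0] * q  # assumption: pre-sample shocks set to their mean
    eps = list(eps_init)      # list index j corresponds to time j - q + 1
    for t, y_t in enumerate(y):
        # eps[q + t - i] is the shock i steps before the current observation.
        lagged = sum(thetas[i - 1] * eps[q + t - i] for i in range(1, q + 1))
        eps.append(y_t - theta0 - lagged)
    return eps[q:]

# MA(1) with theta_0 = 1 and theta_1 = 0.5:
# eps_1 = 2 - 1 = 1, eps_2 = 1 - 1 - 0.5*1 = -0.5, eps_3 = 3 - 1 + 0.25 = 2.25
res = ma_residuals([2.0, 1.0, 3.0], theta0=1.0, thetas=[0.5])
```

The recursion also yields $\epsilon_T$, although only $\epsilon_1,\ldots,\epsilon_{T-1}$ are needed to form the conditional means of $y_1,\ldots,y_T$.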
The conditional likelihood factorizes by the chain rule:
$$\begin{aligned}L(\theta)&=f_\theta(y_1,\ldots,y_T\mid\epsilon_{1-q},\ldots,\epsilon_{0}) \\ &=f_\theta(y_1\mid\epsilon_{1-q},\ldots,\epsilon_{0})\prod\limits_{t=2}^Tf_\theta(y_t\mid y_{t-1},\ldots,y_1,\epsilon_{1-q},\ldots,\epsilon_{0}) \\ &=\prod\limits_{t=1}^Tf_\theta(y_t\mid\epsilon_{t-1},\ldots,\epsilon_{t-q}) \end{aligned}$$
The last equality holds because, given the initial values, $y_1,\ldots,y_{t-1}$ determine $\epsilon_1,\ldots,\epsilon_{t-1}$, and $y_t$ depends on the past only through $\epsilon_{t-1},\ldots,\epsilon_{t-q}$. Each factor is a normal density, since
$$y_t\mid\epsilon_{t-1},\ldots,\epsilon_{t-q}\sim N\Big(\theta_0+\sum\limits_{i=1}^q\theta_i\epsilon_{t-i},\ \sigma_\epsilon^2\Big)$$
Hence the likelihood function is:
$$L(\theta)=\prod\limits_{t=1}^T\dfrac{1}{\sqrt{2\pi\sigma_\epsilon^2}}\exp\left( -\dfrac{\big(y_t-\theta_0-\sum\limits_{i=1}^q\theta_i\epsilon_{t-i}\big)^2}{2\sigma_\epsilon^2} \right)$$
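Taking logarithms turns the product into a sum, $\ln L(\theta)=-\tfrac{T}{2}\ln(2\pi\sigma_\epsilon^2)-\tfrac{1}{2\sigma_\epsilon^2}\sum_{t=1}^T\epsilon_t^2$, where $\epsilon_t=y_t-\theta_0-\sum_{i=1}^q\theta_i\epsilon_{t-i}$. The sketch below evaluates this expression. The function name `conditional_loglik` is hypothetical, and setting the initial shocks to zero by default is an assumption.

```python
import math

def conditional_loglik(y, theta0, thetas, sigma2, eps_init=None):
    """Conditional Gaussian log-likelihood of an MA(q); thetas = [theta_1..theta_q]."""
    q = len(thetas)
    # Assumption: pre-sample shocks eps_{1-q}..eps_0 default to zero.
    eps = list(eps_init) if eps_init is not None else [0.0] * q
    T = len(y)
    sse = 0.0
    for t in range(T):
        # Conditional mean of the current observation given the q previous shocks.
        mean_t = theta0 + sum(thetas[i - 1] * eps[q + t - i] for i in range(1, q + 1))
        e_t = y[t] - mean_t  # the implied shock for this observation
        eps.append(e_t)
        sse += e_t * e_t
    return -0.5 * T * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)

ll = conditional_loglik([2.0, 1.0, 3.0], theta0=1.0, thetas=[0.5], sigma2=2.0)
```

For fixed $\theta_0,\ldots,\theta_q$, this expression is maximized in $\sigma_\epsilon^2$ at $\hat\sigma_\epsilon^2=\tfrac{1}{T}\sum_{t=1}^T\epsilon_t^2$, so in practice only the remaining parameters need a numerical search, which amounts to minimizing the sum of squared shocks.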