Assumptions
- Linearity
$\{Y_t, X_t'\}$ is an observable random sample, and
$$Y_t = X_t'\beta^0 + \varepsilon_t, \quad t = 1, \ldots, n$$
- Strict exogeneity
$$E(\varepsilon_t \mid X) = 0$$
Here the matrix $X$ stacks all the regressor vectors $X_1, X_2, \ldots, X_n$.
- Nonsingularity
(1) The $K \times K$ square matrix $X'X = \sum_{t=1}^{n} X_t X_t'$ is nonsingular.
(2) As $n \rightarrow \infty$, the smallest eigenvalue of $X'X$ diverges to infinity with probability 1.
- Spherical error variance
(1) Conditional homoskedasticity
$$E(\varepsilon_t^2 \mid X) = \sigma^2 > 0, \quad t = 1, \ldots, n$$
(2) Conditional uncorrelatedness
$$E(\varepsilon_t \varepsilon_s \mid X) = 0, \quad t \neq s$$
Note that under Assumption 2, i.e. $E(\varepsilon_t \mid X) = 0$, we have
$$var(\varepsilon_t \mid X) = E(\varepsilon_t^2 \mid X) - [E(\varepsilon_t \mid X)]^2 = \sigma^2$$
so the conditional variance of the error is the same constant for every $t$, which is exactly conditional homoskedasticity.
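Part (2) of the nonsingularity assumption can be illustrated numerically. The sketch below is only an illustration under assumed conditions: the design (an intercept plus two independent standard normal regressors) and the helper names `simulate_regressors` and `min_eigenvalue_of_gram` are hypothetical and not part of the original text. For such a design the smallest eigenvalue of $X'X$ should grow roughly in proportion to $n$.

```python
import numpy as np


def min_eigenvalue_of_gram(X):
    """Smallest eigenvalue of the K x K matrix X'X.

    X'X is symmetric, so eigvalsh applies; it returns eigenvalues in
    ascending order, hence index 0 is the smallest one.
    """
    return float(np.linalg.eigvalsh(X.T @ X)[0])


def simulate_regressors(n, rng):
    """Hypothetical design: intercept plus two iid standard normal regressors."""
    return np.column_stack([np.ones(n), rng.standard_normal((n, 2))])


rng = np.random.default_rng(0)
sample_sizes = [100, 1_000, 10_000]
min_eigs = [min_eigenvalue_of_gram(simulate_regressors(n, rng)) for n in sample_sizes]

for n, lam in zip(sample_sizes, min_eigs):
    print(f"n = {n:6d}   lambda_min(X'X) = {lam:10.1f}")
```

A design with perfectly collinear columns would instead give a smallest eigenvalue of (numerically) zero, violating part (1) of the assumption.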
Estimation
The sum of squared residuals (SSR) is
$$SSR(\beta) = (Y - X\beta)'(Y - X\beta) = \sum_{t=1}^{n} (Y_t - X_t'\beta)^2$$
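The OLS estimator is the value of $\beta$ that minimizes the SSR, so we differentiate the SSR with respect to $\beta$ and set the derivative to zero: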
$$
\begin{aligned}
\frac{d\,SSR}{d\beta} &= \sum_{t=1}^{n} 2(Y_t - X_t'\beta)\frac{d(Y_t - X_t'\beta)}{d\beta} \\
&= -2\sum_{t=1}^{n} (Y_t - X_t'\beta)X_t = -2X'(Y - X\beta) = 0
\end{aligned}
$$
Solving for $\beta$, and using the nonsingularity of $X'X$, gives
$$\hat\beta = (X'X)^{-1}X'Y$$
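The closed-form solution can be checked on simulated data. The following is a minimal sketch, assuming a hypothetical data-generating process (the true parameter `beta0`, the sample size, and the function names `ols` and `ssr` are illustrative choices, not from the original text). It solves the normal equations $X'X\beta = X'Y$ directly, verifies that the first-order condition $-2X'(Y - X\hat\beta) = 0$ holds, and compares the result with NumPy's least-squares routine.

```python
import numpy as np


def ols(X, Y):
    """OLS estimate obtained by solving the normal equations X'X b = X'Y."""
    return np.linalg.solve(X.T @ X, X.T @ Y)


def ssr(beta, X, Y):
    """Sum of squared residuals (Y - X beta)'(Y - X beta)."""
    resid = Y - X @ beta
    return float(resid @ resid)


rng = np.random.default_rng(42)
n = 200
beta0 = np.array([1.0, 2.0, -0.5])  # hypothetical true parameter
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
Y = X @ beta0 + rng.standard_normal(n)

beta_hat = ols(X, Y)

# First-order condition: the gradient of SSR evaluated at beta_hat.
gradient = -2 * X.T @ (Y - X @ beta_hat)

# Cross-check against NumPy's least-squares solver.
beta_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]

print("beta_hat        :", beta_hat)
print("gradient at hat :", gradient)
print("matches lstsq   :", np.allclose(beta_hat, beta_lstsq))
```

Solving the linear system with `np.linalg.solve` avoids forming $(X'X)^{-1}$ explicitly, which is usually the numerically safer choice even though the formula is written with an inverse.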
Properties of the estimator
- Unbiasedness
$$
\begin{aligned}
\hat\beta - \beta^0 &= (X'X)^{-1}X'Y - \beta^0 \\
&= (X'X)^{-1}X'(X\beta^0 + \varepsilon) - \beta^0 = (X'X)^{-1}X'\varepsilon
\end{aligned}
$$
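Taking the expectation conditional on $X$ and applying Assumption 2 gives
$$E(\hat\beta - \beta^0 \mid X) = (X'X)^{-1}X'E(\varepsilon \mid X) = 0$$
and, by the law of iterated expectations, $E(\hat\beta) = \beta^0$.
- Variance of the estimator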
Given $\hat\beta - \beta^0 = (X'X)^{-1}X'\varepsilon$ and $E(\varepsilon\varepsilon' \mid X) = \sigma^2 I$, we have
$$
\begin{aligned}
var(\hat\beta \mid X) &= E\{[\hat\beta - E(\hat\beta \mid X)][\hat\beta - E(\hat\beta \mid X)]' \mid X\} \\
&= E[(\hat\beta - \beta^0)(\hat\beta - \beta^0)' \mid X] \\
&= (X'X)^{-1}X'E(\varepsilon\varepsilon' \mid X)X(X'X)^{-1} \\
&= \sigma^2(X'X)^{-1}
\end{aligned}
$$
- Residual variance estimator
Define $P = X(X'X)^{-1}X'$ and $M = I_n - P$.
Then
$$
\begin{aligned}
e = Y - X\hat\beta &= Y - X(X'X)^{-1}X'Y \\
&= [I_n - X(X'X)^{-1}X']Y = MY \\
&= M(X\beta^0 + \varepsilon) = M\varepsilon
\end{aligned}
$$
where the last equality uses $MX = X - X(X'X)^{-1}X'X = 0$.
Since $M$ is symmetric and idempotent, it follows that $e'e = (M\varepsilon)'M\varepsilon = \varepsilon'M'M\varepsilon = \varepsilon'M\varepsilon$.
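The algebra above relies on a few properties of the residual-maker matrix, with $P = X(X'X)^{-1}X'$ and $M = I_n - P$: $M$ is symmetric and idempotent, $MX = 0$, and its trace equals $n - K$. The sketch below checks these numerically on a small simulated example; the sample size, the true parameter `beta0`, and the design are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 50, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, K - 1))])
beta0 = np.array([0.5, -1.0, 2.0])  # hypothetical true parameter
eps = rng.standard_normal(n)
Y = X @ beta0 + eps

# Projection matrix P = X (X'X)^{-1} X' and residual maker M = I_n - P.
P = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - P

# OLS residuals computed directly from the fitted values.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ beta_hat

checks = {
    "M symmetric": np.allclose(M, M.T),
    "M idempotent": np.allclose(M @ M, M),
    "MX = 0": np.allclose(M @ X, 0.0),
    "e = M eps": np.allclose(e, M @ eps),
    "e'e = eps' M eps": bool(np.isclose(e @ e, eps @ M @ eps)),
    "tr(M) = n - K": bool(np.isclose(np.trace(M), n - K)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

In practice the $n \times n$ matrices $P$ and $M$ are rarely formed explicitly for large $n$; they are built here only because the sample is small and the point is to check the identities.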
Because $e'e$ is a scalar, it equals its own trace, and $tr(M) = tr(I_n) - tr(P) = n - K$, so
$$
\begin{aligned}
E(e'e \mid X) &= E(\varepsilon'M\varepsilon \mid X) = E[tr(\varepsilon\varepsilon'M) \mid X] = tr[E(\varepsilon\varepsilon' \mid X)M] \\
&= \sigma^2 tr(M) = \sigma^2(n - K)
\end{aligned}
$$
The residual variance estimator is therefore $e'e/(n-K)$, which is unbiased for $\sigma^2$.
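The three properties in this section (unbiasedness of $\hat\beta$, $var(\hat\beta \mid X) = \sigma^2(X'X)^{-1}$, and unbiasedness of $e'e/(n-K)$) can be checked with a small Monte Carlo experiment. This is a rough sketch under assumed conditions, not a proof: the design matrix is drawn once and held fixed to mimic conditioning on $X$, the errors are taken to be iid normal (one convenient case satisfying the assumptions), and all numerical values such as `sigma`, `beta0`, and `reps` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2024)
n, K, sigma = 40, 3, 1.5
beta0 = np.array([1.0, 0.5, -2.0])  # hypothetical true parameter

# Draw X once and keep it fixed across replications (conditioning on X).
X = np.column_stack([np.ones(n), rng.standard_normal((n, K - 1))])
XtX_inv = np.linalg.inv(X.T @ X)

reps = 20_000
# Each row of eps is one independent draw of the n-vector of errors.
eps = sigma * rng.standard_normal((reps, n))
Y = X @ beta0 + eps                      # shape (reps, n), one sample per row

# Row r holds beta_hat_r' = Y_r' X (X'X)^{-1}, using symmetry of (X'X)^{-1}.
beta_hats = Y @ X @ XtX_inv              # shape (reps, K)
resid = Y - beta_hats @ X.T              # OLS residuals, one sample per row
s2 = (resid ** 2).sum(axis=1) / (n - K)  # e'e / (n - K) for each sample

mean_beta_hat = beta_hats.mean(axis=0)
cov_beta_hat = np.cov(beta_hats, rowvar=False)
theoretical_cov = sigma ** 2 * XtX_inv

print("mean of beta_hat  :", mean_beta_hat)
print("true beta0        :", beta0)
print("max |cov - theory|:", np.abs(cov_beta_hat - theoretical_cov).max())
print("mean of s2        :", s2.mean(), " vs sigma^2 =", sigma ** 2)
```

The simulated averages should be close to their theoretical counterparts, with the gap shrinking as `reps` grows.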