Model
A random variable $y$ is related to ordinary (non-random) variables $x_1, x_2, \dots, x_p$ by

$$y = b_0 + x_1 b_1 + x_2 b_2 + \dots + x_p b_p + \epsilon.$$

Given $n$ observations, the regression model can be written as

$$y_i = b_0 + x_{i1} b_1 + x_{i2} b_2 + \dots + x_{ip} b_p + \epsilon_i, \quad i = 1, 2, \dots, n,$$

or, in matrix form,

$$\boldsymbol{y} = \boldsymbol{Xb} + \boldsymbol\epsilon \tag{1}$$

where $\boldsymbol{y} = [y_1, y_2, \dots, y_n]^\mathrm{T}$, $\boldsymbol{b} = [b_0, b_1, b_2, \dots, b_p]^\mathrm{T}$, $\boldsymbol\epsilon = [\epsilon_1, \epsilon_2, \dots, \epsilon_n]^\mathrm{T}$, and $\boldsymbol{X} = [\boldsymbol{x_0}, \boldsymbol{x_1}, \boldsymbol{x_2}, \dots, \boldsymbol{x_p}]$. Here $\boldsymbol{x_0} = \boldsymbol{1} = [1, 1, \dots, 1]^\mathrm{T}$ is the $n$-vector of ones and $\boldsymbol{x_j} = [x_{1j}, x_{2j}, \dots, x_{nj}]^\mathrm{T}$ for $j = 1, 2, \dots, p$.
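To make the matrix form (1) concrete, here is a minimal NumPy sketch that builds a design matrix with a leading column of ones and simulates one response vector. The sample size, coefficients, and noise level are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the sketch is reproducible

n, p = 50, 3                              # hypothetical sample size and predictor count
b = np.array([1.0, 2.0, -1.0, 0.5])       # hypothetical true [b_0, b_1, ..., b_p]
sigma = 0.3                               # hypothetical error standard deviation

predictors = rng.normal(size=(n, p))              # columns x_1, ..., x_p
X = np.column_stack([np.ones(n), predictors])     # prepend x_0 = 1 for the intercept
eps = rng.normal(scale=sigma, size=n)             # errors with variance sigma^2
y = X @ b + eps                                   # equation (1): y = Xb + eps

print(X.shape, y.shape)
```

The only design choice here is to store the intercept as an explicit column of ones, which matches the definition of $\boldsymbol{x_0}$ in the text.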
Basic assumptions
1. $\boldsymbol\epsilon \sim N(\boldsymbol{0}, \sigma^2 \boldsymbol{I}_n)$, that is, $E(\boldsymbol\epsilon) = \boldsymbol{0}$ and $\operatorname{var}(\boldsymbol\epsilon) = \sigma^2 \boldsymbol{I}_n$.
2. $\boldsymbol{x_0}, \boldsymbol{x_1}, \boldsymbol{x_2}, \dots, \boldsymbol{x_p}$ are linearly independent, that is, $\operatorname{rank}(\boldsymbol{X}) = p + 1$.
From assumption 1,

$$E(\boldsymbol{y}) = \boldsymbol{Xb} + E(\boldsymbol\epsilon) = \boldsymbol{Xb}, \qquad \operatorname{var}(\boldsymbol{y}) = E\left[(\boldsymbol{y} - E(\boldsymbol{y}))(\boldsymbol{y} - E(\boldsymbol{y}))^\mathrm{T}\right] = E(\boldsymbol\epsilon\boldsymbol\epsilon^\mathrm{T}) = \sigma^2 \boldsymbol{I}_n.$$
From assumption 2, $\boldsymbol{X}^\mathrm{T}\boldsymbol{X}$ is invertible, and the idempotent matrix $\boldsymbol{H}$ satisfies

$$\boldsymbol{H} = \boldsymbol{X}(\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}, \qquad \operatorname{rank}(\boldsymbol{H}) = p + 1.$$
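The stated properties of $\boldsymbol{H}$ can be checked numerically. The sketch below uses a randomly generated design matrix (an assumption made purely for illustration) and confirms that $\boldsymbol{H}$ is symmetric and idempotent with trace and rank both equal to $p + 1$.

```python
import numpy as np

rng = np.random.default_rng(1)            # fixed seed for reproducibility
n, p = 20, 3                              # hypothetical sizes
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])

# Assumption 2: X has full column rank p + 1, so X^T X is invertible.
rank_X = np.linalg.matrix_rank(X)

# H = X (X^T X)^{-1} X^T; solve() avoids forming the inverse explicitly.
H = X @ np.linalg.solve(X.T @ X, X.T)

symmetric = np.allclose(H, H.T)           # H^T = H
idempotent = np.allclose(H @ H, H)        # H^2 = H
trace_H = np.trace(H)
# Explicit tolerance so rounding noise in the zero eigenvalues is ignored.
rank_H = np.linalg.matrix_rank(H, tol=1e-8)

print(symmetric, idempotent, round(trace_H, 6), rank_H)
```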
Parameter estimation
The least-squares estimate of $\boldsymbol{b}$ is

$$\boldsymbol{\hat{b}} = (\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}\boldsymbol{y} \tag{2}$$

Since $\boldsymbol{y} = \boldsymbol{Xb} + \boldsymbol\epsilon$,

$$\boldsymbol{\hat{b}} = (\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}\boldsymbol{y} = (\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}(\boldsymbol{Xb} + \boldsymbol\epsilon) = \boldsymbol{b} + (\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}\boldsymbol\epsilon,$$

that is,

$$\boldsymbol{\hat{b}} = \boldsymbol{b} + (\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}\boldsymbol\epsilon \tag{3}$$
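The estimate (2) is stated without derivation. As a brief sketch of the standard argument (not taken from the original text; the symbol $Q$ is introduced here only for illustration), least squares minimizes the residual sum of squares

$$Q(\boldsymbol{b}) = (\boldsymbol{y} - \boldsymbol{Xb})^\mathrm{T}(\boldsymbol{y} - \boldsymbol{Xb}) = \boldsymbol{y}^\mathrm{T}\boldsymbol{y} - 2\boldsymbol{b}^\mathrm{T}\boldsymbol{X}^\mathrm{T}\boldsymbol{y} + \boldsymbol{b}^\mathrm{T}\boldsymbol{X}^\mathrm{T}\boldsymbol{X}\boldsymbol{b}.$$

Setting the gradient to zero gives the normal equations

$$\frac{\partial Q}{\partial \boldsymbol{b}} = -2\boldsymbol{X}^\mathrm{T}\boldsymbol{y} + 2\boldsymbol{X}^\mathrm{T}\boldsymbol{X}\boldsymbol{b} = \boldsymbol{0} \quad\Longrightarrow\quad \boldsymbol{X}^\mathrm{T}\boldsymbol{X}\boldsymbol{\hat{b}} = \boldsymbol{X}^\mathrm{T}\boldsymbol{y}.$$

Under assumption 2, $\boldsymbol{X}^\mathrm{T}\boldsymbol{X}$ is invertible and positive definite, so this stationary point is the unique minimizer and solving for $\boldsymbol{\hat{b}}$ yields (2).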
It follows from (3) that

$$E(\boldsymbol{\hat{b}}) = \boldsymbol{b} \tag{4}$$
$$\operatorname{var}(\boldsymbol{\hat{b}}) = E\left[(\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}\boldsymbol\epsilon\boldsymbol\epsilon^\mathrm{T}\boldsymbol{X}(\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\right] = \sigma^2(\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1} \tag{5}$$
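Equations (2) through (5) can be illustrated by simulation. The following sketch, with hypothetical coefficients and noise level, computes $\boldsymbol{\hat{b}}$ from the normal equations for one sample, checks identity (3), and then repeats the experiment with $\boldsymbol{X}$ held fixed to compare the empirical mean and covariance of $\boldsymbol{\hat{b}}$ with (4) and (5).

```python
import numpy as np

rng = np.random.default_rng(2)            # fixed seed for reproducibility
n, p, sigma = 200, 2, 0.5                 # hypothetical sizes and noise level
b = np.array([1.0, 2.0, -3.0])            # hypothetical true [b_0, b_1, b_2]
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
XtX = X.T @ X

# One sample: equation (2) via the normal equations, and identity (3).
eps = rng.normal(scale=sigma, size=n)
y = X @ b + eps
b_hat = np.linalg.solve(XtX, X.T @ y)
err = np.linalg.solve(XtX, X.T @ eps)     # (X^T X)^{-1} X^T eps

# Many samples with X held fixed: empirical mean and covariance of b_hat.
reps = 2000
E = rng.normal(scale=sigma, size=(reps, n))     # one error vector per row
Y = X @ b + E                                   # each row is one simulated y
B_hat = np.linalg.solve(XtX, X.T @ Y.T).T       # one estimate per row
mean_b_hat = B_hat.mean(axis=0)                 # should be close to b, eq. (4)
cov_empirical = np.cov(B_hat, rowvar=False)
cov_theory = sigma**2 * np.linalg.inv(XtX)      # equation (5)

print(np.round(mean_b_hat, 3))
print(np.abs(cov_empirical - cov_theory).max())
```

Solving the normal equations with `np.linalg.solve` rather than forming the inverse is a numerical convenience; the explicit inverse is only computed where (5) needs it.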
Turning to the fitted values,

$$\boldsymbol{\hat{y}} = \boldsymbol{X}\boldsymbol{\hat{b}} = \boldsymbol{X}(\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}\boldsymbol{y} = \boldsymbol{H}\boldsymbol{y} \tag{6}$$
$$\boldsymbol{\hat{y}} = \boldsymbol{X}\boldsymbol{\hat{b}} = \boldsymbol{X}\boldsymbol{b} + \boldsymbol{H}\boldsymbol\epsilon \tag{7}$$

$$E(\boldsymbol{\hat{y}}) = \boldsymbol{X}\boldsymbol{b} = E(\boldsymbol{y}) \tag{8}$$

$$\operatorname{var}(\boldsymbol{\hat{y}}) = E(\boldsymbol{H}\boldsymbol\epsilon\boldsymbol\epsilon^\mathrm{T}\boldsymbol{H}) = \sigma^2\boldsymbol{H} \tag{9}$$
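As a quick numerical check of (6), (7), and (9), the sketch below computes the fitted values in three equivalent ways on simulated data. The data-generating values are hypothetical, and the name `leverage` for the diagonal of $\boldsymbol{H}$ is a common convention rather than terminology from the text.

```python
import numpy as np

rng = np.random.default_rng(3)            # fixed seed for reproducibility
n, p, sigma = 30, 2, 1.0                  # hypothetical sizes and noise level
b = np.array([0.5, 1.0, -2.0])            # hypothetical true coefficients
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
eps = rng.normal(scale=sigma, size=n)
y = X @ b + eps

H = X @ np.linalg.solve(X.T @ X, X.T)     # hat matrix
b_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ b_hat                         # definition of the fitted values
y_hat_via_H = H @ y                       # equation (6)
y_hat_via_eps = X @ b + H @ eps           # equation (7)

# Equation (9): var(y_hat) = sigma^2 H, so var(y_hat_i) = sigma^2 * h_ii.
leverage = np.diag(H)
var_y_hat = sigma**2 * leverage

print(np.abs(y_hat - y_hat_via_H).max())
```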
Properties
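1. Mean: $\overline{y} = \frac{1}{n}\sum\limits_{i=1}^n y_i = \frac{1}{n}\boldsymbol{1}^\mathrm{T}\boldsymbol{y} = \frac{1}{n}\sum\limits_{i=1}^n \hat{y}_i = \frac{1}{n}\boldsymbol{1}^\mathrm{T}\boldsymbol{\hat{y}}$. The fitted values have the same mean as the observations because $\boldsymbol{1}$ is a column of $\boldsymbol{X}$, so $\boldsymbol{H}\boldsymbol{1} = \boldsymbol{1}$ and $\boldsymbol{1}^\mathrm{T}\boldsymbol{\hat{y}} = \boldsymbol{1}^\mathrm{T}\boldsymbol{H}\boldsymbol{y} = \boldsymbol{1}^\mathrm{T}\boldsymbol{y}$.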
2. Total sum of squares, where $\boldsymbol{\overline{y}} = \overline{y}\boldsymbol{1}$:

$$SST = \sum\limits_{i=1}^n (y_i - \overline{y})^2 = (\boldsymbol{y} - \boldsymbol{\overline{y}})^\mathrm{T}(\boldsymbol{y} - \boldsymbol{\overline{y}}) = \boldsymbol{y}^\mathrm{T}\boldsymbol{y} - \boldsymbol{\overline{y}}^\mathrm{T}\boldsymbol{\overline{y}}$$

Regression sum of squares:

$$SSR = \sum\limits_{i=1}^n (\hat{y}_i - \overline{y})^2 = (\boldsymbol{\hat{y}} - \boldsymbol{\overline{y}})^\mathrm{T}(\boldsymbol{\hat{y}} - \boldsymbol{\overline{y}}) = \boldsymbol{\hat{y}}^\mathrm{T}\boldsymbol{\hat{y}} - \boldsymbol{\overline{y}}^\mathrm{T}\boldsymbol{\overline{y}} = \boldsymbol{y}^\mathrm{T}\boldsymbol{H}\boldsymbol{y} - \boldsymbol{\overline{y}}^\mathrm{T}\boldsymbol{\overline{y}}$$

Residual sum of squares:

$$SSE = \sum\limits_{i=1}^n (y_i - \hat{y}_i)^2 = (\boldsymbol{y} - \boldsymbol{\hat{y}})^\mathrm{T}(\boldsymbol{y} - \boldsymbol{\hat{y}}) = \boldsymbol{y}^\mathrm{T}(\boldsymbol{I} - \boldsymbol{H})\boldsymbol{y} = \boldsymbol{y}^\mathrm{T}\boldsymbol{y} - \boldsymbol{y}^\mathrm{T}\boldsymbol{H}\boldsymbol{y}$$
Therefore $SSE + SSR = SST$.

3. $SSE/(n-p-1)$ is an unbiased estimator of the variance $\sigma^2$.

Proof: from (1) and (7),

$$\boldsymbol{y} - \boldsymbol{\hat{y}} = \boldsymbol{X}\boldsymbol{b} + \boldsymbol\epsilon - (\boldsymbol{X}\boldsymbol{b} + \boldsymbol{H}\boldsymbol\epsilon) = (\boldsymbol{I} - \boldsymbol{H})\boldsymbol\epsilon \tag{10}$$
so that

$$SSE = \boldsymbol\epsilon^\mathrm{T}(\boldsymbol{I} - \boldsymbol{H})(\boldsymbol{I} - \boldsymbol{H})\boldsymbol\epsilon = \boldsymbol\epsilon^\mathrm{T}\boldsymbol\epsilon - \boldsymbol\epsilon^\mathrm{T}\boldsymbol{H}\boldsymbol\epsilon = \sum\limits_{i=1}^n \epsilon_i^2 - \sum\limits_{i=1}^n\sum\limits_{j=1}^n h_{ij}\epsilon_i\epsilon_j \tag{11}$$

where $h_{ij}$ are the entries of $\boldsymbol{H}$. By assumption 1, $E(\epsilon_i^2) = \sigma^2$ and $E(\epsilon_i\epsilon_j) = 0$ for $i \ne j$, so

$$E(SSE) = E\left(\sum\limits_{i=1}^n \epsilon_i^2\right) - E\left(\sum\limits_{i=1}^n\sum\limits_{j=1}^n h_{ij}\epsilon_i\epsilon_j\right) = n\sigma^2 - \sum\limits_{i=1}^n h_{ii}E(\epsilon_i^2) = (n - \operatorname{trace}(\boldsymbol{H}))\sigma^2.$$

For an idempotent matrix the rank equals the trace, so $\operatorname{trace}(\boldsymbol{H}) = \operatorname{rank}(\boldsymbol{H}) = p + 1$ and

$$E(SSE) = (n - p - 1)\sigma^2,$$

which proves the property.

4. $SSE/\sigma^2 \sim \chi^2(n-p-1)$.
Proof: by (11), $SSE = \boldsymbol\epsilon^\mathrm{T}\boldsymbol\epsilon - \boldsymbol\epsilon^\mathrm{T}\boldsymbol{H}\boldsymbol\epsilon$. Because $\boldsymbol{H}$ is symmetric, it has $n$ real eigenvalues and $n$ linearly independent eigenvectors. Because $\boldsymbol{H}$ is idempotent, its eigenvalues can only be 1 or 0, and the eigenvalue 1 has multiplicity $\operatorname{rank}(\boldsymbol{H}) = p + 1$. Moreover, $\boldsymbol{H} = \boldsymbol{X}(\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}$ gives $\boldsymbol{H}\boldsymbol{X} = \boldsymbol{X}$, so the columns $\boldsymbol{x_0}, \boldsymbol{x_1}, \boldsymbol{x_2}, \dots, \boldsymbol{x_p}$ of $\boldsymbol{X}$ are $p + 1$ linearly independent eigenvectors of $\boldsymbol{H}$ for the eigenvalue 1. Applying Gram-Schmidt orthogonalization to them, starting from $\boldsymbol{x_0}$, yields $p + 1$ orthonormal eigenvectors $\boldsymbol{a_1}, \boldsymbol{a_2}, \dots, \boldsymbol{a_{p+1}}$ for the eigenvalue 1 (note that the indices are shifted by one), with

$$\boldsymbol{a_1} = \frac{\boldsymbol{x_0}}{\|\boldsymbol{x_0}\|} = \left[\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}, \dots, \frac{1}{\sqrt{n}}\right]^\mathrm{T}.$$

For the eigenvalue 0 we can find $n - p - 1$ orthonormal eigenvectors $\boldsymbol{a_{p+2}}, \boldsymbol{a_{p+3}}, \dots, \boldsymbol{a_n}$. Let $\boldsymbol{A} = [\boldsymbol{a_1}, \boldsymbol{a_2}, \dots, \boldsymbol{a_n}]$. Then $\boldsymbol{A}$ is orthogonal and $\boldsymbol{H}$ is diagonalized as

$$\boldsymbol{A}^\mathrm{T}\boldsymbol{H}\boldsymbol{A} = \Lambda = \begin{bmatrix}1\\&\ddots\\&&1\\&&&0\\&&&&\ddots\\&&&&&0\end{bmatrix}$$

where $\Lambda$ has $p + 1$ ones and $n - p - 1$ zeros on its diagonal.

Now let

$$\boldsymbol\eta = [\eta_1, \eta_2, \dots, \eta_n]^\mathrm{T} = \boldsymbol{A}^\mathrm{T}\boldsymbol\epsilon \tag{12}$$

Since $\boldsymbol{A}$ is orthogonal, $\boldsymbol\eta \sim N(\boldsymbol{0}, \sigma^2\boldsymbol{A}^\mathrm{T}\boldsymbol{A}) = N(\boldsymbol{0}, \sigma^2\boldsymbol{I}_n)$, so $\eta_1, \eta_2, \dots, \eta_n$ are i.i.d. $N(0, \sigma^2)$. Using $\boldsymbol\epsilon = \boldsymbol{A}\boldsymbol\eta$,

$$SSE = \boldsymbol\epsilon^\mathrm{T}\boldsymbol\epsilon - \boldsymbol\epsilon^\mathrm{T}\boldsymbol{H}\boldsymbol\epsilon = \boldsymbol\eta^\mathrm{T}\boldsymbol\eta - \boldsymbol\eta^\mathrm{T}\boldsymbol{A}^\mathrm{T}\boldsymbol{H}\boldsymbol{A}\boldsymbol\eta = \boldsymbol\eta^\mathrm{T}\boldsymbol\eta - \boldsymbol\eta^\mathrm{T}\Lambda\boldsymbol\eta = \sum\limits_{i=1}^n \eta_i^2 - \sum\limits_{i=1}^{p+1} \eta_i^2 = \sum\limits_{i=p+2}^n \eta_i^2.$$

Hence

$$SSE/\sigma^2 = \sum\limits_{i=p+2}^n (\eta_i/\sigma)^2 \sim \chi^2(n-p-1).$$
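The four properties above lend themselves to a numerical sanity check. The sketch below, using hypothetical simulated data, verifies the mean identity and the decomposition $SST = SSR + SSE$ on one sample, then uses a Monte Carlo experiment to compare the mean and variance of $SSE/\sigma^2$ with those of a $\chi^2(n-p-1)$ variable, namely $n-p-1$ and $2(n-p-1)$.

```python
import numpy as np

rng = np.random.default_rng(4)            # fixed seed for reproducibility
n, p, sigma = 40, 3, 2.0                  # hypothetical sizes and noise level
b = np.array([1.0, 0.5, -1.0, 2.0])       # hypothetical true coefficients
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ b + rng.normal(scale=sigma, size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)
y_hat = H @ y
y_bar = y.mean()

SST = np.sum((y - y_bar) ** 2)            # total sum of squares
SSR = np.sum((y_hat - y_bar) ** 2)        # regression sum of squares
SSE = np.sum((y - y_hat) ** 2)            # residual sum of squares
s2 = SSE / (n - p - 1)                    # unbiased estimate of sigma^2 (property 3)

# Monte Carlo: by (10), residuals are (I - H) eps, so SSE depends only on eps.
reps = 5000
E = rng.normal(scale=sigma, size=(reps, n))   # one error vector per row
resid = E - E @ H                             # rows are ((I - H) eps)^T, H symmetric
scaled_sse = np.sum(resid**2, axis=1) / sigma**2

dof = n - p - 1
print(round(scaled_sse.mean(), 2), dof)       # sample mean vs. chi-square mean
print(round(scaled_sse.var(), 2), 2 * dof)    # sample variance vs. chi-square variance
```

Matching the first two moments does not prove the distributional claim, but it is a cheap way to catch an error in the degrees of freedom.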
Hypothesis testing and interval estimation
1. t-test

Null hypothesis $H_0: b_i = 0$, for a given $i = 1, 2, \dots, p$

Alternative hypothesis $H_1: b_i \ne 0$
Under the null hypothesis, (4) and (5) give

$$E(\hat{b}_i) = b_i = 0, \qquad \operatorname{var}(\hat{b}_i) = \sigma^2 c_{ii},$$

where $c_{ii}$ is the $i$-th diagonal entry of $(\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}$, with rows and columns indexed from 0 to $p$. (If the indices instead run from 1 to $p + 1$, the entry is $c_{i+1,i+1}$. The offset arises because the coefficients $b$ are conventionally indexed from 0 in linear regression; it is the same index shift as in property 4.) Since $\hat{b}_i$ is a linear function of the normal vector $\boldsymbol\epsilon$, it is itself normal, so

$$\frac{\hat{b}_i}{\sigma\sqrt{c_{ii}}} \sim N(0, 1).$$

By property 4,

$$\frac{SSE}{\sigma^2} \sim \chi^2(n-p-1).$$

In addition, $\boldsymbol{\hat{b}}$ and $SSE$ are independent: by (3), $\boldsymbol{\hat{b}} - \boldsymbol{b} = (\boldsymbol{X}^\mathrm{T}\boldsymbol{X})^{-1}\boldsymbol{X}^\mathrm{T}\boldsymbol{H}\boldsymbol\epsilon$ depends only on $\boldsymbol{H}\boldsymbol\epsilon$, while by (10) $SSE$ depends only on $(\boldsymbol{I} - \boldsymbol{H})\boldsymbol\epsilon$, and these two jointly normal vectors are uncorrelated because $\boldsymbol{H}(\boldsymbol{I} - \boldsymbol{H}) = \boldsymbol{0}$. Therefore

$$\frac{\hat{b}_i/(\sigma\sqrt{c_{ii}})}{\sqrt{\frac{SSE}{\sigma^2}/(n-p-1)}} = \frac{\hat{b}_i/\sqrt{c_{ii}}}{\sqrt{SSE/(n-p-1)}} \sim t(n-p-1),$$

which can be used to test the hypothesis. To construct an interval estimate for $b_i$, use

$$\frac{(\hat{b}_i - b_i)/\sqrt{c_{ii}}}{\sqrt{SSE/(n-p-1)}} \sim t(n-p-1).$$
2. F-test
Null hypothesis $H_0: b_1 = b_2 = \dots = b_p = 0$

Alternative hypothesis $H_1$: $b_1, b_2, \dots, b_p$ are not all zero
Under the null hypothesis, writing $\boldsymbol{X} = \begin{bmatrix}\boldsymbol{1} & \boldsymbol{X_p}\end{bmatrix}$ with $\boldsymbol{X_p} = [\boldsymbol{x_1}, \boldsymbol{x_2}, \dots, \boldsymbol{x_p}]$,

$$\boldsymbol{y} = \boldsymbol{Xb} + \boldsymbol\epsilon = \begin{bmatrix}\boldsymbol{1} & \boldsymbol{X_p}\end{bmatrix}\begin{bmatrix}b_0\\\boldsymbol{0}\end{bmatrix} + \boldsymbol\epsilon = b_0\boldsymbol{1} + \boldsymbol\epsilon,$$

and, by (7),

$$\boldsymbol{\hat{y}} = b_0\boldsymbol{1} + \boldsymbol{H}\boldsymbol\epsilon,$$

$$\boldsymbol{\overline{y}} = \overline{y}\boldsymbol{1} = \frac{1}{n}\boldsymbol{1}^\mathrm{T}(b_0\boldsymbol{1} + \boldsymbol\epsilon)\boldsymbol{1} = b_0\boldsymbol{1} + \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol\epsilon)\boldsymbol{1}.$$

Therefore

$$SST = \left(\boldsymbol\epsilon - \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol\epsilon)\boldsymbol{1}\right)^\mathrm{T}\left(\boldsymbol\epsilon - \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol\epsilon)\boldsymbol{1}\right) = \boldsymbol\epsilon^\mathrm{T}\boldsymbol\epsilon - \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol\epsilon)^2,$$

$$SSR = \left(\boldsymbol{H}\boldsymbol\epsilon - \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol\epsilon)\boldsymbol{1}\right)^\mathrm{T}\left(\boldsymbol{H}\boldsymbol\epsilon - \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol\epsilon)\boldsymbol{1}\right) = \boldsymbol\epsilon^\mathrm{T}\boldsymbol{H}\boldsymbol\epsilon - \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol\epsilon)^2.$$

Substituting (12), that is $\boldsymbol\epsilon = \boldsymbol{A}\boldsymbol\eta$, into these two expressions gives:
$$SST = \boldsymbol\eta^\mathrm{T}\boldsymbol\eta - \frac{1}{n}(\boldsymbol{1}^\mathrm{T}\boldsymbol{A}\boldsymbol\eta)^2 = \sum\limits_{i=1}^n \eta_i^2 - \frac{1}{n}\left([\sqrt{n}, 0, 0, \dots, 0]\boldsymbol\eta\right)^2 = \sum\limits_{i=1}^n \eta_i^2 - \eta_1^2 = \sum\limits_{i=2}^n \eta_i^2,$$

$$SSR = \sum\limits_{i=1}^{p+1} \eta_i^2 - \eta_1^2 = \sum\limits_{i=2}^{p+1} \eta_i^2.$$

Here $\boldsymbol{1}^\mathrm{T}\boldsymbol{A} = [\sqrt{n}, 0, 0, \dots, 0]$ because $\boldsymbol{1} = \sqrt{n}\,\boldsymbol{a_1}$ and the columns of $\boldsymbol{A}$ are orthonormal. Hence

$$\frac{SST}{\sigma^2} \sim \chi^2(n-1), \qquad \frac{SSR}{\sigma^2} \sim \chi^2(p).$$

Since $SSR$ involves only $\eta_2, \dots, \eta_{p+1}$ and $SSE$ involves only $\eta_{p+2}, \dots, \eta_n$, the two are independent, and therefore

$$\frac{\frac{SSR}{\sigma^2}/p}{\frac{SSE}{\sigma^2}/(n-p-1)} = \frac{SSR/p}{SSE/(n-p-1)} \sim F(p, n-p-1),$$

which can be used to carry out the test.
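The two test statistics derived above can be computed directly from the data. The following is a minimal NumPy sketch on hypothetical simulated data in which one true coefficient is zero; it computes the $t$ statistic for each coefficient and the overall $F$ statistic, and leaves the comparison against critical values to the reader.

```python
import numpy as np

rng = np.random.default_rng(5)            # fixed seed for reproducibility
n, p, sigma = 60, 3, 1.0                  # hypothetical sizes and noise level
b = np.array([1.0, 2.0, 0.0, -1.5])       # hypothetical truth, with b_2 = 0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ b + rng.normal(scale=sigma, size=n)

C = np.linalg.inv(X.T @ X)                # c_ii = C[i, i], indexed 0..p as in the text
b_hat = C @ X.T @ y
y_hat = X @ b_hat
dof = n - p - 1                           # degrees of freedom of SSE

SST = np.sum((y - y.mean()) ** 2)
SSR = np.sum((y_hat - y.mean()) ** 2)
SSE = np.sum((y - y_hat) ** 2)

s = np.sqrt(SSE / dof)                    # estimate of sigma
se = s * np.sqrt(np.diag(C))              # estimated standard error of each b_hat_i
t_stats = b_hat / se                      # t statistic for H0: b_i = 0
F_stat = (SSR / p) / (SSE / dof)          # F statistic for H0: b_1 = ... = b_p = 0

print(np.round(t_stats, 2))
print(round(F_stat, 2))
```

To finish either test, the statistics would be compared with quantiles of $t(n-p-1)$ and $F(p, n-p-1)$. If SciPy is available, `scipy.stats.t.sf` and `scipy.stats.f.sf` give the corresponding tail probabilities; they are left out here to keep the sketch dependent on NumPy only.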