Multiple Linear Regression
$$
\boldsymbol{y}=\begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix},\quad
\boldsymbol{X}=\begin{bmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{bmatrix},\quad
\boldsymbol{\varepsilon}=\begin{bmatrix} \varepsilon_{1} \\ \varepsilon_{2} \\ \vdots \\ \varepsilon_{n} \end{bmatrix},\quad
\boldsymbol{\beta}=\begin{bmatrix} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{p} \end{bmatrix}
$$
$$\boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta}+\boldsymbol{\varepsilon}$$
$$\boldsymbol{X}=\left(\mathbf{1},\boldsymbol{x}_{1},\ldots,\boldsymbol{x}_{p}\right) \quad \text{(an } n\times(p+1) \text{ matrix)}$$
$$\boldsymbol{\varepsilon}=\left(\varepsilon_{1},\ldots,\varepsilon_{n}\right)'$$
Gauss–Markov conditions:
$$
\left\{\begin{array}{l}
E\left(\varepsilon_{i}\right)=0,\quad i=1,\ldots,n \\
\operatorname{Cov}\left(\varepsilon_{i},\varepsilon_{j}\right)=0,\ i\neq j;\quad \operatorname{Var}\left(\varepsilon_{i}\right)=\sigma^{2}
\end{array}\right.
$$
Normality assumption:
$$
\left\{\begin{array}{l}
\varepsilon_{i}\sim N\left(0,\sigma^{2}\right),\quad i=1,\ldots,n \\
\varepsilon_{1},\ldots,\varepsilon_{n} \text{ are mutually independent}
\end{array}\right.
$$
Least squares estimation (LSE)
$$Q(\boldsymbol{\beta})=(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta})'(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta})$$
$$\frac{\partial Q(\boldsymbol{\beta})}{\partial \boldsymbol{\beta}}=-2\boldsymbol{X}'(\boldsymbol{y}-\boldsymbol{X}\boldsymbol{\beta})=\mathbf{0}$$
$$\hat{\boldsymbol{\beta}}=\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{y}$$
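The closed-form solution above can be checked numerically. Below is a minimal NumPy sketch (all names and data are illustrative, not from the source) that solves the normal equations and compares the result against NumPy's built-in least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3

# Design matrix with an intercept column: n x (p+1)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])  # hypothetical true coefficients
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Closed-form LSE: beta_hat = (X'X)^{-1} X'y
# (solve the normal equations rather than inverting X'X explicitly)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))  # True
```

In practice `np.linalg.solve` (or a QR decomposition) is preferred over forming the explicit inverse, for numerical stability.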
$$\hat{\boldsymbol{y}}=\boldsymbol{X}\hat{\boldsymbol{\beta}}=\boldsymbol{X}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{y}\stackrel{\text{def}}{=}\boldsymbol{H}\boldsymbol{y}$$
$$\boldsymbol{H}=\boldsymbol{X}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}' \;\Rightarrow\; \boldsymbol{H}^{2}=\boldsymbol{X}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{X}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'=\boldsymbol{X}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'=\boldsymbol{H}$$
$$\left(\boldsymbol{I}_{n}-\boldsymbol{H}\right)^{2}=\boldsymbol{I}_{n}-2\boldsymbol{H}+\boldsymbol{H}^{2}=\boldsymbol{I}_{n}-\boldsymbol{H}$$
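The idempotency of both projection matrices is easy to verify numerically. A small sketch with an arbitrary random design (dimensions chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])

# Hat matrix H = X (X'X)^{-1} X'
H = X @ np.linalg.solve(X.T @ X, X.T)
I = np.eye(n)

# Both H and I - H are idempotent (orthogonal projections)
print(np.allclose(H @ H, H))                  # True
print(np.allclose((I - H) @ (I - H), I - H))  # True
```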
Residuals
Let $\boldsymbol{e}=\boldsymbol{y}-\hat{\boldsymbol{y}}$ denote the residual vector of $\boldsymbol{y}$. Substituting $\hat{\boldsymbol{y}}=\boldsymbol{H}\boldsymbol{y}$:
$$\boldsymbol{e}=\boldsymbol{y}-\boldsymbol{H}\boldsymbol{y}=(\boldsymbol{I}-\boldsymbol{H})\boldsymbol{y}$$
$$
\begin{aligned}
D(\boldsymbol{e}) &=\operatorname{Cov}(\boldsymbol{e},\boldsymbol{e}) \\
&=\operatorname{Cov}((\boldsymbol{I}-\boldsymbol{H})\boldsymbol{y},(\boldsymbol{I}-\boldsymbol{H})\boldsymbol{y}) \\
&=(\boldsymbol{I}-\boldsymbol{H})\operatorname{Cov}(\boldsymbol{y},\boldsymbol{y})(\boldsymbol{I}-\boldsymbol{H})' \\
&=(\boldsymbol{I}-\boldsymbol{H})\,\sigma^{2}\boldsymbol{I}\,(\boldsymbol{I}-\boldsymbol{H})' \\
&=\sigma^{2}(\boldsymbol{I}-\boldsymbol{H})
\end{aligned}
$$
$$\operatorname{Var}\left(e_{i}\right)=\left(1-h_{ii}\right)\sigma^{2},\quad i=1,\ldots,n$$
$\hat{\boldsymbol{\beta}}$ is an unbiased estimator:
$$
\begin{aligned}
E(\hat{\boldsymbol{\beta}}) &=E\left\{\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{y}\right\} \\
&=\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'E\boldsymbol{y} \\
&=\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{X}\boldsymbol{\beta}=\boldsymbol{\beta}
\end{aligned}
$$
$$D(\hat{\boldsymbol{\beta}})=\sigma^{2}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}$$
Exercise
Proof that $\hat{\sigma}^{2}=\frac{1}{n-p-1}\sum_{i=1}^{n}e_{i}^{2}=\frac{1}{n-p-1}\boldsymbol{e}'\boldsymbol{e}$ is an unbiased estimator of $\sigma^{2}$:
Since $E(e_{i})=0$, we have $E(e_{i}^{2})=D(e_{i})$, so
$$E\left(\sum_{i=1}^{n}e_{i}^{2}\right)=\sum_{i=1}^{n}D\left(e_{i}\right)=\sum_{i=1}^{n}\sigma^{2}\left(1-h_{ii}\right)=\sigma^{2}\left(n-\sum_{i=1}^{n}h_{ii}\right)=\sigma^{2}(n-p-1)$$
$$\operatorname{diag}(\boldsymbol{H})=\left(h_{11},\ldots,h_{nn}\right),\quad \operatorname{tr}(\boldsymbol{H})=\sum_{i=1}^{n}h_{ii}=\operatorname{tr}\left(\boldsymbol{X}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\right)=\operatorname{tr}\left(\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{X}\right)=p+1$$
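The trace identity used above ($\operatorname{tr}(AB)=\operatorname{tr}(BA)$, which collapses $\operatorname{tr}(\boldsymbol{H})$ to $\operatorname{tr}(\boldsymbol{I}_{p+1})$) can be confirmed on a random design; the dimensions below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
H = X @ np.linalg.solve(X.T @ X, X.T)

# tr(H) = tr((X'X)^{-1} X'X) = tr(I_{p+1}) = p + 1
print(np.isclose(np.trace(H), p + 1))  # True
```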
Computing $\operatorname{Cov}(\boldsymbol{e},\hat{\boldsymbol{\beta}})$ establishes the independence of $\hat{\sigma}^{2}$ and $\hat{\boldsymbol{\beta}}$ (under normality):
$$
\operatorname{Cov}(\hat{\boldsymbol{\beta}},\boldsymbol{e})=\operatorname{Cov}\left(\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{y},(\boldsymbol{I}-\boldsymbol{H})\boldsymbol{y}\right)=\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\operatorname{Cov}(\boldsymbol{y},\boldsymbol{y})(\boldsymbol{I}-\boldsymbol{H})'=\sigma^{2}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{H})=\mathbf{0}
$$
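The key fact that makes the covariance vanish is $\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{H})=\mathbf{0}$, which a quick numerical sketch (random illustrative design) confirms:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
H = X @ np.linalg.solve(X.T @ X, X.T)

# X'(I - H) = 0: the columns of X are orthogonal to the residual space,
# which is exactly why Cov(beta_hat, e) = 0
print(np.allclose(X.T @ (np.eye(n) - H), 0.0))  # True
```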
This holds because least squares gives $(\boldsymbol{I}-\boldsymbol{H})\boldsymbol{X}=\mathbf{0}$, which implies $\boldsymbol{X}'(\boldsymbol{I}-\boldsymbol{H})=\mathbf{0}$, since $\boldsymbol{I}-\boldsymbol{H}$ is symmetric.
Under the normality assumption, for the simple linear regression model, construct the test statistic for the hypothesis test $H_{0}: 2\beta_{0}=\beta_{1}$ vs. $H_{1}: 2\beta_{0}\neq\beta_{1}$.
Under the G–M assumptions, the least squares estimator $\hat{\boldsymbol{\beta}}$ and the residual vector $\boldsymbol{e}$ are uncorrelated, i.e. $\operatorname{Cov}(\hat{\boldsymbol{\beta}},\boldsymbol{e})=\mathbf{0}$. Furthermore, under the normality assumption, $\hat{\boldsymbol{\beta}}$ and $\boldsymbol{e}$ are independent, and hence $\hat{\boldsymbol{\beta}}$ is independent of $SSE=\boldsymbol{e}'\boldsymbol{e}=\|\boldsymbol{e}\|^{2}$.
When $\boldsymbol{y}\sim N\left(\boldsymbol{X}\boldsymbol{\beta},\sigma^{2}\boldsymbol{I}_{n}\right)$, we have:
$$(1)\ \hat{\boldsymbol{\beta}}\sim N\left(\boldsymbol{\beta},\sigma^{2}\left(\boldsymbol{X}'\boldsymbol{X}\right)^{-1}\right)$$
$$(2)\ \mathrm{SSE}/\sigma^{2}\sim\chi^{2}(n-p-1)$$
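The $\chi^{2}(n-p-1)$ distribution of $\mathrm{SSE}/\sigma^{2}$ can be checked by simulation: its mean should be close to $n-p-1$. A Monte Carlo sketch (all parameters chosen for illustration; the tolerance is generous relative to the standard error of the mean):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, sigma = 40, 2, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta = np.array([1.0, -2.0, 0.5])  # hypothetical true coefficients
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)  # I - H

# Simulate SSE / sigma^2 repeatedly; the sample mean should approach
# the chi-square degrees of freedom, n - p - 1
reps = 20_000
vals = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    e = M @ y
    vals[r] = (e @ e) / sigma**2

print(abs(vals.mean() - (n - p - 1)) < 0.5)
```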