Original post: Linear Regression
4.1 The Simple Linear Model
The simple linear model is defined as
$$y_i=\beta_0+\beta_1 x_i+\epsilon_i,\qquad i=1,\dots,n$$
where
- $\epsilon_i$ is a random variable (subject to assumptions introduced below)
- $x_i$ is a fixed value (the independent / predictor variable)
- $y_i$ is a random variable (the dependent / response variable)
- $\beta_0$ is the intercept
- $\beta_1$ is the slope
4.1.1 Least-Squares Estimation
Choose the parameters $\beta_0,\beta_1$ to minimize the following function $Q$:
$$Q(\beta_0,\beta_1)=\sum_{i=1}^n(y_i-\beta_0-\beta_1x_i)^2$$
At the minimum, the estimates $\hat{\beta}_0,\hat{\beta}_1$ satisfy:
$$\frac{\partial Q}{\partial \beta_0}=-2\sum_{i=1}^n(y_i-\hat{\beta}_0-\hat{\beta}_1x_i)=0$$
$$\frac{\partial Q}{\partial \beta_1}=-2\sum_{i=1}^n(y_i-\hat{\beta}_0-\hat{\beta}_1 x_i)x_i=0$$
Solving these normal equations gives:
$$\hat{\beta}_1=\frac{\sum_{i=1}^n(y_i-\overline{y})x_i}{\sum_{i=1}^n(x_i-\overline{x})x_i}$$
$$\hat{\beta}_0=\overline{y}-\hat{\beta}_1\overline{x}$$
We additionally define:
$$\zeta_{xx}=\sum_{i=1}^n(x_i-\overline{x})^2$$
$$\zeta_{yy}=\sum_{i=1}^n(y_i-\overline{y})^2$$
$$\zeta_{xy}=\sum_{i=1}^n(x_i-\overline{x})(y_i-\overline{y})$$
It then follows that:
$$\hat{\beta}_1=\frac{\sum_{i=1}^n(y_i-\overline{y})(x_i-\overline{x})}{\sum_{i=1}^n(x_i-\overline{x})(x_i-\overline{x})}=\frac{\zeta_{xy}}{\zeta_{xx}}=\frac{1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})y_i$$
The resulting regression function is:
$$\hat{y}=\hat{\beta}_0+\hat{\beta}_1x$$
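The closed-form estimates are easy to compute directly in R. The sketch below uses small made-up vectors x and y purely for illustration; any paired numeric data would do.
# Least-squares estimates from the closed-form solution (x, y are made-up illustrative data)
x <- c(1.0, 1.5, 2.0, 2.5, 3.0)
y <- c(2.1, 2.9, 3.8, 5.2, 5.9)
zeta_xx <- sum((x - mean(x))^2)                # sum of squared deviations of x
zeta_xy <- sum((x - mean(x)) * (y - mean(y)))  # cross-deviation sum
beta1_hat <- zeta_xy / zeta_xx                 # slope estimate
beta0_hat <- mean(y) - beta1_hat * mean(x)     # intercept estimate
c(beta0 = beta0_hat, beta1 = beta1_hat)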
4.1.2 Expectation and Variance
Assumption A1: $E[\epsilon_i]=0,\ i=1,\dots,n$.
Theorem 4.1: Under Assumption A1, $\hat{\beta}_0,\hat{\beta}_1$ are unbiased estimators of $\beta_0,\beta_1$.
Proof: Since $E[y_i]=\beta_0+\beta_1 x_i$ under A1,
$$
\begin{aligned}
E[\hat{\beta}_1]&=\frac{1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})E[y_i]\\
&=\frac{1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})(\beta_0+\beta_1x_i)\\
&=\frac{\beta_0}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})+\frac{\beta_1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})x_i\\
&=\frac{\beta_0}{\zeta_{xx}}\Big(\sum_{i=1}^n x_i-\sum_{i=1}^n\overline{x}\Big)+\frac{\beta_1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})x_i\\
&=\frac{\beta_0}{\zeta_{xx}}(n\overline{x}-n\overline{x})+\frac{\beta_1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})x_i\\
&=\frac{\beta_1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})x_i\\
&=\frac{\beta_1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})x_i-\frac{\beta_1\overline{x}}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})\qquad\text{(the subtracted term is zero)}\\
&=\frac{\beta_1}{\zeta_{xx}}\sum_{i=1}^n(x_i-\overline{x})(x_i-\overline{x})\\
&=\beta_1
\end{aligned}
$$
Similarly, since $E[\overline{y}]=\beta_0+\beta_1\overline{x}$,
$$E[\hat{\beta}_0]=E[\overline{y}-\hat{\beta}_1\overline{x}]=E[\overline{y}]-\beta_1\overline{x}=\beta_0+\beta_1\overline{x}-\beta_1\overline{x}=\beta_0$$
Assumption A2: $Cov(\epsilon_i,\epsilon_j)=\sigma^2$ if $i=j$, and $Cov(\epsilon_i,\epsilon_j)=0$ if $i\neq j$.
Theorem 4.2: Under Assumption A2,
$$Var[\hat{\beta}_0]=\Big(\frac{1}{n}+\frac{\overline{x}^2}{\zeta_{xx}}\Big)\sigma^2$$
$$Var[\hat{\beta}_1]=\frac{\sigma^2}{\zeta_{xx}}$$
$$Cov(\hat{\beta}_0,\hat{\beta}_1)=-\frac{\overline{x}}{\zeta_{xx}}\sigma^2$$
Proof deferred.
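Since the proof is deferred, a quick Monte Carlo check can make these formulas plausible. The sketch below repeatedly simulates the model at fixed x values; the particular x grid, coefficients, and sigma are arbitrary choices for illustration.
# Monte Carlo check of the variance and covariance formulas in Theorem 4.2 (settings are illustrative)
set.seed(1)
x_sim <- seq(1, 10, length.out = 20)
beta0 <- 1; beta1 <- 0.5; sigma <- 2
zxx <- sum((x_sim - mean(x_sim))^2)
est <- replicate(10000, {
  y_sim <- beta0 + beta1 * x_sim + rnorm(length(x_sim), 0, sigma)
  b1 <- sum((x_sim - mean(x_sim)) * y_sim) / zxx
  b0 <- mean(y_sim) - b1 * mean(x_sim)
  c(b0, b1)
})
c(empirical = var(est[1, ]), theory = (1 / length(x_sim) + mean(x_sim)^2 / zxx) * sigma^2)  # Var[beta0_hat]
c(empirical = var(est[2, ]), theory = sigma^2 / zxx)                                        # Var[beta1_hat]
c(empirical = cov(est[1, ], est[2, ]), theory = -mean(x_sim) / zxx * sigma^2)               # Cov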
4.1.3 Estimating the Error Variance
In Assumption A2 above, the variance $\sigma^2$ is generally unknown; the following result gives an unbiased estimator of $\sigma^2$.
Definition 4.1: The error sum of squares (SSE) is
$$S_e^2=\sum_{i=1}^n(y_i-\hat{\beta}_0-\hat{\beta}_1x_i)^2$$
Theorem 4.3: Define
$$\hat{\sigma}^2=\frac{Q(\hat{\beta}_0,\hat{\beta}_1)}{n-2}=\frac{S_e^2}{n-2}$$
Under Assumptions A1 and A2, $E[\hat{\sigma}^2]=\sigma^2$.
Proof deferred.
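In code, $\hat{\sigma}^2$ is just the residual sum of squares divided by $n-2$. This sketch reuses the x, y, beta0_hat, and beta1_hat objects from the earlier illustrative example.
# Unbiased estimate of sigma^2 (continues the illustrative x, y, beta0_hat, beta1_hat above)
resid_hat <- y - beta0_hat - beta1_hat * x   # residuals
SSE <- sum(resid_hat^2)                      # S_e^2, the error sum of squares
sigma2_hat <- SSE / (length(x) - 2)          # divide by n - 2, not n
sigma2_hat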
4.1.4 Sampling Distributions
Assumption B: $\epsilon_i \overset{iid}{\sim} N(0, \sigma^2),\ i=1,\dots,n$
(iid: independently and identically distributed)
Theorem 4.4: Under Assumption B,
- $\hat{\beta}_0\sim N\big(\beta_0,(\frac{1}{n}+\frac{\overline{x}^2}{\zeta_{xx}})\sigma^2\big)$
- $\hat{\beta}_1\sim N\big(\beta_1,\frac{\sigma^2}{\zeta_{xx}}\big)$
- $\frac{(n-2)\hat{\sigma}^2}{\sigma^2}=\frac{S_e^2}{\sigma^2}\sim \chi^2(n-2)$
- $\hat{\sigma}^2$ is independent of $(\hat{\beta}_0,\hat{\beta}_1)$
4.1.5 Confidence Intervals and Hypothesis Tests
When $\sigma$ is known, tests and confidence intervals can be based on $\frac{\hat{\beta}_1-\beta_1}{\sigma/\sqrt{\zeta_{xx}}}\sim N(0,1)$. The $100(1-\alpha)\%$ confidence interval for $\beta_1$ is $\hat{\beta}_1\pm u_{1-\alpha/2}\,\sigma/\sqrt{\zeta_{xx}}$.
More commonly $\sigma$ is unknown; as long as $n\ge3$, we have
$$\frac{\hat{\beta}_1-\beta_1}{\hat{\sigma}/\sqrt{\zeta_{xx}}}\sim t(n-2)$$
and the $100(1-\alpha)\%$ confidence interval for $\beta_1$ is $\hat{\beta}_1\pm t_{1-\alpha/2}(n-2)\,\hat{\sigma}/\sqrt{\zeta_{xx}}$.
For inference about $\beta_0$, one can use:
$$\frac{\hat{\beta}_0-\beta_0}{\sigma\sqrt{1/n+\overline{x}^2/\zeta_{xx}}} \sim N(0,1)$$
$$\frac{\hat{\beta}_0-\beta_0}{\hat{\sigma}\sqrt{1/n+\overline{x}^2/\zeta_{xx}}} \sim t(n-2)$$
The $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is:
$$\Big[\frac{(n-2)\hat{\sigma}^2}{\chi^2_{1-\alpha/2}(n-2)}, \frac{(n-2)\hat{\sigma}^2}{\chi^2_{\alpha/2}(n-2)}\Big]=\Big[\frac{S_e^2}{\chi^2_{1-\alpha/2}(n-2)},\frac{S_e^2}{\chi^2_{\alpha/2}(n-2)}\Big]$$
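These intervals translate directly into R via qt() and qchisq(). The sketch continues the earlier illustrative objects (x, beta1_hat, sigma2_hat, zeta_xx) and takes alpha = 0.05 as an example level.
# 95% confidence intervals for beta1 and sigma^2 (alpha = 0.05; continues the sketches above)
alpha <- 0.05
n <- length(x)
se_beta1 <- sqrt(sigma2_hat / zeta_xx)                       # standard error of beta1_hat
beta1_hat + c(-1, 1) * qt(1 - alpha / 2, df = n - 2) * se_beta1
c((n - 2) * sigma2_hat / qchisq(1 - alpha / 2, df = n - 2),  # CI for sigma^2
  (n - 2) * sigma2_hat / qchisq(alpha / 2, df = n - 2))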
4.1.6 Case Study 1
An air-conditioner manufacturer has an assembly problem: connecting rods that do not meet the finished-weight specification. Many rods turn out to be unusable after the final machining step because they are overweight. To cut costs, the company's quality-control department wants to quantify the relationship between the finished rod weight $y$ and the rough casting weight $x$, so that overweight castings can be rejected before the (expensive) final machining.
The data are as follows:
# Finished rod weight vs. rough casting weight for 25 rods
rod = data.frame(
  id = 1:25,
rough_weight = c(2.745, 2.700, 2.690, 2.680, 2.675,
2.670, 2.665, 2.660, 2.655, 2.655,
2.650, 2.650, 2.645, 2.635, 2.630,
2.625, 2.625, 2.620, 2.615, 2.615,
2.615, 2.610, 2.590, 2.590, 2.565),
finished_weight = c(2.080, 2.045, 2.050, 2.005, 2.035,
2.035, 2.020, 2.005, 2.010, 2.000,
2.000, 2.005, 2.015, 1.990, 1.990,
1.995, 1.985, 1.970, 1.985, 1.990,
1.995, 1.990, 1.975, 1.995, 1.955)
)
knitr::kable(rod, caption = "rough weight vs. finished weight")
We fit a linear model to the relationship between the two weights:
$$y_i=\beta_0+\beta_1x_i+\epsilon_i,\qquad \epsilon_i\overset{iid}{\sim}N(0,\sigma^2)$$
From the observed data, $\overline{x}=2.643$, $\overline{y}=2.0048$, $\zeta_{xx}=0.0367$, $\zeta_{xy}=0.023565$, $\hat{\sigma}=0.0113$.
The least-squares estimates are:
$$\hat{\beta}_1=\frac{\zeta_{xy}}{\zeta_{xx}}=\frac{0.023565}{0.0367}=0.642,\qquad \hat{\beta}_0=\overline{y}-\hat{\beta}_1\overline{x}=0.308$$
The fitted regression function is $\hat{y}=0.308+0.642x$, shown as the blue line in the figure below.
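The same numbers can be obtained from R's built-in lm(); the fitted object lm.rod created here is the one used by the residual plots in the next subsection.
# Fit the simple linear regression with lm(); the estimates match the values above
lm.rod <- lm(finished_weight ~ rough_weight, data = rod)
summary(lm.rod)$coefficients   # beta0_hat ~ 0.308, beta1_hat ~ 0.642
sigma(lm.rod)                  # sigma_hat ~ 0.0113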
4.1.7 Assessing the Fit
To help assess the quality of the fit we make extensive use of the residuals, the differences between the observed and fitted values:
$$\hat{\epsilon}_i=y_i-\hat{\beta}_0-\hat{\beta}_1x_i,\qquad i=1,\dots,n.$$
Examining the residuals graphically is most useful. Plotting the residuals against the x values may reveal systematic lack of fit or other ways in which the data depart from the fitted model. Ideally the residuals should show no relationship with the x values, and the plot should look like a horizontal blur. The residuals for Case Study 1 are shown below.
# Residuals vs. fitted values
par(mar = c(4, 4, 2, 1))
plot(lm.rod$fitted.values, lm.rod$residuals, "p",
     xlab = "Fitted values", ylab = "Residuals")
The standardized residuals are shown below.
# Standardized residuals, with reference lines at +/- 2
par(mar = c(4, 4, 2, 1))
plot(lm.rod$fitted.values, rstandard(lm.rod), "p",
     xlab = "Fitted values", ylab = "Standardized Residuals")
abline(h = c(-2, 2), lty = c(5, 5))
4.1.8 Prediction
Predicting $E[y_{n+1}]$
Given $x_{n+1}$, we want to estimate the expected value of $y_{n+1}$, i.e. $E[y_{n+1}]=\beta_0+\beta_1x_{n+1}$. A natural unbiased estimator is $\hat{y}_{n+1}=\hat{\beta}_0+\hat{\beta}_1x_{n+1}$. From Theorem 4.2 above we know:
$$Var[\hat{y}_{n+1}]=\Big(\frac{1}{n}+\frac{(x_{n+1}-\overline{x})^2}{\zeta_{xx}}\Big)\sigma^2$$
Under Assumption B, Theorem 4.4 gives:
$$\hat{y}_{n+1}\sim N\Big(\beta_0+\beta_1 x_{n+1},\ \big(1/n+(x_{n+1}-\overline{x})^2/\zeta_{xx}\big)\sigma^2\Big)$$
$$\frac{\hat{y}_{n+1}-E[y_{n+1}]}{\hat{\sigma}\sqrt{1/n+(x_{n+1}-\overline{x})^2/\zeta_{xx}}}\sim t(n-2)$$
This yields the following result.
Theorem 4.5: Under Assumption B,
$$\hat{y}_{n+1}=\hat{\beta}_0+\hat{\beta}_1x_{n+1}\sim N\big(\beta_0+\beta_1 x_{n+1},\ [1/n+(x_{n+1}-\overline{x})^2/\zeta_{xx}]\sigma^2\big),$$
and the $100(1-\alpha)\%$ confidence interval for $E[y_{n+1}]=\beta_0+\beta_1 x_{n+1}$ is
$$\hat{y}_{n+1}\pm t_{1-\alpha/2}(n-2)\,\hat{\sigma}\sqrt{\frac{1}{n}+\frac{(x_{n+1}-\overline{x})^2}{\zeta_{xx}}}$$
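For the rod data this is exactly the interval returned by predict() with interval = "confidence"; the new rough weight 2.65 below is just an illustrative value.
# 95% confidence interval for E[y_{n+1}] at a new rough weight (2.65 is illustrative)
new_x <- data.frame(rough_weight = 2.65)
predict(lm.rod, newdata = new_x, interval = "confidence", level = 0.95)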
Predicting a future observation $y_{n+1}$
We now give a prediction interval for a future observation $y_{n+1}$, rather than an interval for the expected value $E[y_{n+1}]$. Note that $y_{n+1}$ is no longer a fixed parameter. A prediction interval is an interval that contains $y_{n+1}$ with a specified probability. Consider $y_{n+1}-\hat{y}_{n+1}$. If Assumption A1 holds (i.e. $E[\epsilon_i]=0,\ i=1,\dots,n$), then
$$
\begin{aligned}
E[y_{n+1}-\hat{y}_{n+1}]&=E[y_{n+1}]-E[\hat{y}_{n+1}]\\
&=E[y_{n+1}]-E[\hat{\beta}_0+\hat{\beta}_1x_{n+1}]\\
&=E[y_{n+1}]-\big(E[\hat{\beta}_0] + E[\hat{\beta}_1 x_{n+1}]\big)\\
&=E[y_{n+1}] - (\beta_0 + \beta_1 x_{n+1})\\
&=E[\beta_0+\beta_1 x_{n+1}+\epsilon_{n+1}] - (\beta_0+\beta_1 x_{n+1})\\
&=\beta_0+\beta_1 x_{n+1}+E[\epsilon_{n+1}]-\beta_0-\beta_1 x_{n+1}\\
&=0
\end{aligned}
$$
If Assumption A2 holds (i.e. $Cov(\epsilon_i,\epsilon_j)=\sigma^2$ for $i=j$ and $0$ for $i\ne j$), then, since the future error $\epsilon_{n+1}$ is uncorrelated with the errors used in the fit, $y_{n+1}$ and $\hat{y}_{n+1}$ are uncorrelated and
$$Var[y_{n+1}-\hat{y}_{n+1}]=Var[y_{n+1}]+Var[\hat{y}_{n+1}]=\Big(1+\frac{1}{n}+\frac{(x_{n+1}-\overline{x})^2}{\zeta_{xx}}\Big)\sigma^2$$
If Assumption B holds, then $y_{n+1}-\hat{y}_{n+1}$ is normally distributed.
Theorem 4.6: Under Assumption B, write $y_{n+1}=\beta_0+\beta_1 x_{n+1}+\epsilon$ with $\epsilon \sim N(0,\sigma^2)$ independent of $\epsilon_1,\dots,\epsilon_n$.
Then the $100(1-\alpha)\%$ prediction interval for $y_{n+1}$ is
$$\hat{y}_{n+1}\pm t_{1-\alpha/2}(n-2)\,\hat{\sigma}\sqrt{1+\frac{1}{n}+\frac{(x_{n+1}-\overline{x})^2}{\zeta_{xx}}}$$
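The corresponding R call only changes the interval argument to "prediction"; the result is wider than the confidence interval for the mean because of the extra 1 under the square root. Again, 2.65 is an illustrative rough weight.
# 95% prediction interval for a future observation y_{n+1} at rough weight 2.65
predict(lm.rod, newdata = new_x, interval = "prediction", level = 0.95)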
4.1.9 Control
How can we control the range of $y_{n+1}$?
Consider Case Study 1 again. Castings likely to yield overweight rods can be discarded before the final (and expensive) machining step. By screening castings on their rough weight, the company's quality-control department wants to produce rods weighing no more than 2.05 with probability at least 0.95.
We therefore want $y_{n+1}\le y_0=2.05$ to hold with probability $1-\alpha$. Analogously to Theorem 4.6, we can construct a one-sided prediction interval for $y_{n+1}$:
$$\Big(-\infty,\ \hat{y}_{n+1}+t_{1-\alpha}(n-2)\,\hat{\sigma}\sqrt{1+\frac{1}{n}+\frac{(x_{n+1}-\overline{x})^2}{\zeta_{xx}}}\,\Big]$$
This leads to the requirement:
$$\hat{\beta}_0+\hat{\beta}_1 x_{n+1} + t_{1-\alpha}(n-2)\,\hat{\sigma}\sqrt{1+\frac{1}{n}+\frac{(x_{n+1}-\overline{x})^2}{\zeta_{xx}}}\le y_0$$
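The inequality can be solved numerically for the largest admissible rough weight. A minimal sketch, assuming the lm.rod fit from the case study: it uses uniroot() over a search range chosen by eye, and relies on the fact that the one-sided 95% upper prediction bound equals the upper limit of a two-sided 90% prediction interval.
# Largest rough weight whose one-sided 95% upper prediction bound stays below y0 = 2.05
y0 <- 2.05
alpha <- 0.05
excess <- function(x_new) {
  pr <- predict(lm.rod, newdata = data.frame(rough_weight = x_new),
                interval = "prediction", level = 1 - 2 * alpha)  # "upr" column = one-sided 1 - alpha bound
  pr[, "upr"] - y0
}
uniroot(excess, interval = c(2.5, 2.8))$root   # castings heavier than this root would be rejected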