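This post is mainly about multiple regression analysis (the exercise statement is in the image at the end of the post).
We examine how the coefficients and residuals of a multiple regression model relate to the slope and residuals of an added-variable plot, and find that the corresponding quantities are equal.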
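1. Remove from the response the part explained by the regressor already in the model: regress the response on that regressor and take the residuals as Y.
2. Remove the same part from the regressor being added: regress it on the regressor already in the model and take the residuals as x.
3. Regress Y on x. The slope of this fit equals the coefficient of the added regressor in the multiple regression model.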
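Why the three steps recover the same coefficient can be sketched with projection matrices. The notation below is an assumption introduced for this sketch and does not appear in the exercise: y is the response, x_1 the regressor already in the model, x_2 the regressor being added, H_1 the orthogonal projection onto span{1, x_1}, and M_1 = I - H_1, so that Y = M_1 y and x = M_1 x_2 are the two residual series from steps 1 and 2.

```latex
% Notation assumed for this sketch (not from the original post):
% y: response; x_1: regressor already in the model; x_2: regressor being added;
% H_1: orthogonal projection onto span{1, x_1}; M_1 = I - H_1.
\begin{align*}
y &= \hat\beta_0 \mathbf{1} + \hat\beta_1 x_1 + \hat\beta_2 x_2 + \hat e,
  && \hat e \perp \mathbf{1},\ x_1,\ x_2 \\
M_1 y &= \hat\beta_2\, M_1 x_2 + M_1 \hat e
  && \text{since } M_1 \mathbf{1} = 0,\ M_1 x_1 = 0 \\
      &= \hat\beta_2\, M_1 x_2 + \hat e
  && \text{since } \hat e \perp \mathbf{1},\ x_1 \\
\hat e &\perp \mathbf{1},\ M_1 x_2
  && \text{since } M_1 x_2 = x_2 - H_1 x_2 \in \operatorname{span}\{\mathbf{1}, x_1, x_2\}
\end{align*}
```

The last two lines write Y as 0 times the intercept plus \hat\beta_2 times x plus a remainder \hat e that is orthogonal to both the intercept and x. Provided M_1 x_2 is not the zero vector, the least squares fit is unique, so the regression of Y on x has intercept 0, slope \hat\beta_2, and residuals \hat e, the same residuals as the model with both regressors.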
3.1.2 Added-Variable Plots
To get the effect of adding fertility to the model that already includes log(ppgdp), we need to examine the part of the response lifeExpF not explained by log(ppgdp) and the part of the new regressor fertility not explained by log(ppgdp).
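The two "parts not explained by log(ppgdp)" in this passage are residual series, and a minimal sketch of how they could be computed and plotted is given here. The helper name `avp_parts` and its argument names are hypothetical and introduced only for illustration; the UN11 call runs only if the alr4 package is installed.

```r
# Hypothetical helper (not from the original post): returns the two residual
# series that form the axes of an added-variable plot.
avp_parts <- function(y, x_in, x_new) {
  data.frame(
    e_x = residuals(lm(x_new ~ x_in)),  # part of the new regressor not explained by x_in
    e_y = residuals(lm(y ~ x_in))       # part of the response not explained by x_in
  )
}

# The passage's example: lifeExpF as response, log(ppgdp) already in the
# model, fertility as the regressor being added.
if (requireNamespace("alr4", quietly = TRUE)) {
  library(alr4)
  parts <- avp_parts(UN11$lifeExpF, log(UN11$ppgdp), UN11$fertility)
  plot(parts$e_x, parts$e_y, pch = 16,
       xlab = "e from fertility on log(ppgdp)",
       ylab = "e from lifeExpF on log(ppgdp)")
  abline(lm(e_y ~ e_x, data = parts), col = "red")
}
```

Taking plain vectors rather than a data frame and column names keeps the helper independent of any particular data set, at the cost of the caller doing the log transform.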
library(alr4)            # provides the UN11 data set
library(basicTrendline)  # provides trendline()
data <- UN11
x <- log(data$ppgdp)
y <- data$fertility
plot(x, y, col = "blue", xlab = "log(ppgdp)", ylab = "fertility")
grid()  # add grid lines
fit <- lm(y ~ x)
abline(fit, col = "black", lty = 1, lwd = 2)  # lty, lwd: line type and line width
summary(fit)
# Add the fitted line with its 95% confidence interval
trendline(x, y, model = "line2P", ePos.x = "topleft", summary = TRUE, eDigit = 5)
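library(alr4)
data <- UN11
head(data, 10)  # show the first 10 rows
fertility <- data$fertility
logppgdp <- log(data$ppgdp)
pctUrban <- data$pctUrban
# Simple regressions whose residuals form the axes of the added-variable plots
fit1 <- lm(fertility ~ logppgdp)  # response on logppgdp
fit2 <- lm(fertility ~ pctUrban)  # response on pctUrban
fit3 <- lm(logppgdp ~ pctUrban)   # logppgdp on pctUrban
fit4 <- lm(pctUrban ~ logppgdp)   # pctUrban on logppgdp
e1 <- residuals(fit1)
e2 <- residuals(fit2)
e3 <- residuals(fit3)
e4 <- residuals(fit4)
summary(fit1)
summary(fit2)
summary(fit3)
summary(fit4)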
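# Added-variable plot for pctUrban after logppgdp
plot(e4, e1, pch = 16, xlab = "e from pctUrban on logppgdp", ylab = "e from fertility on logppgdp")
grid()
fit5 <- lm(e1 ~ e4)
abline(fit5, col = "red")
summary(fit5)
residuals(fit5)[1:10]  # first 10 residuals of the added-variable regression
# Added-variable plot for logppgdp after pctUrban
plot(e3, e2, pch = 16, xlab = "e from logppgdp on pctUrban", ylab = "e from fertility on pctUrban")
grid()
fit6 <- lm(e2 ~ e3)
abline(fit6, col = "red")
summary(fit6)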
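library(alr4)
data <- UN11
head(data, 10)  # show the first 10 rows
fertility <- data$fertility
logppgdp <- log(data$ppgdp)
pctUrban <- data$pctUrban
fit10 <- lm(fertility ~ logppgdp + pctUrban)  # mean function with both regressors
summary(fit10)
e10 <- residuals(fit10)
e10[1:10]  # first 10 residuals of the model with both regressors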
Show that the residuals in the added-variable plot are identical to the residuals from the mean function with both predictors.
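Rather than comparing the printed residuals by eye, the claim can be checked numerically. The following is a minimal sketch; the helper name `check_avp` is hypothetical and not part of alr4 or the exercise, and the UN11 call runs only if alr4 is installed.

```r
# Hypothetical helper (not from the original post): checks numerically that the
# added-variable regression reproduces the slope and the residuals of the
# model with both regressors.
check_avp <- function(y, x1, x2) {
  e_y  <- residuals(lm(y ~ x1))   # part of y not explained by x1
  e_x2 <- residuals(lm(x2 ~ x1))  # part of x2 not explained by x1
  avp  <- lm(e_y ~ e_x2)          # added-variable regression
  full <- lm(y ~ x1 + x2)         # model with both regressors
  c(
    slope_equal = isTRUE(all.equal(unname(coef(avp)["e_x2"]),
                                   unname(coef(full)["x2"]))),
    resid_equal = isTRUE(all.equal(unname(residuals(avp)),
                                   unname(residuals(full))))
  )
}

# Same roles as in the post: fertility is the response, log(ppgdp) is already
# in the model, pctUrban is the regressor being added.
if (requireNamespace("alr4", quietly = TRUE)) {
  library(alr4)
  print(check_avp(UN11$fertility, log(UN11$ppgdp), UN11$pctUrban))
}
```

`all.equal` compares within a small numerical tolerance, which is the appropriate notion of "identical" for floating-point residuals. Both entries should come out `TRUE`, and swapping the last two arguments checks the other added-variable plot.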