Multiple Linear Regression Modeling in R

states<-as.data.frame(state.x77[,c('Murder','Population','Illiteracy','Income','Frost')])
cor(states)  # inspect the pairwise correlations between the variables
               Murder Population Illiteracy     Income      Frost
Murder      1.0000000  0.3436428  0.7029752 -0.2300776 -0.5388834
Population  0.3436428  1.0000000  0.1076224  0.2082276 -0.3321525
Illiteracy  0.7029752  0.1076224  1.0000000 -0.4370752 -0.6719470
Income     -0.2300776  0.2082276 -0.4370752  1.0000000  0.2262822
Frost      -0.5388834 -0.3321525 -0.6719470  0.2262822  1.0000000

library(car)

fit<-lm(Murder~Population+Illiteracy+Income+Frost,data = states)
summary(fit)

Call:
lm(formula = Murder ~ Population + Illiteracy + Income + Frost,
    data = states)

Residuals:
    Min      1Q  Median      3Q     Max
-4.7960 -1.6495 -0.0811  1.4815  7.6210

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.235e+00  3.866e+00   0.319   0.7510
Population  2.237e-04  9.052e-05   2.471   0.0173 *
Illiteracy  4.143e+00  8.744e-01   4.738 2.19e-05 ***
Income      6.442e-05  6.837e-04   0.094   0.9253
Frost       5.813e-04  1.005e-02   0.058   0.9541
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.535 on 45 degrees of freedom
Multiple R-squared:  0.567,	Adjusted R-squared:  0.5285
F-statistic: 14.73 on 4 and 45 DF,  p-value: 9.133e-08
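
In the coefficient table above, only Population and Illiteracy are significant; Income and Frost have p-values above 0.9. One way to check whether those two predictors can be dropped is a nested-model F test with anova(). The sketch below uses base R only; the names fit_full and fit_reduced are illustrative and do not come from the original post.

```r
# Rebuild the data frame used above (state.x77 ships with base R).
states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])

fit_full    <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = states)
fit_reduced <- lm(Murder ~ Population + Illiteracy, data = states)

# F test of H0: the Income and Frost coefficients are both zero.
cmp <- anova(fit_reduced, fit_full)
print(cmp)

# A large p-value means the two extra predictors do not improve the fit significantly.
p_value <- cmp[2, "Pr(>F)"]
```

If the test is not significant, the smaller model is usually preferred because it is easier to interpret.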

#install.packages('leaps')
library(leaps)
leaps<-regsubsets(Murder~Population+Illiteracy+Income+Frost,data = states,nbest = 4)
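
The regsubsets() call above only fits the candidate subsets; the result still has to be inspected. A minimal sketch, assuming the leaps package is installed, that picks the subset with the highest adjusted R-squared and plots all candidates. The object name subsets is a stand-in chosen here so that the variable does not share a name with the package.

```r
library(leaps)

states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])

# Keep up to the 4 best models for each subset size.
subsets <- regsubsets(Murder ~ Population + Illiteracy + Income + Frost,
                      data = states, nbest = 4)
s <- summary(subsets)

# s$which is a logical matrix: one row per candidate model, one column per term.
best <- which.max(s$adjr2)
best_terms <- colnames(s$which)[s$which[best, ]]
print(best_terms)

# Each row of the plot is one model; shaded cells mark the terms it includes.
plot(subsets, scale = "adjr2")
```

Adjusted R-squared is only one possible criterion; summary(subsets) also exposes cp and bic if a different one is preferred.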

zstates<-as.data.frame(scale(states))  # scale() standardizes each column to mean 0 and sd 1
zfit<-lm(Murder~Population+Illiteracy+Income+Frost,data = zstates)
coef(zfit)
  (Intercept)    Population    Illiteracy        Income         Frost
-2.054026e-16  2.705095e-01  6.840496e-01  1.072372e-02  8.185407e-03
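
The standardized coefficients do not require refitting on scaled data: each one equals the raw coefficient multiplied by sd(x) / sd(y). The sketch below checks that identity with base R; fit_raw, fit_std, and beta_manual are illustrative names.

```r
states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])

fit_raw <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = states)
fit_std <- lm(Murder ~ Population + Illiteracy + Income + Frost,
              data = as.data.frame(scale(states)))

# Standard deviation of every column, response included.
sds <- sapply(states, sd)

# beta_j = b_j * sd(x_j) / sd(y), skipping the intercept.
slopes <- coef(fit_raw)[-1]
beta_manual <- slopes * sds[names(slopes)] / sds["Murder"]

print(cbind(manual = beta_manual, refit = coef(fit_std)[-1]))
```

Because all predictors are now on the same scale, the magnitudes can be compared directly: Illiteracy has by far the largest standardized effect in the output above.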

confint(fit)
                    2.5 %       97.5 %
(Intercept) -6.552191e+00 9.0213182149
Population   4.136397e-05 0.0004059867
Illiteracy   2.381799e+00 5.9038743192
Income      -1.312611e-03 0.0014414600
Frost       -1.966781e-02 0.0208304170
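
For an lm fit, these intervals are the usual t-based ones: estimate ± t quantile × standard error, with the residual degrees of freedom. A short base R sketch that reproduces them by hand (fit_ci and ci_manual are illustrative names):

```r
states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])
fit_ci <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = states)

est   <- coef(fit_ci)
se    <- sqrt(diag(vcov(fit_ci)))          # standard errors of the coefficients
tcrit <- qt(0.975, df.residual(fit_ci))    # two-sided 95% critical value

ci_manual <- cbind(lower = est - tcrit * se, upper = est + tcrit * se)
print(ci_manual)
```

An interval that contains 0, as for Income and Frost above, matches a non-significant t test at the 5% level.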

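# Q-Q plot of the studentized residuals; click a point to label it.
# Note: recent versions of car expect id = list(method = 'identify', labels = row.names(states))
# in place of the labels and id.method arguments.
qqPlot(fit,labels = row.names(states),id.method = 'identify',simulate = TRUE)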

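states['Nevada',]  # the point that stands out in the Q-Q plot
       Murder Population Illiteracy Income Frost
Nevada   11.5        590        0.5   5149   188
fitted(fit)['Nevada']  # predicted murder rate, far below the observed 11.5
  Nevada
3.878958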
outlierTest(fit)  # or test for outliers directly
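
As a rough cross-check that does not need car, the same idea can be sketched in base R: take the largest absolute studentized residual and apply a Bonferroni correction to its two-sided t-test p-value. The variable names are illustrative, and this is a sketch of the approach rather than a copy of the package's code.

```r
states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])
fit_out <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = states)

# Externally studentized residuals, one per state.
rs <- rstudent(fit_out)
worst <- which.max(abs(rs))

# Two-sided p-value; one degree of freedom is lost because the point is left out.
p_raw  <- 2 * pt(abs(rs[worst]), df = df.residual(fit_out) - 1, lower.tail = FALSE)

# Bonferroni adjustment for having picked the most extreme of n residuals.
p_bonf <- min(1, p_raw * length(rs))

print(c(rstudent = unname(rs[worst]), p_raw = unname(p_raw), p_bonf = p_bonf))
print(names(worst))
```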
The car package has several functions for checking the independence of errors, linearity, and homoscedasticity:
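library(car)
durbinWatsonTest(fit)  # independence of errors (Durbin-Watson test for autocorrelation)
crPlots(fit)           # linearity (component-plus-residual plots)
ncvTest(fit)           # homoscedasticity (score test for non-constant error variance)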

#install.packages('gvlma')
library(gvlma)
gvmodel<-gvlma(fit)  # global validation of the linear model assumptions
summary(gvmodel)

sqrt(vif(fit))  # multicollinearity check; sqrt(vif) > 2 is a common rule of thumb for a problem
Population Illiteracy     Income      Frost
  1.115922   1.471682   1.160096   1.443103
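
The variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The base R sketch below computes it by hand for comparison with the values above; predictors and vif_manual are illustrative names.

```r
states <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])
predictors <- c("Population", "Illiteracy", "Income", "Frost")

vif_manual <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  # Regress predictor v on the remaining predictors.
  r2 <- summary(lm(reformulate(others, response = v), data = states))$r.squared
  1 / (1 - r2)
})

print(sqrt(vif_manual))      # comparable to sqrt(vif(fit))
print(sqrt(vif_manual) > 2)  # rule-of-thumb flag for problematic collinearity
```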

attach(women)
plot(height,weight)

fit<-lm(weight~height+I(height^2))  # include a quadratic term in height
summary(fit)

Call:
lm(formula = weight ~ height + I(height^2))

Residuals:
     Min       1Q   Median       3Q      Max
-0.50941 -0.29611 -0.00941  0.28615  0.59706

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 261.87818   25.19677  10.393 2.36e-07 ***
height       -7.34832    0.77769  -9.449 6.58e-07 ***
I(height^2)   0.08306    0.00598  13.891 9.32e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3841 on 12 degrees of freedom
Multiple R-squared:  0.9995,    Adjusted R-squared:  0.9994
F-statistic: 1.139e+04 on 2 and 12 DF,  p-value: < 2.2e-16

lines(height,fitted(fit))
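
To judge whether the quadratic term is worth keeping, the straight-line and quadratic fits can be compared as nested models. A minimal sketch in base R; it passes data = women instead of relying on attach(), and fit_lin and fit_quad are illustrative names.

```r
fit_lin  <- lm(weight ~ height, data = women)
fit_quad <- lm(weight ~ height + I(height^2), data = women)

# F test for the added I(height^2) term.
cmp <- anova(fit_lin, fit_quad)
print(cmp)
p_quad <- cmp[2, "Pr(>F)"]

# Residual standard errors of the two fits, for a quick sense of the improvement.
print(c(linear = sigma(fit_lin), quadratic = sigma(fit_quad)))
```

With a single added term, this F test is equivalent to the t test on I(height^2) shown in the summary above.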

library(car)

attach(mtcars)
fit<-lm(mpg~hp+wt+hp:wt)
summary(fit)

Call:
lm(formula = mpg ~ hp + wt + hp:wt)

Residuals:
    Min      1Q  Median      3Q     Max
-3.0632 -1.6491 -0.7362  1.4211  4.5513

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 49.80842    3.60516  13.816 5.01e-14 ***
hp          -0.12010    0.02470  -4.863 4.04e-05 ***
wt          -8.21662    1.26971  -6.471 5.20e-07 ***
hp:wt        0.02785    0.00742   3.753 0.000811 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.153 on 28 degrees of freedom
Multiple R-squared:  0.8848,	Adjusted R-squared:  0.8724
F-statistic: 71.66 on 3 and 28 DF,  p-value: 2.981e-13
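
A significant hp:wt term means the effect of horsepower on mpg depends on the car's weight: the slope for hp is b_hp + b_hp:wt × wt. The sketch below evaluates that slope at three weights. The values 2.2, 3.2, and 4.2 (in 1000 lbs) are illustrative choices, and fit_int is an illustrative name.

```r
fit_int <- lm(mpg ~ hp + wt + hp:wt, data = mtcars)
b <- coef(fit_int)

# Change in mpg per extra unit of hp, evaluated at each chosen weight.
wt_values <- c(2.2, 3.2, 4.2)
hp_slope  <- b[["hp"]] + b[["hp:wt"]] * wt_values
names(hp_slope) <- paste0("wt=", wt_values)

print(round(hp_slope, 4))
```

Because the interaction coefficient is positive, the negative effect of horsepower on mpg shrinks toward zero as weight increases.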
