I. Simple Linear Regression
Read in the data and inspect its structure
JD.com stock price data.
> a<-read.csv(file.choose())
> head(a)
  name      time opening_price closing_price low_price high_price   volume
1   JD  3-Jan-17         25.95         25.82     25.64      26.11  8275300
2   JD  4-Jan-17         26.05         25.85     25.58      26.08  7862800
3   JD  5-Jan-17         26.15         26.30     26.05      26.80 10205600
4   JD  6-Jan-17         26.30         26.27     25.92      26.41  6234300
5   JD  9-Jan-17         26.64         26.26     26.14      26.95  8071500
6   JD 10-Jan-17         26.30         26.90     26.25      27.10 20417400
> dim(a)
[1] 71 7
> colnames(a)
[1] "name" "time" "opening_price" "closing_price" "low_price" "high_price" "volume"
Draw a scatter plot with the fitted regression line
> library(ggplot2)
> qplot(opening_price,closing_price,data=a)+geom_smooth(method='lm')
Fit the regression line
> a.lm<-lm(closing_price~opening_price,data=a)
> a.lm
Call:
lm(formula = closing_price ~ opening_price, data = a)
Coefficients:
(Intercept) opening_price
0.9176 0.9697
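With a single predictor, the least-squares coefficients that lm() reports have a closed form: slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). The sketch below checks this on the six rows printed by head(a) only. That is an assumption made for illustration, since the full 71-row file is not reproduced here, so its numbers will differ from the 0.9176 and 0.9697 shown above.

```r
# Only the six rows shown by head(a); the full data set has 71 rows.
opening_price <- c(25.95, 26.05, 26.15, 26.30, 26.64, 26.30)
closing_price <- c(25.82, 25.85, 26.30, 26.27, 26.26, 26.90)

# Closed-form least-squares estimates for one predictor
slope     <- cov(opening_price, closing_price) / var(opening_price)
intercept <- mean(closing_price) - slope * mean(opening_price)

# The same estimates obtained from lm()
fit <- lm(closing_price ~ opening_price)
print(c(intercept = intercept, slope = slope))
print(coef(fit))
```

The two printed vectors should agree up to floating-point error.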
> summary(a.lm) # view the detailed results
Call:
lm(formula = closing_price ~ opening_price, data = a)
Residuals:
Min 1Q Median 3Q Max
-1.51096 -0.24039 -0.02012 0.24051 0.74053
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.91763    0.70498   1.302    0.197
opening_price  0.96968    0.02364  41.020   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3815 on 69 degrees of freedom
Multiple R-squared: 0.9606, Adjusted R-squared: 0.96
F-statistic: 1683 on 1 and 69 DF, p-value: < 2.2e-16
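The figures in this summary are tied together by simple identities, which makes for a quick sanity check: the t value is the estimate divided by its standard error, with one predictor the F-statistic equals the squared t value of the slope, and the adjusted R-squared follows from R-squared and the sample size. A minimal sketch using only the rounded numbers printed above (small discrepancies come from that rounding):

```r
# Values copied from the summary(a.lm) output and from dim(a)
estimate  <- 0.96968   # slope estimate
std_error <- 0.02364   # standard error of the slope
r_squared <- 0.9606    # Multiple R-squared
n         <- 71        # number of rows
p         <- 1         # number of predictors

t_value <- estimate / std_error                    # close to the printed 41.020
f_value <- t_value^2                               # close to the printed 1683
df_res  <- n - p - 1                               # residual degrees of freedom: 69
adj_r2  <- 1 - (1 - r_squared) * (n - 1) / df_res  # close to the printed 0.96

print(c(t = t_value, F = f_value, df = df_res, adj_r2 = adj_r2))
```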
From the results above, the fitted regression equation is:
closing_price = 0.9176 + 0.9697 × opening_price
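To show how the equation is used, the sketch below plugs in a hypothetical opening price of 26.00 (a value chosen only for illustration, not taken from the data) together with the rounded coefficients.

```r
# Rounded coefficients from the fitted equation
b0 <- 0.9176   # intercept
b1 <- 0.9697   # slope on opening_price

# Hypothetical opening price, for illustration only
new_open <- 26.00

predicted_close <- b0 + b1 * new_open
print(predicted_close)
```

With the model object at hand, predict(a.lm, newdata = data.frame(opening_price = 26)) does the same calculation using the unrounded coefficients, so its result can differ slightly in the last digits.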
Compute the fitted values and plot them against the observed values.
> a.pred<-predict(a.lm)
> qplot(a$closing_price,a.pred)+geom_smooth(method='lm')
Compute the residuals and plot a histogram of them.
> a.res<-residuals(a.lm)
> qplot(a.res,binwidth=0.1,color=I('black'),fill=I('green'))
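A residual is simply the observed value minus the fitted value, and in a model with an intercept the residuals sum to zero, which is why the histogram is centered near 0. A small sketch confirming both facts on the six rows printed by head(a) (an illustrative subset; the names a6 and fit6 are hypothetical and not part of the original session):

```r
# Six-row illustrative subset taken from the head(a) output
a6 <- data.frame(
  opening_price = c(25.95, 26.05, 26.15, 26.30, 26.64, 26.30),
  closing_price = c(25.82, 25.85, 26.30, 26.27, 26.26, 26.90)
)

fit6 <- lm(closing_price ~ opening_price, data = a6)

# Residuals computed by hand: observed minus fitted
res_manual <- a6$closing_price - predict(fit6)

# Compare with residuals() and check that they sum to (numerically) zero
print(all.equal(unname(residuals(fit6)), unname(res_manual)))
print(sum(residuals(fit6)))
```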
II. Multiple Regression Analysis
1. Read in the data
Public bike-sharing usage data.
> bike<-read.csv(file.choose())
> head(bike)
  instant     dteday season yr mnth holiday weekday workingday weathersit     temp    atemp      hum windspeed casual
1       1 2011-01-01      1  0    1       0       6          0          2 0.344167 0.363625 0.805833 0.1604460    331
2       2 2011-01-02      1  0    1       0       0          0          2 0.363478 0.353739 0.696087 0.2485390    131
3       3 2011-01-03      1  0    1       0       1          1          1 0.196364 0.189405 0.437273 0.2483090    120
4       4 2011-01-04      1  0    1       0       2          1