Multiple Regression Analysis

Loading the package

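In the R GUI, open the Packages menu, choose Load package, and select MPV (the package that provides the table.b11 data set).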
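The same step can be done from the console instead of the menu. This is a minimal sketch that assumes the MPV package is already installed from CRAN; the variable name mpv_available is only illustrative.

```r
# Check whether MPV is installed before trying to attach it.
mpv_available <- requireNamespace("MPV", quietly = TRUE)

if (mpv_available) {
  library(MPV)        # attaches the package, as the menu entry does
  data <- table.b11   # same assignment as in the session below
  str(data)           # quick look at the column types
} else {
  message("Package MPV is not installed; run install.packages(\"MPV\") first.")
}
```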

Importing the data

> data<-table.b11
> data
   Clarity Aroma Body Flavor Oakiness Quality Region
1      1.0   3.3  2.8    3.1      4.1     9.8      1
2      1.0   4.4  4.9    3.5      3.9    12.6      1
3      1.0   3.9  5.3    4.8      4.7    11.9      1
4      1.0   3.9  2.6    3.1      3.6    11.1      1
5      1.0   5.6  5.1    5.5      5.1    13.3      1
6      1.0   4.6  4.7    5.0      4.1    12.8      1
7      1.0   4.8  4.8    4.8      3.3    12.8      1
8      1.0   5.3  4.5    4.3      5.2    12.0      1
9      1.0   4.3  4.3    3.9      2.9    13.6      3
10     1.0   4.3  3.9    4.7      3.9    13.9      1
11     1.0   5.1  4.3    4.5      3.6    14.4      3
12     0.5   3.3  5.4    4.3      3.6    12.3      2
13     0.8   5.9  5.7    7.0      4.1    16.1      3
14     0.7   7.7  6.6    6.7      3.7    16.1      3
15     1.0   7.1  4.4    5.8      4.1    15.5      3
16     0.9   5.5  5.6    5.6      4.4    15.5      3
17     1.0   6.3  5.4    4.8      4.6    13.8      3
18     1.0   5.0  5.5    5.5      4.1    13.8      3
19     1.0   4.6  4.1    4.3      3.1    11.3      1
20     0.9   3.4  5.0    3.4      3.4     7.9      2
21     0.9   6.4  5.4    6.6      4.8    15.1      3
22     1.0   5.5  5.3    5.3      3.8    13.5      3
23     0.7   4.7  4.1    5.0      3.7    10.8      2
24     0.7   4.1  4.0    4.1      4.0     9.5      2
25     1.0   6.0  5.4    5.7      4.7    12.7      3
26     1.0   4.3  4.6    4.7      4.9    11.6      2
27     1.0   3.9  4.0    5.1      5.1    11.7      1
28     1.0   5.1  4.9    5.0      5.1    11.9      2
29     1.0   3.9  4.4    5.0      4.4    10.8      2
30     1.0   4.5  3.7    2.9      3.9     8.5      2
31     1.0   5.2  4.3    5.0      6.0    10.7      2
32     0.8   4.2  3.8    3.0      4.7     9.1      1
33     1.0   3.3  3.5    4.3      4.5    12.1      1
34     1.0   6.8  5.0    6.0      5.2    14.9      3
35     0.8   5.0  5.7    5.5      4.8    13.5      1
36     0.8   3.5  4.7    4.2      3.3    12.2      1
37     0.8   4.3  5.5    3.5      5.8    10.3      1
38     0.8   5.2  4.8    5.7      3.5    13.2      1

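Renaming the columns

Only six names are assigned to seven columns, so the seventh column (Region) is printed under the header NA. It is not used in the models below.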

>  colnames(data)<-c("x1","x2","x3","x4","x5","y")
> data
    x1  x2  x3  x4  x5    y NA
1  1.0 3.3 2.8 3.1 4.1  9.8  1
2  1.0 4.4 4.9 3.5 3.9 12.6  1
3  1.0 3.9 5.3 4.8 4.7 11.9  1
4  1.0 3.9 2.6 3.1 3.6 11.1  1
5  1.0 5.6 5.1 5.5 5.1 13.3  1
6  1.0 4.6 4.7 5.0 4.1 12.8  1
7  1.0 4.8 4.8 4.8 3.3 12.8  1
8  1.0 5.3 4.5 4.3 5.2 12.0  1
9  1.0 4.3 4.3 3.9 2.9 13.6  3
10 1.0 4.3 3.9 4.7 3.9 13.9  1
11 1.0 5.1 4.3 4.5 3.6 14.4  3
12 0.5 3.3 5.4 4.3 3.6 12.3  2
13 0.8 5.9 5.7 7.0 4.1 16.1  3
14 0.7 7.7 6.6 6.7 3.7 16.1  3
15 1.0 7.1 4.4 5.8 4.1 15.5  3
16 0.9 5.5 5.6 5.6 4.4 15.5  3
17 1.0 6.3 5.4 4.8 4.6 13.8  3
18 1.0 5.0 5.5 5.5 4.1 13.8  3
19 1.0 4.6 4.1 4.3 3.1 11.3  1
20 0.9 3.4 5.0 3.4 3.4  7.9  2
21 0.9 6.4 5.4 6.6 4.8 15.1  3
22 1.0 5.5 5.3 5.3 3.8 13.5  3
23 0.7 4.7 4.1 5.0 3.7 10.8  2
24 0.7 4.1 4.0 4.1 4.0  9.5  2
25 1.0 6.0 5.4 5.7 4.7 12.7  3
26 1.0 4.3 4.6 4.7 4.9 11.6  2
27 1.0 3.9 4.0 5.1 5.1 11.7  1
28 1.0 5.1 4.9 5.0 5.1 11.9  2
29 1.0 3.9 4.4 5.0 4.4 10.8  2
30 1.0 4.5 3.7 2.9 3.9  8.5  2
31 1.0 5.2 4.3 5.0 6.0 10.7  2
32 0.8 4.2 3.8 3.0 4.7  9.1  1
33 1.0 3.3 3.5 4.3 4.5 12.1  1
34 1.0 6.8 5.0 6.0 5.2 14.9  3
35 0.8 5.0 5.7 5.5 4.8 13.5  1
36 0.8 3.5 4.7 4.2 3.3 12.2  1
37 0.8 4.3 5.5 3.5 5.8 10.3  1
38 0.8 5.2 4.8 5.7 3.5 13.2  1
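In the transcript above, the last column header prints as NA because six names were assigned to a seven-column data frame. One way to avoid that, sketched here on three rows copied from the data (the name wine is an illustrative choice), is to rename only the first six columns by position:

```r
# Three rows copied from the data above, with the original column names.
wine <- data.frame(
  Clarity  = c(1.0, 1.0, 1.0),
  Aroma    = c(3.3, 4.4, 3.9),
  Body     = c(2.8, 4.9, 5.3),
  Flavor   = c(3.1, 3.5, 4.8),
  Oakiness = c(4.1, 3.9, 4.7),
  Quality  = c(9.8, 12.6, 11.9),
  Region   = c(1, 1, 1)
)

# Rename columns 1 to 6 only; Region keeps its name instead of becoming NA.
names(wine)[1:6] <- c("x1", "x2", "x3", "x4", "x5", "y")
print(names(wine))
```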

Fitting the linear regression model, using data as the data set

> lma<-lm(y~x1+x2+x3+x4+x5,data=data)
> summary(lma)

Call:
lm(formula = y ~ x1 + x2 + x3 + x4 + x5, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.85552 -0.57448 -0.07092  0.67275  1.68093 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.9969     2.2318   1.791 0.082775 .  
x1            2.3395     1.7348   1.349 0.186958    
x2            0.4826     0.2724   1.771 0.086058 .  
x3            0.2732     0.3326   0.821 0.417503    
x4            1.1683     0.3045   3.837 0.000552 ***
x5           -0.6840     0.2712  -2.522 0.016833 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.163 on 32 degrees of freedom
Multiple R-squared:  0.7206,    Adjusted R-squared:  0.6769 
F-statistic: 16.51 on 5 and 32 DF,  p-value: 4.703e-08

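Interpretation: the fitted regression equation is y = 3.9969 + 2.3395 x1 + 0.4826 x2 + 0.2732 x3 + 1.1683 x4 - 0.6840 x5.
Overall significance test: F = 16.51 with p-value = 4.703*10^(-8) < 0.01, so x1, x2, x3, x4, x5 jointly have a highly significant linear effect on y. The t-tests for the individual regression coefficients are: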

Variable  x1        x2        x3        x4        x5
p-value   0.186958  0.086058  0.417503  0.000552  0.016833
t value   1.349     1.771     0.821     3.837     -2.522

At significance level α = 0.05, only the coefficients of x4 and x5 are significantly different from 0.
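As a hedged sketch, the coefficient table can also be read programmatically rather than copied by hand. The block below re-enters the 38 rows printed earlier with read.table so that it runs without the MPV package; the names wine, coef_tab, alpha, and signif_05 are illustrative and not part of the original session.

```r
# The 38 observations printed above (Region omitted, columns already renamed).
wine <- read.table(header = TRUE, text = "
x1  x2  x3  x4  x5  y
1.0 3.3 2.8 3.1 4.1  9.8
1.0 4.4 4.9 3.5 3.9 12.6
1.0 3.9 5.3 4.8 4.7 11.9
1.0 3.9 2.6 3.1 3.6 11.1
1.0 5.6 5.1 5.5 5.1 13.3
1.0 4.6 4.7 5.0 4.1 12.8
1.0 4.8 4.8 4.8 3.3 12.8
1.0 5.3 4.5 4.3 5.2 12.0
1.0 4.3 4.3 3.9 2.9 13.6
1.0 4.3 3.9 4.7 3.9 13.9
1.0 5.1 4.3 4.5 3.6 14.4
0.5 3.3 5.4 4.3 3.6 12.3
0.8 5.9 5.7 7.0 4.1 16.1
0.7 7.7 6.6 6.7 3.7 16.1
1.0 7.1 4.4 5.8 4.1 15.5
0.9 5.5 5.6 5.6 4.4 15.5
1.0 6.3 5.4 4.8 4.6 13.8
1.0 5.0 5.5 5.5 4.1 13.8
1.0 4.6 4.1 4.3 3.1 11.3
0.9 3.4 5.0 3.4 3.4  7.9
0.9 6.4 5.4 6.6 4.8 15.1
1.0 5.5 5.3 5.3 3.8 13.5
0.7 4.7 4.1 5.0 3.7 10.8
0.7 4.1 4.0 4.1 4.0  9.5
1.0 6.0 5.4 5.7 4.7 12.7
1.0 4.3 4.6 4.7 4.9 11.6
1.0 3.9 4.0 5.1 5.1 11.7
1.0 5.1 4.9 5.0 5.1 11.9
1.0 3.9 4.4 5.0 4.4 10.8
1.0 4.5 3.7 2.9 3.9  8.5
1.0 5.2 4.3 5.0 6.0 10.7
0.8 4.2 3.8 3.0 4.7  9.1
1.0 3.3 3.5 4.3 4.5 12.1
1.0 6.8 5.0 6.0 5.2 14.9
0.8 5.0 5.7 5.5 4.8 13.5
0.8 3.5 4.7 4.2 3.3 12.2
0.8 4.3 5.5 3.5 5.8 10.3
0.8 5.2 4.8 5.7 3.5 13.2
")

fit <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = wine)

# Matrix with columns Estimate, Std. Error, t value, Pr(>|t|).
coef_tab <- coef(summary(fit))
print(round(coef_tab, 4))

# Predictors whose t-test p-value falls below the chosen level.
alpha <- 0.05
signif_05 <- rownames(coef_tab)[coef_tab[, "Pr(>|t|)"] < alpha]
signif_05 <- setdiff(signif_05, "(Intercept)")
print(signif_05)
```

Each t value in that table is the estimate divided by its standard error, for example 1.1683 / 0.3045 ≈ 3.837 for x4.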

Stepwise regression in R

> lm.step<-step(lma,direction="both")
Start:  AIC=16.92
y ~ x1 + x2 + x3 + x4 + x5

       Df Sum of Sq    RSS    AIC
- x3    1    0.9118 44.160 15.709
<none>              43.248 16.916
- x1    1    2.4577 45.706 17.016
- x2    1    4.2397 47.488 18.470
- x5    1    8.5978 51.846 21.806
- x4    1   19.8986 63.147 29.299

Step:  AIC=15.71
y ~ x1 + x2 + x4 + x5

       Df Sum of Sq    RSS    AIC
- x1    1    1.6936 45.853 15.139
<none>              44.160 15.709
+ x3    1    0.9118 43.248 16.916
- x2    1    5.3545 49.514 18.058
- x5    1    8.0807 52.241 20.094
- x4    1   27.3280 71.488 32.014

Step:  AIC=15.14
y ~ x2 + x4 + x5

       Df Sum of Sq    RSS    AIC
<none>              45.853 15.139
+ x1    1    1.6936 44.160 15.709
+ x3    1    0.1477 45.706 17.016
- x2    1    6.6026 52.456 18.251
- x5    1    6.9989 52.852 18.537
- x4    1   25.6888 71.542 30.043

Stepwise selection stops at the model with the lowest AIC, which is the regression of y on x2, x4, and x5.
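The AIC values in the step() trace can be reproduced from the RSS column. For lm fits, step() reports n*log(RSS/n) + 2k, where k is the number of estimated coefficients including the intercept. A minimal sketch using the numbers printed above (the helper name step_aic is an assumption for illustration):

```r
n <- 38  # number of observations in the data set

# Criterion printed by step() for linear models:
# n * log(RSS / n) + 2 * k, with k = number of estimated coefficients.
step_aic <- function(rss, k, n) n * log(rss / n) + 2 * k

aic_full  <- step_aic(43.248, k = 6, n = n)  # y ~ x1 + x2 + x3 + x4 + x5
aic_no_x3 <- step_aic(44.160, k = 5, n = n)  # y ~ x1 + x2 + x4 + x5
aic_final <- step_aic(45.853, k = 4, n = n)  # y ~ x2 + x4 + x5

print(round(c(full = aic_full, no_x3 = aic_no_x3, final = aic_final), 2))
```

Dropping a weak predictor raises the RSS slightly but saves 2 in the penalty term, which is why the smaller models score better here.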

> summary(lm.step)

Call:
lm(formula = y ~ x2 + x4 + x5, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.5707 -0.6256  0.1521  0.6467  1.7741 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   6.4672     1.3328   4.852 2.67e-05 ***
x2            0.5801     0.2622   2.213 0.033740 *  
x4            1.1997     0.2749   4.364 0.000113 ***
x5           -0.6023     0.2644  -2.278 0.029127 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.161 on 34 degrees of freedom
Multiple R-squared:  0.7038,    Adjusted R-squared:  0.6776 
F-statistic: 26.92 on 3 and 34 DF,  p-value: 4.203e-09

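Interpretation: the regression equation of y on x2, x4, x5 is y = 6.4672 + 0.5801 x2 + 1.1997 x4 - 0.6023 x5.
F-test: F = 26.92 with p-value = 4.203*10^(-9) < 0.01, so x2, x4, x5 jointly have a highly significant linear effect on y. The t-tests for the regression coefficients are: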

Variable  x2        x4        x5
t value   2.213     4.364     -2.278
p-value   0.033740  0.000113  0.029127

At significance level α = 0.05, the coefficients of x2, x4, and x5 are all significantly different from 0.
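The F statistics of both fits can be cross-checked against the reported R-squared values, since F = (R²/p) / ((1 - R²)/(n - p - 1)) for a model with p predictors and an intercept. The sketch below uses the rounded R² values from the summaries, so the results agree with the printed F values only up to rounding; the helper name f_from_r2 is illustrative.

```r
# Overall F statistic from R-squared, p predictors, n observations.
f_from_r2 <- function(r2, p, n) (r2 / p) / ((1 - r2) / (n - p - 1))

f_full <- f_from_r2(0.7206, p = 5, n = 38)  # full model, 5 and 32 DF
f_step <- f_from_r2(0.7038, p = 3, n = 38)  # stepwise model, 3 and 34 DF

# Upper-tail probabilities of the F distribution give the p-values.
p_full <- pf(f_full, df1 = 5, df2 = 32, lower.tail = FALSE)
p_step <- pf(f_step, df1 = 3, df2 = 34, lower.tail = FALSE)

print(round(c(f_full = f_full, f_step = f_step), 2))
print(c(p_full = p_full, p_step = p_step))
```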

Point and interval estimates for predicting y

> preds<-data.frame(x1=1.1,x2=5.2,x3=5.6,x4=5.5,x5=14)
> predict(lm.step,newdata=preds,interval="confidence",level=0.95)
       fit      lwr      upr
1 7.649586 2.429657 12.86951
> predict(lm.step,newdata=preds,interval="prediction",level=0.95)
       fit      lwr      upr
1 7.649586 1.920927 13.37824
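
The point estimate can be checked by hand by substituting the new values into the stepwise equation (coefficients rounded as printed by summary, so the result agrees only to about three digits):

```latex
\hat{y} = 6.4672 + 0.5801\,(5.2) + 1.1997\,(5.5) - 0.6023\,(14)
        = 6.4672 + 3.0165 + 6.5984 - 8.4322
        \approx 7.65
```

Note that x5 = 14 lies far outside the observed range of x5 in the data (2.9 to 6.0), so this prediction is an extrapolation, which is consistent with both intervals being very wide.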

Interpretation:
Point estimate (fitted mean): 7.649586; 95% confidence interval [2.429657, 12.86951]; 95% prediction interval [1.920927, 13.37824].
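A one-letter typo produces a completely different result here. With newdata mistyped as mewdata, predict() silently ignores the unknown argument and returns fitted values with confidence intervals for all 38 observations of the original data: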

> predict(lm.step,mewdata=preds,interval="c",level=0.95)
         fit       lwr      upr
1   9.631108  8.880601 10.38162
2  10.869583 10.180994 11.55817
3  11.657264 10.952672 12.36186
4  10.280343  9.478989 11.08170
5  13.242323 12.620377 13.86427
6  12.664681 12.205161 13.12420
7  13.022626 12.382474 13.66278
8  11.568423 10.787597 12.34925
9  11.893772 11.048734 12.73881
10 12.251202 11.760227 12.74218
11 12.656057 12.068372 13.24374
12 11.371902 10.574376 12.16943
13 15.818223 14.805646 16.83080
14 16.743462 15.538991 17.94793
15 15.074736 14.102037 16.04744
16 13.725908 13.230446 14.22137
17 13.109785 12.255224 13.96434
18 13.496576 12.964147 14.02900
19 12.427221 11.695597 13.15884
20 10.470656  9.713694 11.22762
21 15.206779 14.397822 16.01574
22 13.727395 13.188910 14.26588
23 12.963623 12.442063 13.48518
24 11.355130 10.874464 11.83580
25 13.955240 13.368040 14.54244
26 11.648877 11.049556 12.24820
27 11.776242 10.872145 12.68034
28 12.352417 11.765997 12.93884
29 12.077900 11.352426 12.80337
30 10.207779  9.207923 11.20763
31 11.868336 10.871301 12.86537
32  9.671853  8.753894 10.58981
33 10.829810 10.039689 11.61993
34 14.478081 13.596799 15.35936
35 13.074948 12.490712 13.65918
36 11.548654 10.772242 12.32507
37  9.667154  8.558959 10.77535
38 14.213933 13.499064 14.92880
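The behaviour above is easy to reproduce and to guard against. The following sketch uses R's built-in cars data set as a stand-in for the wine data, and predict_checked is a hypothetical wrapper written for this illustration, not a function from any package.

```r
# Small model on the built-in cars data (stand-in for the wine model).
fit <- lm(dist ~ speed, data = cars)
new_point <- data.frame(speed = 10)

# Correct spelling: one row, for the new point.
ok <- predict(fit, newdata = new_point, interval = "confidence")

# Misspelled argument: it is absorbed by `...`, so predict() falls back to
# the training data and returns one row per original observation.
typo <- predict(fit, mewdata = new_point, interval = "confidence")

print(nrow(ok))
print(nrow(typo))

# Hypothetical guard: newdata is a required formal argument here, and every
# predictor used by the model must be present in it.
predict_checked <- function(model, newdata, ...) {
  needed <- all.vars(delete.response(terms(model)))
  missing_cols <- setdiff(needed, names(newdata))
  if (length(missing_cols) > 0) {
    stop("newdata is missing columns: ", paste(missing_cols, collapse = ", "))
  }
  predict(model, newdata = newdata, ...)
}

checked <- predict_checked(fit, new_point, interval = "confidence")
```

Because newdata is a named formal argument of the wrapper, calling it with mewdata = ... raises an error instead of quietly returning fitted values for the training data.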
