第四课实验

 x1=round(runif(20,min=50,max=100))
> x1
 [1] 84 72 74 56 72 53 70 75 70 58 89 72 51 91 52 51 70 82 76 74
> x2=round(rnorm(20,mean=50,sd=4))
> x2
 [1] 47 44 45 46 53 51 56 47 47 52 59 47 54 51 51 45 55 54 43 57
> x2=round(rnorm(20,mean=50,sd=10))
> x2=round(rnorm(20,mean=50,sd=4))
> x3=round(rnorm(20,mean=50,sd=10))
> x2
 [1] 46 52 48 56 54 48 47 54 53 49 52 42 47 51 47 49 47 47 52 47
> x3
 [1] 44 48 64 40 59 53 34 45 44 67 42 35 57 52 61 61 42 56 47 40
> plot(x2)
> plot(x3)
> lm.sol=lm(x1~x2)
> lm.sol
Call:
lm(formula = x1 ~ x2)
Coefficients:
(Intercept)           x2 
    56.0520       0.2743 
> summary(lm.sol)
Call:
lm(formula = x1 ~ x2)
Residuals:
    Min      1Q  Median      3Q     Max
-18.490 -12.470   1.413   5.215  20.961
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  56.0520    41.7451   1.343    0.196
x2            0.2743     0.8431   0.325    0.749
Residual standard error: 12.7 on 18 degrees of freedom
Multiple R-squared: 0.005844,   Adjusted R-squared: -0.04939
F-statistic: 0.1058 on 1 and 18 DF,  p-value: 0.7487
> x=data.frame(x1,x2,x3)
> x
   x1 x2 x3
1  84 46 44
2  72 52 48
3  74 48 64
4  56 56 40
5  72 54 59
6  53 48 53
7  70 47 34
8  75 54 45
9  70 53 44
10 58 49 67
11 89 52 42
12 72 42 35
13 51 47 57
14 91 51 52
15 52 47 61
16 51 49 61
17 70 47 42
18 82 47 56
19 76 52 47
20 74 47 40
>
>
>
> lm.sol=lm(x1~x2,data=x)
> lm.sol
Call:
lm(formula = x1 ~ x2, data = x)
Coefficients:
(Intercept)           x2 
    56.0520       0.2743 
> summary(lm.sol)
Call:
lm(formula = x1 ~ x2, data = x)
Residuals:
    Min      1Q  Median      3Q     Max
-18.490 -12.470   1.413   5.215  20.961
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  56.0520    41.7451   1.343    0.196
x2            0.2743     0.8431   0.325    0.749
Residual standard error: 12.7 on 18 degrees of freedom
Multiple R-squared: 0.005844,   Adjusted R-squared: -0.04939
F-statistic: 0.1058 on 1 and 18 DF,  p-value: 0.7487
> lm.sol=lm(x1~x2+x3,data=x)
> lm.sol
Call:
lm(formula = x1 ~ x2 + x3, data = x)
Coefficients:
(Intercept)           x2           x3 
    76.5946       0.3157      -0.4559 
> summary(lm.sol)
Call:
lm(formula = x1 ~ x2 + x3, data = x)
Residuals:
     Min       1Q   Median       3Q      Max
-20.0368  -7.3540  -0.5464   6.7992  22.0118
Coefficients:
            Estimate Std. Error t value Pr(>|t|) 
(Intercept)  76.5946    42.0774   1.820   0.0864 .
x2            0.3157     0.8095   0.390   0.7014 
x3           -0.4559     0.2857  -1.596   0.1290 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 12.18 on 17 degrees of freedom
Multiple R-squared: 0.1354,     Adjusted R-squared: 0.03364
F-statistic: 1.331 on 2 and 17 DF,  p-value: 0.2905
> plot(x,x1)
> plot(x)
> plot(x)可以看到三个变量关系图
错误: 意外的符号在"plot(x)可以看到三个变量关系图"里
>
>
> plot(x1,x2)
>
>
>
> x=iris[which(iris$Species=="setosa"),1:4]
> plot(x)
> x=iris[,1:4]
> plot(x)
> x=iris[which(iris$Species=="setosa"),1:4]
> cor(x)
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length    1.0000000   0.7425467    0.2671758   0.2780984
Sepal.Width     0.7425467   1.0000000    0.1777000   0.2327520
Petal.Length    0.2671758   0.1777000    1.0000000   0.3316300
Petal.Width     0.2780984   0.2327520    0.3316300   1.0000000
> 变量间看相关性,cor函数或者看图plot
>
> 多元的难度在变量的选择上,不是所有变量之间都有相关性
>
>
> s=lm(Fertility~.,data=swiss)
> s
Call:
lm(formula = Fertility ~ ., data = swiss)
Coefficients:
     (Intercept)       Agriculture       Examination         Education 
         66.9152           -0.1721           -0.2580           -0.8709 
        Catholic  Infant.Mortality 
          0.1041            1.0770 
> print(s)
Call:
lm(formula = Fertility ~ ., data = swiss)
Coefficients:
     (Intercept)       Agriculture       Examination         Education 
         66.9152           -0.1721           -0.2580           -0.8709 
        Catholic  Infant.Mortality 
          0.1041            1.0770 
> summrary(s)
错误: 没有"summrary"这个函数
> summary(s)
Call:
lm(formula = Fertility ~ ., data = swiss)
Residuals:
     Min       1Q   Median       3Q      Max
-15.2743  -5.2617   0.5032   4.1198  15.3213
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)   
(Intercept)      66.91518   10.70604   6.250 1.91e-07 ***
Agriculture      -0.17211    0.07030  -2.448  0.01873 * 
Examination      -0.25801    0.25388  -1.016  0.31546   
Education        -0.87094    0.18303  -4.758 2.43e-05 ***
Catholic          0.10412    0.03526   2.953  0.00519 **
Infant.Mortality  1.07705    0.38172   2.822  0.00734 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.165 on 41 degrees of freedom
Multiple R-squared: 0.7067,     Adjusted R-squared: 0.671
F-statistic: 19.76 on 5 and 41 DF,  p-value: 5.594e-10
> sl=step(s,direction="forward")
Start:  AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
    Infant.Mortality
> sl=step(s,direction="backward")
Start:  AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
    Infant.Mortality
                   Df Sum of Sq    RSS    AIC
- Examination       1     53.03 2158.1 189.86
                          2105.0 190.69
- Agriculture       1    307.72 2412.8 195.10
- Infant.Mortality  1    408.75 2513.8 197.03
- Catholic          1    447.71 2552.8 197.75
- Education         1   1162.56 3267.6 209.36
Step:  AIC=189.86
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
                   Df Sum of Sq    RSS    AIC
                          2158.1 189.86
- Agriculture       1    264.18 2422.2 193.29
- Infant.Mortality  1    409.81 2567.9 196.03
- Catholic          1    956.57 3114.6 205.10
- Education         1   2249.97 4408.0 221.43
> step方法减少变量,通过对比AIC的大小来决定
> sl=step(s,direction="both")
Start:  AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
    Infant.Mortality
                   Df Sum of Sq    RSS    AIC
- Examination       1     53.03 2158.1 189.86
                          2105.0 190.69
- Agriculture       1    307.72 2412.8 195.10
- Infant.Mortality  1    408.75 2513.8 197.03
- Catholic          1    447.71 2552.8 197.75
- Education         1   1162.56 3267.6 209.36
Step:  AIC=189.86
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
                   Df Sum of Sq    RSS    AIC
                          2158.1 189.86
+ Examination       1     53.03 2105.0 190.69
- Agriculture       1    264.18 2422.2 193.29
- Infant.Mortality  1    409.81 2567.9 196.03
- Catholic          1    956.57 3114.6 205.10
- Education         1   2249.97 4408.0 221.43
>
>
> 还有种方式是通过因变量和自变量之间的关系图,来决定是否引入二次项或交叉项
>
>
> 有点时候AIC出来的结果,可能summary出来的结果不是太好
> 有时需要去step出来AIC次小的
错误: 找不到对象'有时需要去step出来AIC次小的'
>
>
>
> sl
Call:
lm(formula = Fertility ~ Agriculture + Education + Catholic +
    Infant.Mortality, data = swiss)
Coefficients:
     (Intercept)       Agriculture         Education          Catholic 
         62.1013           -0.1546           -0.9803            0.1247 
Infant.Mortality 
          1.0784 
> drop1(sl)
Single term deletions
Model:
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
                 Df Sum of Sq    RSS    AIC
                        2158.1 189.86
Agriculture       1    264.18 2422.2 193.29
Education         1   2249.97 4408.0 221.43
Catholic          1    956.57 3114.6 205.10
Infant.Mortality  1    409.81 2567.9 196.03
> drop1(s)
Single term deletions
Model:
Fertility ~ Agriculture + Examination + Education + Catholic +
    Infant.Mortality
                 Df Sum of Sq    RSS    AIC
                        2105.0 190.69
Agriculture       1    307.72 2412.8 195.10
Examination       1     53.03 2158.1 189.86
Education         1   1162.56 3267.6 209.36
Catholic          1    447.71 2552.8 197.75
Infant.Mortality  1    408.75 2513.8 197.03
> dro1(s)
错误: 没有"dro1"这个函数
> add1(s)
错误于add1.lm(s) : 範疇里没有项
> add1(sl)
错误于add1.lm(sl) : 範疇里没有项
> drop1和step的direction="backward"方式接近
错误: 意外的符号在"drop1和step的direction="backward"方式接近"里
>
>
>
> shapiro.test(x3)
        Shapiro-Wilk normality test
data:  x3
W = 0.9565, p-value = 0.4773
> shapiro.test(x1)
        Shapiro-Wilk normality test
data:  x1
W = 0.9194, p-value = 0.09638
> shapiro.test(x$x1)
错误于complete.cases(x) : 无法用输入来确定个案数目
> shapiro.test(x2)
        Shapiro-Wilk normality test
data:  x2
W = 0.9399, p-value = 0.239
> 正态分布检验shapiro.test(),p值大于0.05为正态分布
错误: 意外的','在"正态分布检验shapiro.test(),"里
>
>
> z=lm(iris$Sepal.length~iris$Sepal.Width)
错误于model.frame.default(formula = iris$Sepal.length ~ iris$Sepal.Width,  :
  参数'iris$Sepal.length'的种类(NULL)不对
> z=lm(iris$Sepal.Length~iris$Sepal.Width)
> z
Call:
lm(formula = iris$Sepal.Length ~ iris$Sepal.Width)
Coefficients:
     (Intercept)  iris$Sepal.Width 
          6.5262           -0.2234 
> summary(z)
Call:
lm(formula = iris$Sepal.Length ~ iris$Sepal.Width)
Residuals:
    Min      1Q  Median      3Q     Max
-1.5561 -0.6333 -0.1120  0.5579  2.2226
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)   
(Intercept)        6.5262     0.4789   13.63   <2e-16 ***
iris$Sepal.Width  -0.2234     0.1551   -1.44    0.152   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8251 on 148 degrees of freedom
Multiple R-squared: 0.01382,    Adjusted R-squared: 0.007159
F-statistic: 2.074 on 1 and 148 DF,  p-value: 0.1519
> plot(z)
等待页面改变的确认...

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/27573546/viewspace-747340/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/27573546/viewspace-747340/

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值