x1=round(runif(20,min=50,max=100))
> x1
[1] 84 72 74 56 72 53 70 75 70 58 89 72 51 91 52 51 70 82 76 74
> x2=round(rnorm(20,mean=50,sd=4))
> x2
[1] 47 44 45 46 53 51 56 47 47 52 59 47 54 51 51 45 55 54 43 57
> x2=round(rnorm(20,mean=50,sd=10))
> x2=round(rnorm(20,mean=50,sd=4))
> x3=round(rnorm(20,mean=50,sd=10))
> x2
[1] 46 52 48 56 54 48 47 54 53 49 52 42 47 51 47 49 47 47 52 47
> x3
[1] 44 48 64 40 59 53 34 45 44 67 42 35 57 52 61 61 42 56 47 40
> plot(x2)
> plot(x3)
> lm.sol=lm(x1~x2)
> lm.sol
> x1
[1] 84 72 74 56 72 53 70 75 70 58 89 72 51 91 52 51 70 82 76 74
> x2=round(rnorm(20,mean=50,sd=4))
> x2
[1] 47 44 45 46 53 51 56 47 47 52 59 47 54 51 51 45 55 54 43 57
> x2=round(rnorm(20,mean=50,sd=10))
> x2=round(rnorm(20,mean=50,sd=4))
> x3=round(rnorm(20,mean=50,sd=10))
> x2
[1] 46 52 48 56 54 48 47 54 53 49 52 42 47 51 47 49 47 47 52 47
> x3
[1] 44 48 64 40 59 53 34 45 44 67 42 35 57 52 61 61 42 56 47 40
> plot(x2)
> plot(x3)
> lm.sol=lm(x1~x2)
> lm.sol
Call:
lm(formula = x1 ~ x2)
lm(formula = x1 ~ x2)
Coefficients:
(Intercept) x2
56.0520 0.2743
(Intercept) x2
56.0520 0.2743
> summary(lm.sol)
Call:
lm(formula = x1 ~ x2)
lm(formula = x1 ~ x2)
Residuals:
Min 1Q Median 3Q Max
-18.490 -12.470 1.413 5.215 20.961
Min 1Q Median 3Q Max
-18.490 -12.470 1.413 5.215 20.961
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 56.0520 41.7451 1.343 0.196
x2 0.2743 0.8431 0.325 0.749
Estimate Std. Error t value Pr(>|t|)
(Intercept) 56.0520 41.7451 1.343 0.196
x2 0.2743 0.8431 0.325 0.749
Residual standard error: 12.7 on 18 degrees of freedom
Multiple R-squared: 0.005844, Adjusted R-squared: -0.04939
F-statistic: 0.1058 on 1 and 18 DF, p-value: 0.7487
Multiple R-squared: 0.005844, Adjusted R-squared: -0.04939
F-statistic: 0.1058 on 1 and 18 DF, p-value: 0.7487
> x=data.frame(x1,x2,x3)
> x
x1 x2 x3
1 84 46 44
2 72 52 48
3 74 48 64
4 56 56 40
5 72 54 59
6 53 48 53
7 70 47 34
8 75 54 45
9 70 53 44
10 58 49 67
11 89 52 42
12 72 42 35
13 51 47 57
14 91 51 52
15 52 47 61
16 51 49 61
17 70 47 42
18 82 47 56
19 76 52 47
20 74 47 40
>
>
>
> lm.sol=lm(x1~x2,data=x)
> lm.sol
> x
x1 x2 x3
1 84 46 44
2 72 52 48
3 74 48 64
4 56 56 40
5 72 54 59
6 53 48 53
7 70 47 34
8 75 54 45
9 70 53 44
10 58 49 67
11 89 52 42
12 72 42 35
13 51 47 57
14 91 51 52
15 52 47 61
16 51 49 61
17 70 47 42
18 82 47 56
19 76 52 47
20 74 47 40
>
>
>
> lm.sol=lm(x1~x2,data=x)
> lm.sol
Call:
lm(formula = x1 ~ x2, data = x)
lm(formula = x1 ~ x2, data = x)
Coefficients:
(Intercept) x2
56.0520 0.2743
(Intercept) x2
56.0520 0.2743
> summary(lm.sol)
Call:
lm(formula = x1 ~ x2, data = x)
lm(formula = x1 ~ x2, data = x)
Residuals:
Min 1Q Median 3Q Max
-18.490 -12.470 1.413 5.215 20.961
Min 1Q Median 3Q Max
-18.490 -12.470 1.413 5.215 20.961
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 56.0520 41.7451 1.343 0.196
x2 0.2743 0.8431 0.325 0.749
Estimate Std. Error t value Pr(>|t|)
(Intercept) 56.0520 41.7451 1.343 0.196
x2 0.2743 0.8431 0.325 0.749
Residual standard error: 12.7 on 18 degrees of freedom
Multiple R-squared: 0.005844, Adjusted R-squared: -0.04939
F-statistic: 0.1058 on 1 and 18 DF, p-value: 0.7487
Multiple R-squared: 0.005844, Adjusted R-squared: -0.04939
F-statistic: 0.1058 on 1 and 18 DF, p-value: 0.7487
> lm.sol=lm(x1~x2+x3,data=x)
> lm.sol
> lm.sol
Call:
lm(formula = x1 ~ x2 + x3, data = x)
lm(formula = x1 ~ x2 + x3, data = x)
Coefficients:
(Intercept) x2 x3
76.5946 0.3157 -0.4559
(Intercept) x2 x3
76.5946 0.3157 -0.4559
> summary(lm.sol)
Call:
lm(formula = x1 ~ x2 + x3, data = x)
lm(formula = x1 ~ x2 + x3, data = x)
Residuals:
Min 1Q Median 3Q Max
-20.0368 -7.3540 -0.5464 6.7992 22.0118
Min 1Q Median 3Q Max
-20.0368 -7.3540 -0.5464 6.7992 22.0118
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 76.5946 42.0774 1.820 0.0864 .
x2 0.3157 0.8095 0.390 0.7014
x3 -0.4559 0.2857 -1.596 0.1290
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Estimate Std. Error t value Pr(>|t|)
(Intercept) 76.5946 42.0774 1.820 0.0864 .
x2 0.3157 0.8095 0.390 0.7014
x3 -0.4559 0.2857 -1.596 0.1290
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 12.18 on 17 degrees of freedom
Multiple R-squared: 0.1354, Adjusted R-squared: 0.03364
F-statistic: 1.331 on 2 and 17 DF, p-value: 0.2905
Multiple R-squared: 0.1354, Adjusted R-squared: 0.03364
F-statistic: 1.331 on 2 and 17 DF, p-value: 0.2905
> plot(x,x1)
> plot(x)
> plot(x)可以看到三个变量关系图
错误: 意外的符号在"plot(x)可以看到三个变量关系图"里
>
>
> plot(x1,x2)
>
>
>
> x=iris[which(iris$Species=="setosa"),1:4]
> plot(x)
> x=iris[,1:4]
> plot(x)
> x=iris[which(iris$Species=="setosa"),1:4]
> cor(x)
Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length 1.0000000 0.7425467 0.2671758 0.2780984
Sepal.Width 0.7425467 1.0000000 0.1777000 0.2327520
Petal.Length 0.2671758 0.1777000 1.0000000 0.3316300
Petal.Width 0.2780984 0.2327520 0.3316300 1.0000000
> 变量间看相关性,cor函数或者看图plot
>
> 多元的难度在变量的选择上,不是所有变量之间都有相关性
>
>
> s=lm(Fertility~.,data=swiss)
> s
> plot(x)
> plot(x)可以看到三个变量关系图
错误: 意外的符号在"plot(x)可以看到三个变量关系图"里
>
>
> plot(x1,x2)
>
>
>
> x=iris[which(iris$Species=="setosa"),1:4]
> plot(x)
> x=iris[,1:4]
> plot(x)
> x=iris[which(iris$Species=="setosa"),1:4]
> cor(x)
Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length 1.0000000 0.7425467 0.2671758 0.2780984
Sepal.Width 0.7425467 1.0000000 0.1777000 0.2327520
Petal.Length 0.2671758 0.1777000 1.0000000 0.3316300
Petal.Width 0.2780984 0.2327520 0.3316300 1.0000000
> 变量间看相关性,cor函数或者看图plot
>
> 多元的难度在变量的选择上,不是所有变量之间都有相关性
>
>
> s=lm(Fertility~.,data=swiss)
> s
Call:
lm(formula = Fertility ~ ., data = swiss)
lm(formula = Fertility ~ ., data = swiss)
Coefficients:
(Intercept) Agriculture Examination Education
66.9152 -0.1721 -0.2580 -0.8709
Catholic Infant.Mortality
0.1041 1.0770
(Intercept) Agriculture Examination Education
66.9152 -0.1721 -0.2580 -0.8709
Catholic Infant.Mortality
0.1041 1.0770
> print(s)
Call:
lm(formula = Fertility ~ ., data = swiss)
lm(formula = Fertility ~ ., data = swiss)
Coefficients:
(Intercept) Agriculture Examination Education
66.9152 -0.1721 -0.2580 -0.8709
Catholic Infant.Mortality
0.1041 1.0770
(Intercept) Agriculture Examination Education
66.9152 -0.1721 -0.2580 -0.8709
Catholic Infant.Mortality
0.1041 1.0770
> summrary(s)
错误: 没有"summrary"这个函数
> summary(s)
错误: 没有"summrary"这个函数
> summary(s)
Call:
lm(formula = Fertility ~ ., data = swiss)
lm(formula = Fertility ~ ., data = swiss)
Residuals:
Min 1Q Median 3Q Max
-15.2743 -5.2617 0.5032 4.1198 15.3213
Min 1Q Median 3Q Max
-15.2743 -5.2617 0.5032 4.1198 15.3213
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 66.91518 10.70604 6.250 1.91e-07 ***
Agriculture -0.17211 0.07030 -2.448 0.01873 *
Examination -0.25801 0.25388 -1.016 0.31546
Education -0.87094 0.18303 -4.758 2.43e-05 ***
Catholic 0.10412 0.03526 2.953 0.00519 **
Infant.Mortality 1.07705 0.38172 2.822 0.00734 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Estimate Std. Error t value Pr(>|t|)
(Intercept) 66.91518 10.70604 6.250 1.91e-07 ***
Agriculture -0.17211 0.07030 -2.448 0.01873 *
Examination -0.25801 0.25388 -1.016 0.31546
Education -0.87094 0.18303 -4.758 2.43e-05 ***
Catholic 0.10412 0.03526 2.953 0.00519 **
Infant.Mortality 1.07705 0.38172 2.822 0.00734 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.165 on 41 degrees of freedom
Multiple R-squared: 0.7067, Adjusted R-squared: 0.671
F-statistic: 19.76 on 5 and 41 DF, p-value: 5.594e-10
Multiple R-squared: 0.7067, Adjusted R-squared: 0.671
F-statistic: 19.76 on 5 and 41 DF, p-value: 5.594e-10
> sl=step(s,direction="forward")
Start: AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
Start: AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
> sl=step(s,direction="backward")
Start: AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
Start: AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
Df Sum of Sq RSS AIC
- Examination 1 53.03 2158.1 189.86
2105.0 190.69
- Agriculture 1 307.72 2412.8 195.10
- Infant.Mortality 1 408.75 2513.8 197.03
- Catholic 1 447.71 2552.8 197.75
- Education 1 1162.56 3267.6 209.36
- Examination 1 53.03 2158.1 189.86
2105.0 190.69
- Agriculture 1 307.72 2412.8 195.10
- Infant.Mortality 1 408.75 2513.8 197.03
- Catholic 1 447.71 2552.8 197.75
- Education 1 1162.56 3267.6 209.36
Step: AIC=189.86
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
Df Sum of Sq RSS AIC
2158.1 189.86
- Agriculture 1 264.18 2422.2 193.29
- Infant.Mortality 1 409.81 2567.9 196.03
- Catholic 1 956.57 3114.6 205.10
- Education 1 2249.97 4408.0 221.43
> step方法减少变量,通过对比AIC的大小来决定
> sl=step(s,direction="both")
Start: AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
2158.1 189.86
- Agriculture 1 264.18 2422.2 193.29
- Infant.Mortality 1 409.81 2567.9 196.03
- Catholic 1 956.57 3114.6 205.10
- Education 1 2249.97 4408.0 221.43
> step方法减少变量,通过对比AIC的大小来决定
> sl=step(s,direction="both")
Start: AIC=190.69
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
Df Sum of Sq RSS AIC
- Examination 1 53.03 2158.1 189.86
2105.0 190.69
- Agriculture 1 307.72 2412.8 195.10
- Infant.Mortality 1 408.75 2513.8 197.03
- Catholic 1 447.71 2552.8 197.75
- Education 1 1162.56 3267.6 209.36
- Examination 1 53.03 2158.1 189.86
2105.0 190.69
- Agriculture 1 307.72 2412.8 195.10
- Infant.Mortality 1 408.75 2513.8 197.03
- Catholic 1 447.71 2552.8 197.75
- Education 1 1162.56 3267.6 209.36
Step: AIC=189.86
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
Df Sum of Sq RSS AIC
2158.1 189.86
+ Examination 1 53.03 2105.0 190.69
- Agriculture 1 264.18 2422.2 193.29
- Infant.Mortality 1 409.81 2567.9 196.03
- Catholic 1 956.57 3114.6 205.10
- Education 1 2249.97 4408.0 221.43
>
>
> 还有种方式是通过因变量和自变量之间的关系图,来决定是否引入二次项或交叉项
>
>
> 有点时候AIC出来的结果,可能summary出来的结果不是太好
> 有时需要去step出来AIC次小的
错误: 找不到对象'有时需要去step出来AIC次小的'
>
>
>
> sl
2158.1 189.86
+ Examination 1 53.03 2105.0 190.69
- Agriculture 1 264.18 2422.2 193.29
- Infant.Mortality 1 409.81 2567.9 196.03
- Catholic 1 956.57 3114.6 205.10
- Education 1 2249.97 4408.0 221.43
>
>
> 还有种方式是通过因变量和自变量之间的关系图,来决定是否引入二次项或交叉项
>
>
> 有点时候AIC出来的结果,可能summary出来的结果不是太好
> 有时需要去step出来AIC次小的
错误: 找不到对象'有时需要去step出来AIC次小的'
>
>
>
> sl
Call:
lm(formula = Fertility ~ Agriculture + Education + Catholic +
Infant.Mortality, data = swiss)
lm(formula = Fertility ~ Agriculture + Education + Catholic +
Infant.Mortality, data = swiss)
Coefficients:
(Intercept) Agriculture Education Catholic
62.1013 -0.1546 -0.9803 0.1247
Infant.Mortality
1.0784
(Intercept) Agriculture Education Catholic
62.1013 -0.1546 -0.9803 0.1247
Infant.Mortality
1.0784
> drop1(sl)
Single term deletions
Single term deletions
Model:
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
Df Sum of Sq RSS AIC
2158.1 189.86
Agriculture 1 264.18 2422.2 193.29
Education 1 2249.97 4408.0 221.43
Catholic 1 956.57 3114.6 205.10
Infant.Mortality 1 409.81 2567.9 196.03
> drop1(s)
Single term deletions
Fertility ~ Agriculture + Education + Catholic + Infant.Mortality
Df Sum of Sq RSS AIC
2158.1 189.86
Agriculture 1 264.18 2422.2 193.29
Education 1 2249.97 4408.0 221.43
Catholic 1 956.57 3114.6 205.10
Infant.Mortality 1 409.81 2567.9 196.03
> drop1(s)
Single term deletions
Model:
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
Df Sum of Sq RSS AIC
2105.0 190.69
Agriculture 1 307.72 2412.8 195.10
Examination 1 53.03 2158.1 189.86
Education 1 1162.56 3267.6 209.36
Catholic 1 447.71 2552.8 197.75
Infant.Mortality 1 408.75 2513.8 197.03
> dro1(s)
错误: 没有"dro1"这个函数
> add1(s)
错误于add1.lm(s) : 範疇里没有项
> add1(sl)
错误于add1.lm(sl) : 範疇里没有项
> drop1和step的direction="backward"方式接近
错误: 意外的符号在"drop1和step的direction="backward"方式接近"里
>
>
>
> shapiro.test(x3)
Fertility ~ Agriculture + Examination + Education + Catholic +
Infant.Mortality
Df Sum of Sq RSS AIC
2105.0 190.69
Agriculture 1 307.72 2412.8 195.10
Examination 1 53.03 2158.1 189.86
Education 1 1162.56 3267.6 209.36
Catholic 1 447.71 2552.8 197.75
Infant.Mortality 1 408.75 2513.8 197.03
> dro1(s)
错误: 没有"dro1"这个函数
> add1(s)
错误于add1.lm(s) : 範疇里没有项
> add1(sl)
错误于add1.lm(sl) : 範疇里没有项
> drop1和step的direction="backward"方式接近
错误: 意外的符号在"drop1和step的direction="backward"方式接近"里
>
>
>
> shapiro.test(x3)
Shapiro-Wilk normality test
data: x3
W = 0.9565, p-value = 0.4773
W = 0.9565, p-value = 0.4773
> shapiro.test(x1)
Shapiro-Wilk normality test
data: x1
W = 0.9194, p-value = 0.09638
W = 0.9194, p-value = 0.09638
> shapiro.test(x$x1)
错误于complete.cases(x) : 无法用输入来确定个案数目
> shapiro.test(x2)
错误于complete.cases(x) : 无法用输入来确定个案数目
> shapiro.test(x2)
Shapiro-Wilk normality test
data: x2
W = 0.9399, p-value = 0.239
W = 0.9399, p-value = 0.239
> 正态分布检验shapiro.test(),p值大于0.05为正态分布
错误: 意外的','在"正态分布检验shapiro.test(),"里
>
>
> z=lm(iris$Sepal.length~iris$Sepal.Width)
错误于model.frame.default(formula = iris$Sepal.length ~ iris$Sepal.Width, :
参数'iris$Sepal.length'的种类(NULL)不对
> z=lm(iris$Sepal.Length~iris$Sepal.Width)
> z
错误: 意外的','在"正态分布检验shapiro.test(),"里
>
>
> z=lm(iris$Sepal.length~iris$Sepal.Width)
错误于model.frame.default(formula = iris$Sepal.length ~ iris$Sepal.Width, :
参数'iris$Sepal.length'的种类(NULL)不对
> z=lm(iris$Sepal.Length~iris$Sepal.Width)
> z
Call:
lm(formula = iris$Sepal.Length ~ iris$Sepal.Width)
lm(formula = iris$Sepal.Length ~ iris$Sepal.Width)
Coefficients:
(Intercept) iris$Sepal.Width
6.5262 -0.2234
(Intercept) iris$Sepal.Width
6.5262 -0.2234
> summary(z)
Call:
lm(formula = iris$Sepal.Length ~ iris$Sepal.Width)
lm(formula = iris$Sepal.Length ~ iris$Sepal.Width)
Residuals:
Min 1Q Median 3Q Max
-1.5561 -0.6333 -0.1120 0.5579 2.2226
Min 1Q Median 3Q Max
-1.5561 -0.6333 -0.1120 0.5579 2.2226
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.5262 0.4789 13.63 <2e-16 ***
iris$Sepal.Width -0.2234 0.1551 -1.44 0.152
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.5262 0.4789 13.63 <2e-16 ***
iris$Sepal.Width -0.2234 0.1551 -1.44 0.152
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8251 on 148 degrees of freedom
Multiple R-squared: 0.01382, Adjusted R-squared: 0.007159
F-statistic: 2.074 on 1 and 148 DF, p-value: 0.1519
Multiple R-squared: 0.01382, Adjusted R-squared: 0.007159
F-statistic: 2.074 on 1 and 148 DF, p-value: 0.1519
> plot(z)
等待页面改变的确认...
等待页面改变的确认...
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/27573546/viewspace-747340/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/27573546/viewspace-747340/