Statistical Modeling and R Software (统计建模与R软件): Chapter 3 Exercise Solutions

Article link: https://blog.csdn.net/panguoyuan/article/details/25065491

Exercise 3.1

Answer: (1) Create a text file named 3.1.txt with the following contents:

74.3 79.5 75.0 73.5 75.8 74.0 73.5 67.2 75.8 73.5 78.8 75.6 73.5 75.0 75.8
72.0 79.5 76.5 73.5 79.5 68.8 75.0 78.8 72.0 68.8 76.5 73.5 72.7 75.0 70.4
78.0 78.8 74.3 64.3 76.5 74.3 74.7 70.4 72.7 76.5 70.4 72.0 75.8 75.8 70.4
76.5 65.0 77.2 73.5 72.7 80.5 72.0 65.0 80.3 71.2 77.6 76.5 68.8 73.5 77.2
80.5 72.0 74.3 69.7 81.2 67.3 81.6 67.3 72.7 84.3 69.7 74.3 71.2 74.3 75.0
72.0 75.4 67.3 81.6 75.0 71.2 71.2 69.7 73.5 70.4 75.0 72.7 67.3 70.3 76.5
73.5 72.0 68.0 73.5 68.0 74.3 72.7 72.7 74.3 70.4

(2) Define a custom function, myfunction, and save it in the file myfunction.r:

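myfunction <- function(x) {
  n   <- length(x)                              # sample size
  m   <- mean(x)                                # mean
  v   <- var(x)                                 # sample variance
  s   <- sd(x)                                  # sample standard deviation
  me  <- median(x)                              # median
  cv  <- 100 * s / m                            # coefficient of variation (%)
  css <- sum((x - m)^2)                         # corrected sum of squares
  uss <- sum(x^2)                               # uncorrected sum of squares
  R   <- max(x) - min(x)                        # range
  R1  <- quantile(x, 3/4) - quantile(x, 1/4)    # interquartile range
  sm  <- s / sqrt(n)                            # standard error of the mean
  g1  <- n / ((n - 1) * (n - 2)) * sum((x - m)^3) / s^3    # skewness
  g2  <- (n * (n + 1)) / ((n - 1) * (n - 2) * (n - 3)) * sum((x - m)^4) / s^4 -
         (3 * (n - 1)^2) / ((n - 2) * (n - 3))              # kurtosis
  # Collect everything in a one-row data frame whose row name is "1".
  data.frame(N = n, Mean = m, Var = v, std_dev = s, Median = me, std_mean = sm,
             CV = cv, CSS = css, USS = uss, R = R, R1 = R1,
             Skewness = g1, Kurtosis = g2, row.names = 1)
}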
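Written out, the last two statistics computed by myfunction are the sample skewness and kurtosis coefficients below. This is only a transcription of the code, where \bar{x} is the mean m, s is the standard deviation returned by sd(), and n is the sample size:

```latex
g_1 = \frac{n}{(n-1)(n-2)} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{s^3},
\qquad
g_2 = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{s^4}
      - \frac{3(n-1)^2}{(n-2)(n-3)}
```

Both values are close to 0 for a sample that looks roughly normal.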

(3) Load the custom function into memory:
> source("myfunction.r")
(4) Read the data into the vector serumdata:
> serumdata=scan("3.1.txt")
Read 100 items
> serumdata
[1] 74.3 79.5 75.0 73.5 75.8 74.0 73.5 67.2 75.8 73.5 78.8 75.6 73.5 75.0 75.8 72.0 79.5 76.5 73.5 79.5 68.8 75.0 78.8
[24] 72.0 68.8 76.5 73.5 72.7 75.0 70.4 78.0 78.8 74.3 64.3 76.5 74.3 74.7 70.4 72.7 76.5 70.4 72.0 75.8 75.8 70.4 76.5
[47] 65.0 77.2 73.5 72.7 80.5 72.0 65.0 80.3 71.2 77.6 76.5 68.8 73.5 77.2 80.5 72.0 74.3 69.7 81.2 67.3 81.6 67.3 72.7
[70] 84.3 69.7 74.3 71.2 74.3 75.0 72.0 75.4 67.3 81.6 75.0 71.2 71.2 69.7 73.5 70.4 75.0 72.7 67.3 70.3 76.5 73.5 72.0
[93] 68.0 73.5 68.0 74.3 72.7 72.7 74.3 70.4
(5) Run the custom function:

> myfunction(serumdata)
    N   Mean      Var  std_dev Median  std_mean       CV      CSS      USS  R  R1   Skewness   Kurtosis
1 100 73.696 15.41675 3.926417   73.5 0.3926417 5.327857 1526.258 544636.3 20 4.6 0.03854249 0.07051809
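Several of these columns are tied together by simple identities, which gives a quick sanity check on the output: CSS = (n - 1) * Var (here 99 * 15.41675 ≈ 1526.258) and CSS = USS - n * Mean^2. The sketch below checks these identities on a small made-up vector; the values in x are hypothetical and are not the serum data.

```r
# Toy sample (hypothetical values, not the serum data).
x <- c(2, 4, 4, 5, 7, 9)
n <- length(x)
m <- mean(x)

css <- sum((x - m)^2)        # corrected sum of squares
uss <- sum(x^2)              # uncorrected sum of squares

# CSS equals (n - 1) * Var and also USS - n * mean^2.
check_var <- all.equal(css, (n - 1) * var(x))
check_uss <- all.equal(css, uss - n * m^2)

# The interquartile range built from quantile() matches IQR().
iqr_manual <- unname(quantile(x, 3/4) - quantile(x, 1/4))
check_iqr  <- all.equal(iqr_manual, IQR(x))

print(c(check_var, check_uss, check_iqr))
```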

Exercise 3.2

(1) Draw the histogram (with freq=FALSE the vertical axis shows density, not frequency):
hist(serumdata,freq=FALSE,col="purple",border="red",density=3,angle=60,main="Histogram of serumdata",xlab="serumdata",ylab="density")

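(2) Draw the density curves (the commands below also include the later plots, which steps (3) to (7) then go through one by one):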

lines(density(serumdata),col="blue")
x<-64:85
lines(x,dnorm(x,mean(serumdata),sd(serumdata)),col="green")
plot(ecdf(serumdata),verticals=TRUE,do.p=FALSE)
lines(x,pnorm(x,mean(serumdata),sd(serumdata)),col="blue")
qqnorm(serumdata,col="purple")
qqline(serumdata,col="red")

(3) Draw the normal probability density curve over the histogram:

hist(serumdata,freq=FALSE,col="purple",border="red",density=3,angle=60,main="Histogram of serumdata",xlab="serumdata",ylab="density")
lines(x,dnorm(x,mean(serumdata),sd(serumdata)),col="green")
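The grid x<-64:85 has only 22 integer points, so the normal curve is drawn as a coarse polyline. A minimal sketch of the same overlay on a finer grid follows. It uses simulated stand-in data (the vector demo and its parameters are assumptions, not the serum sample), and pdf(NULL) only keeps the script from writing a plot file; drop it and dev.off() when working interactively.

```r
set.seed(1)                                   # reproducible stand-in data
demo <- rnorm(100, mean = 73.7, sd = 3.9)     # hypothetical sample, not the serum data

pdf(NULL)                                     # null device: nothing is written to disk
hist(demo, freq = FALSE, main = "Histogram with normal density",
     xlab = "demo", ylab = "density")

# 200 evenly spaced points across the data range give a smooth curve.
grid <- seq(min(demo), max(demo), length.out = 200)
lines(grid, dnorm(grid, mean(demo), sd(demo)), col = "green")  # fitted normal density
lines(density(demo), col = "blue")                             # kernel density estimate
invisible(dev.off())
```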

(4) Plot the empirical distribution function:

plot(ecdf(serumdata),verticals=TRUE,do.p=FALSE)

(5) Overlay the normal distribution function on the empirical distribution function:

plot(ecdf(serumdata),verticals=TRUE,do.p=FALSE)
lines(x,pnorm(x,mean(serumdata),sd(serumdata)),col="blue")

(6) Draw the normal Q-Q plot:

qqnorm(serumdata,col="purple")

(7) Add the reference line to the Q-Q plot:

qqnorm(serumdata,col="purple")
qqline(serumdata,col="red")
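To see what the red line represents: with its default arguments, qqline() draws the line through the first and third quartiles of the sample and of the standard normal distribution. The sketch below rebuilds that line by hand on simulated data (demo is a hypothetical sample, not the serum data), so the slope and intercept can be inspected directly.

```r
set.seed(2)
demo <- rnorm(50, mean = 10, sd = 2)           # hypothetical sample

probs <- c(0.25, 0.75)
y_q <- quantile(demo, probs, names = FALSE)    # sample quartiles
x_q <- qnorm(probs)                            # standard normal quartiles

slope     <- diff(y_q) / diff(x_q)
intercept <- y_q[1] - slope * x_q[1]

pdf(NULL)                                      # null device: nothing is written to disk
qqnorm(demo, col = "purple")
abline(intercept, slope, col = "red")          # should coincide with qqline(demo)
invisible(dev.off())
```

For roughly normal data the slope is close to the standard deviation and the intercept close to the mean.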

Exercise 3.3

Answer: (1) Make a stem-and-leaf plot:

> stem(serumdata,scale=1)
The decimal point is at the |
64 | 300
66 | 23333
68 | 00888777
70 | 34444442222
72 | 0000000777777755555555555
74 | 033333333700000004688888
76 | 5555555226
78 | 0888555
80 | 355266
82 |
84 | 3

(2) Draw a box plot (notch=TRUE draws the box with a notch around the median):

boxplot(serumdata,col="lightblue",notch=TRUE)

(3) Five-number summary:

> fivenum(serumdata)
[1] 64.3 71.2 73.5 75.8 84.3
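fivenum() returns the minimum, lower hinge, median, upper hinge and maximum. For the serum data the hinges (71.2 and 75.8) agree with the quartiles behind R1 = 4.6 in Exercise 3.1, but in general Tukey's hinges and the default quantile() quartiles can differ. A small illustration on toy data (the vector 1:6 is just an example):

```r
x <- 1:6                                           # toy data, not the serum sample

hinges    <- fivenum(x)                            # min, lower hinge, median, upper hinge, max
quartiles <- unname(quantile(x, c(0.25, 0.75)))    # default (type 7) quartiles

print(hinges)      # 1.0 2.0 3.5 5.0 6.0
print(quartiles)   # 2.25 4.75
```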

Exercise 3.4

Answer: (1) Shapiro-Wilk normality test:

> shapiro.test(serumdata)
Shapiro-Wilk normality test
data: serumdata
W = 0.9897, p-value = 0.6437

(2) Kolmogorov-Smirnov test for normality:

> ks.test(serumdata,"pnorm",mean(serumdata),sd(serumdata))

One-sample Kolmogorov-Smirnov test

data: serumdata
D = 0.0701, p-value = 0.7097
alternative hypothesis: two-sided

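Conclusion: both p-values (0.6437 and 0.7097) exceed 0.05, so normality is not rejected and the sample can be regarded as coming from a normally distributed population.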
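Both test functions return a list whose p.value component can be compared with a significance level in code rather than read off the printout. A minimal sketch on simulated data follows; demo and alpha are illustrative assumptions, not part of the exercise.

```r
set.seed(3)
demo <- rnorm(100, mean = 73.7, sd = 3.9)    # hypothetical sample, not the serum data

sw <- shapiro.test(demo)                                  # Shapiro-Wilk test
ks <- ks.test(demo, "pnorm", mean(demo), sd(demo))        # one-sample K-S test

alpha <- 0.05                                             # chosen significance level
reject <- c(shapiro = sw$p.value < alpha,
            ks      = ks$p.value < alpha)
print(reject)    # TRUE means normality is rejected at level alpha
```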

Exercise 3.9

Answer: (1) Load the data into the data frame studata:

> studata
   V1      V2 V3 V4   V5    V6
1   1   alice  f 13 56.5  84.0
2   2   becka  f 13 65.3  98.0
3   3    gail  f 14 64.3  90.0
4   4   karen  f 12 56.3  77.0
5   5   kathy  f 12 59.8  84.5
6   6    mary  f 15 66.5 112.0
7   7   sandy  f 11 51.3  50.5
8   8  sharon  f 15 62.5 112.5
9   9   tammy  f 14 62.8 102.5
10 10  alfred  m 14 69.0 112.5
11 11    duke  m 14 63.5 102.5
12 12   guido  m 15 67.0 133.0
13 13   james  m 12 57.3  83.0
14 14 jeffery  m 13 62.5  84.0
15 15    john  m 12 59.0  99.5
16 16  philip  m 16 72.0 150.0
17 17  robert  m 12 64.8 128.0
18 18  thomas  m 11 57.5  85.0
19 19 william  m 15 66.5 112.0
(2) Pearson correlation test:

> names(studata) <- c("id","name","sex","age","height","weight")  # assumed meanings of V1..V6
> attach(studata)
> cor.test(height,weight)

Pearson's product-moment correlation

data: height and weight
t = 2.8298, df = 3, p-value = 0.0662
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.1185906 0.9901188
sample estimates:
cor
0.852915

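Conclusion: height and weight are strongly positively correlated in the sample (r = 0.853). Note, however, that the p-value printed above (0.0662) is not significant at the 5% level, and that df = 3 corresponds to only 5 observations; with all 19 rows of studata the test has df = 17, so the printed figures should be rechecked against the full data.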
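A self-contained way to rerun the test on all 19 students is sketched below. The two vectors are copied from columns V5 and V6 of the printout above; treating V5 as height and V6 as weight is an assumption based on the magnitudes of the values.

```r
# Columns V5 and V6 of studata (assumed to be height and weight).
height <- c(56.5, 65.3, 64.3, 56.3, 59.8, 66.5, 51.3, 62.5, 62.8, 69.0,
            63.5, 67.0, 57.3, 62.5, 59.0, 72.0, 64.8, 57.5, 66.5)
weight <- c(84.0, 98.0, 90.0, 77.0, 84.5, 112.0, 50.5, 112.5, 102.5, 112.5,
            102.5, 133.0, 83.0, 84.0, 99.5, 150.0, 128.0, 85.0, 112.0)

ct <- cor.test(height, weight)    # Pearson test by default

print(ct$parameter)               # degrees of freedom: n - 2 = 17
print(ct$estimate)                # sample correlation
print(ct$p.value)                 # two-sided p-value
```

Passing the vectors directly also avoids attach(), which can silently pick up other objects named height or weight from the workspace.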

Exercise 6.1

Answer: (1) Enter the data:

x=c(5.1,3.5,7.1,6.2,8.8,7.8,4.5,5.6,8.0,6.4)
y=c(1907,1287,2700,2373,3260,3000,1947,2273,3113,2493)

(2) Draw the scatter plot: plot(x,y)

Conclusion: the scatter plot suggests a linear relationship between x and y.

(3) Fit the regression line of y on x; the fitted equation is y = 140.95 + 364.18x:

> lm.sol=lm(y~1+x)
> summary(lm.sol)

Call:
lm(formula = y ~ 1 + x)

Residuals:
     Min       1Q   Median       3Q      Max
-128.591  -70.978   -3.727   49.263  167.228

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   140.95     125.11   1.127    0.293
x             364.18      19.26  18.908 6.33e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 96.42 on 8 degrees of freedom
Multiple R-squared: 0.9781, Adjusted R-squared: 0.9754
F-statistic: 357.5 on 1 and 8 DF, p-value: 6.33e-08
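The coefficients can also be pulled out of the fitted object and used for prediction instead of being copied from the summary. In the sketch below, x and y are the data from step (1); the new value x = 7 is a hypothetical input chosen only for illustration.

```r
x <- c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4)
y <- c(1907, 1287, 2700, 2373, 3260, 3000, 1947, 2273, 3113, 2493)

fit <- lm(y ~ x)
b   <- coef(fit)                       # intercept and slope
print(round(b, 2))                     # about 140.95 and 364.18, as in the summary

# Point prediction with a 95% prediction interval at a hypothetical x.
new_x <- data.frame(x = 7)
pred  <- predict(fit, newdata = new_x, interval = "prediction", level = 0.95)
print(pred)
```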