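Sources: 《R 语言核心技术手册》 and the R documentation.
The data are mostly made up, or taken from the R documentation.
This article covers most of the commonly used statistical tests, with the R functions that implement them and how to call them.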
Continuous data
Tests based on the normal distribution
Tests of means
t.test(1:10, 10:20)
#>
#> Welch Two Sample t-test
#>
#> data: 1:10 and 10:20
#> t = -7, df = 19, p-value = 2e-06
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -12.4 -6.6
#> sample estimates:
#> mean of x mean of y
#> 5.5 15.0
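To make the output above less of a black box, here is a minimal sketch that recomputes the Welch statistic and the Welch-Satterthwaite degrees of freedom by hand for the same two vectors, then compares them with what `t.test()` returns. The variable names (`se2_x`, `t_manual`, `df_manual`) are made up for illustration.

```r
x <- 1:10
y <- 10:20

# Squared standard error contributed by each sample
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t statistic: difference in means over the combined standard error
t_manual <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation of the degrees of freedom
df_manual <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

res <- t.test(x, y)

# Both comparisons should return TRUE
all.equal(unname(res$statistic), t_manual)
all.equal(unname(res$parameter), df_manual)
```

The unrounded values are roughly t = -6.86 and df = 19.0, which the printed output above shows as `t = -7, df = 19`.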
Paired t-test:
t.test(rnorm(10), rnorm(10, mean = 1), paired = TRUE)
#>
#> Paired t-test
#>
#> data: rnorm(10) and rnorm(10, mean = 1)
#> t = -5, df = 9, p-value = 7e-04
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -2.541 -0.962
#> sample estimates:
#> mean of the differences
#> -1.75
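A paired t-test is the same thing as a one-sample t-test on the pairwise differences. The sketch below checks that equivalence; the seed and the names `before` and `after` are arbitrary choices for illustration, not part of the example above.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
before <- rnorm(10)
after  <- rnorm(10, mean = 1)

# Paired test on the two vectors
paired <- t.test(before, after, paired = TRUE)

# One-sample test on the differences, against a mean of 0
one_sample <- t.test(before - after, mu = 0)

# Both comparisons should return TRUE
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)
```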
Using the formula interface:
df <- data.frame(
  value = c(rnorm(10), rnorm(10, mean = 1)),
  group = c(rep("control", 10), rep("test", 10))
)
t.test(value ~ group, data = df)
#>
#> Welch Two Sample t-test
#>
#> data: value by group
#> t = -0.4, df = 15, p-value = 0.7
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.62 1.08
#> sample estimates:
#> mean in group control mean in group test
#> 0.532 0.802
Assuming equal variances:
t.test(value ~ group, data = df, var.equal = TRUE)
#>
#> Two Sample t-test
#>
#> data: value by group
#> t = -0.4, df = 18, p-value = 0.7
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -1.60 1.06
#> sample estimates:
#> mean in group control mean in group test
#> 0.532 0.802
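With `var.equal = TRUE` the two sample variances are pooled into a single estimate and the degrees of freedom become n1 + n2 - 2, which is why `df = 18` appears above instead of the Welch value. A minimal sketch of that calculation, on freshly simulated data (the seed and variable names are assumptions for illustration):

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
x <- rnorm(10)
y <- rnorm(10, mean = 1)
n_x <- length(x)
n_y <- length(y)

# Pooled variance: the two sample variances weighted by their degrees of freedom
pooled_var <- ((n_x - 1) * var(x) + (n_y - 1) * var(y)) / (n_x + n_y - 2)

# Classic two-sample t statistic
t_manual <- (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n_x + 1 / n_y))

res <- t.test(x, y, var.equal = TRUE)

all.equal(unname(res$statistic), t_manual)  # should return TRUE
unname(res$parameter)                       # degrees of freedom: n_x + n_y - 2
```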
See ?t.test for more details.
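Comparing the variances of two populations
The example above assumed equal variances; we can check that assumption with a test.
To compare the variances of two normally distributed populations: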
# This performs an F test
var.test(value ~ group, data = df)
#>
#> F test to compare two variances
#>
#> data: value by group
#> F = 0.4, num df = 9, denom df = 9, p-value = 0.2
#> alternative hypothesis: true ratio of variances is not equal to 1
#> 95 percent confidence interval:
#> 0.103 1.671
#> sample estimates:
#> ratio of variances
#> 0.415
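The F statistic reported here is just the ratio of the two sample variances, and the two-sided p-value comes from the F distribution with (n1 - 1, n2 - 1) degrees of freedom. The sketch below verifies both on simulated data; `dat`, the seed, and the `*_manual` names are assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
dat <- data.frame(
  value = c(rnorm(10), rnorm(10, mean = 1)),
  group = rep(c("control", "test"), each = 10)
)

# Ratio of sample variances, first factor level over second
f_manual <- var(dat$value[dat$group == "control"]) /
  var(dat$value[dat$group == "test"])

# Two-sided p-value: twice the smaller tail probability
p_lower  <- pf(f_manual, df1 = 9, df2 = 9)
p_manual <- 2 * min(p_lower, 1 - p_lower)

res <- var.test(value ~ group, data = dat)

# Both comparisons should return TRUE
all.equal(unname(res$statistic), f_manual)
all.equal(res$p.value, p_manual)
```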
Use Bartlett's test to check whether the variances of the groups (samples) are equal.
bartlett.test(value ~ group, data = df)
#>
#> Bartlett test of homogeneity of variances
#>
#> data: value by group
#> Bartlett's K-squared = 2, df = 1, p-value = 0.2
Comparing means across multiple groups
To compare means across more than two groups, use analysis of variance (ANOVA).
aov(wt ~ factor(cyl), data = mtcars)
#> Call:
#> aov(formula = wt ~ factor(cyl), data = mtcars)
#>
#> Terms:
#> factor(cyl) Residuals
#> Sum of Squares 18.2 11.5
#> Deg. of Freedom 2 29
#>
#> Residual standard error: 0.63
#> Estimated effects may be unbalanced
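Printing the `aov` object shows the sums of squares but not the F statistic or p-value. One common way to get those is `summary()`. The sketch below pulls out the ANOVA table and recomputes F and its p-value from the mean squares; `fit`, `tab`, and the `*_manual` names are illustrative.

```r
fit <- aov(wt ~ factor(cyl), data = mtcars)

# The first element of the summary is the ANOVA table (a data frame)
tab <- summary(fit)[[1]]
tab

# F is the between-group mean square over the residual mean square
f_manual <- tab[["Mean Sq"]][1] / tab[["Mean Sq"]][2]

# Upper-tail probability from the F distribution
p_manual <- pf(f_manual, tab$Df[1], tab$Df[2], lower.tail = FALSE)

# Both comparisons should return TRUE
all.equal(tab[["F value"]][1], f_manual)
all.equal(tab[["Pr(>F)"]][1], p_manual)
```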
View the effect tables:
model.tables(aov(wt ~ factor(cyl), data = mtcars))
#> Tables of effects
#>
#> factor(cyl)
#> 4 6 8
#> -