Normality tests
Shapiro-Wilk Normality Test
Use this test when the sample size is between 3 and 5000.
shapiro.test(x)
x: a numeric vector of data values. Missing values are allowed, but the number of non-missing values must be between 3 and 5000.
##
## shpr.t> shapiro.test(rnorm(100, mean = 5, sd = 3))
##
## Shapiro-Wilk normality test
##
## data: rnorm(100, mean = 5, sd = 3)
## W = 0.9832, p-value = 0.235
##
##
## shpr.t> shapiro.test(runif(100, min = 2, max = 4))
##
## Shapiro-Wilk normality test
##
## data: runif(100, min = 2, max = 4)
## W = 0.951, p-value = 0.0009664
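
In practice the result is usually read from the returned object rather than from the printed output. The following is a minimal sketch; the fixed seed and the 0.05 significance level are assumptions chosen for illustration, not part of the original example.

```r
# Fixed seed so the sketch is reproducible (an assumption for illustration).
set.seed(1)
x <- rnorm(100, mean = 5, sd = 3)

res <- shapiro.test(x)   # returns an object of class "htest"
res$statistic            # the W statistic
res$p.value              # the p-value

alpha <- 0.05            # assumed significance level
if (res$p.value < alpha) {
  message("Reject normality at the ", alpha, " level")
} else {
  message("No evidence against normality at the ", alpha, " level")
}
```

A small p-value is evidence against normality; a large one only means the test found no evidence against it.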
Kolmogorov-Smirnov Tests
Use this test when the sample size is greater than 5000.
ks.test(x, y, ...,
alternative = c("two.sided", "less", "greater"),
exact = NULL)
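This function tests whether x and y come from the same distribution. Setting y to the normal distribution tests whether x is normally distributed.
x is the sample data to be tested; it must be a numeric vector.
y can be a numeric vector, or a character string naming a cumulative distribution function, such as "pnorm" or "pgamma".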
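
The normality check described above can be sketched roughly as follows. The sample size of 6000, the seed, and the distribution parameters are assumptions chosen for illustration. Note that plugging in the mean and standard deviation estimated from the same sample, as done here, makes the Kolmogorov-Smirnov p-value only approximate (the test tends to be conservative), so the result is best treated as a rough guide.

```r
set.seed(42)                        # assumed seed, for reproducibility
x <- rnorm(6000, mean = 5, sd = 3)  # more than 5000 values

# shapiro.test() stops with an error for samples outside 3..5000
too_big <- inherits(try(shapiro.test(x), silent = TRUE), "try-error")
too_big

# Compare x with a normal CDF; the extra arguments are passed on to pnorm()
res <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))
res$statistic   # the D statistic
res$p.value
```
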
example("ks.test")
##
## ks.tst> require(graphics)
##
## ks.tst> x <- rnorm(50)
##
## ks.tst> y <- runif(30)
##
## ks.tst> # Do x and y come from the same distribution?
## ks.tst> ks.test(x, y)
##
## Two-sample Kolmogorov-Smirnov test
##
## data: x and y
## D = 0.44, p-value = 0.0009116
## alternative hypothesis: two-sided
##
##
## ks.tst> # Does x come from a shifted gamma distribution with shape 3 and rate 2?
## ks.tst> ks.test(x+2, "pgamma", 3, 2) # two-sided, exact
##
## One-sample Kolmogorov-Smirnov test
##
## data: x + 2
## D = 0.3566, p-value = 3.376e-06
## alternative hypothesis: two-sided
##
##
## ks.tst> ks.test(x+2, "pgamma", 3, 2, exact = FALSE)
##
## One-sample Kolmogorov-Smirnov test
##
## data: x + 2
## D = 0.3566, p-value = 5.983e-06
## alternative hypothesis: two-sided
##
##
## ks.tst> ks.test(x+2, "pgamma", 3, 2, alternative = "gr")
##
## One-sample Kolmogorov-Smirnov test
##
## data: x + 2
## D^+ = 0.0673, p-value = 0.6088
## alternative hypothesis: the CDF of x lies above the null hypothesis
##
##
## ks.tst> # test if x is stochastically larger than x2
## ks.tst> x2 <- rnorm(50, -1)
##
## ks.tst> plot(ecdf(x), xlim = range(c(x, x2)))
##
## ks.tst> plot(ecdf(x2), add = TRUE, lty = "dashed")
##
## ks.tst> t.test(x, x2, alternative = "g")
##
## Welch Two Sample t-test
##
## data: x and x2
## t = 5.2277, df = 97.997, p-value = 4.853e-07
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 0.6910036 Inf
## sample estimates:
## mean of x mean of y
## 0.02719708 -0.98547305
##
##
## ks.tst> wilcox.test(x, x2, alternative = "g")
##
## Wilcoxon rank sum test with continuity correction
##
## data: x and x2
## W = 1930, p-value = 1.404e-06
## alternative hypothesis: true location shift is greater than 0
##
##
## ks.tst> ks.test(x, x2, alternative = "l")
##
## Two-sample Kolmogorov-Smirnov test
##
## data: x and x2
## D^- = 0.44, p-value = 6.252e-05
## alternative hypothesis: the CDF of x lies below that of y
Two-sample tests
Parametric tests
When the samples follow a normal distribution, use a parametric test.
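
As a hedged sketch of that workflow, one option is to check each sample with shapiro.test and then compare the means with t.test, which runs Welch's two-sample t-test by default. The simulated data, the seed, and the choice of t.test are assumptions made for illustration; the text above does not say which parametric test is intended.

```r
set.seed(7)                  # assumed seed, for reproducibility
a <- rnorm(40, mean = 0)     # illustrative sample 1
b <- rnorm(40, mean = 1)     # illustrative sample 2

# Normality check for each sample
p_a <- shapiro.test(a)$p.value
p_b <- shapiro.test(b)$p.value
c(a = p_a, b = p_b)

# Parametric comparison of the two means (Welch two-sample t-test)
res <- t.test(a, b)
res$method
res$p.value
```
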
To be continued.
compile tool
library(knitr)
knit('/Users/lipidong/baiduyun/work/RFile/MarkDown/statistics.Rmd', output = '~/learn/blog/_posts/2015-05-1-statistics.md')