WEEK3-Visualizing hypothesis tests in R, t-tests in R

这篇博客是关于在R中可视化假设检验,特别是t测试的视频笔记。涵盖了单尾和双尾测试,解释了第一类错误(类型I错误,即alpha错误)和第二类错误(类型II错误)。此外,还介绍了R中的Student's t-Test,包括默认的S3方法和公式方法,并强调了如何从剪贴板读取数据进行t检验。
摘要由CSDN通过智能技术生成

Explore Statistics with R (EDX)

WEEK3-Visualizing hypothesis tests in R 视频笔记


1. #one tailed test

#rare in health sciences, more common in industrial process control
x=seq(50,140,length=200)
y1=dnorm(x,80, 10)
plot(x,y1,type='l',lwd=2,col='red')
y2=dnorm(x,110, 10)
lines(x,y2,type='l',lwd=2,col='blue')
abline(v=qnorm(0.95,80,10)) #vertical

左边红的 null hypothesis (H0),右边蓝的 alternative hypothesis (H1),灰的竖线 cut off (临界)

这里是第一类错误(type I error, alpha error),弃真



2. #two tailed test

#common in the health sciences, because in most cases, both
#an increase or a decrease in a variable would affect health
x=seq(50,140,length=200)
y1=dnorm(x,80, 10)
plot(x,y1,type='l',lwd=2,col='red')
y2=dnorm(x,110, 10)
lines(x,y2,type='l',lwd=2,col='blue')
abline(v=qnorm(0.025,80,10))
abline(v=qnorm(0.975,80,10))


#colour the rejection area, also referred to as alpha.

#If the p-value is equal to or lower than alpha - reject the null hypothesis
x=seq(50,140,length=200)
y1=dnorm(x,80, 10)
plot(x,y1,type='l',lwd=2,col='red')
y2=dnorm(x,110, 10)
lines(x,y2,type='l',lwd=2,col='blue')
cord.x1 <- c((round(qnorm(0.975, 80, 10))),seq((round(qnorm(0.975, 80, 10))), 120,1),120) 
cord.y1 <- c(0,dnorm(seq((round(qnorm(0.975, 80, 10))), 120, 1), 80, 10),0) 
polygon(cord.x1,cord.y1,col='red')
cord.x2 <- c(50,seq(50,round(qnorm(0.025, 80, 10),1)),round(qnorm(0.025, 80, 10))) 
cord.y2 <- c(0,dnorm(seq(50,round(qnorm(0.025, 80, 10),1)), 80, 10),0) 
polygon(cord.x2,cord.y2,col='red')


#Imagine that the alternative hypothesis were true.

#Beta is the risk that you will keep (=not reject) the false null hypothesis
x=seq(50,140,length=200)
y1=dnorm(x,80, 10)
plot(x,y1,type='l',lwd=2,col='red')
y2=dnorm(x,110, 10)
lines(x,y2,type='l',lwd=2,col='blue')
cord.x2<- c(0,seq((round(1-qnorm(0.025,110,10))),100,1),100)
cord.y2 <- c(0,dnorm(seq((round(1-qnorm(0.025, 110, 10))), 100, 1), 110, 10),0) 
polygon(cord.x2,cord.y2,col='red')
abline(v=round(qnorm(0.975, 80, 10, lower.tail=T)))
abline(v=round(qnorm(0.025, 80, 10, lower.tail=T)))
text(95,0.005, "β",xpd=5)


这里就是第二类错误了,取伪


#Statistical power, 1-beta, is the probability to reject a false null hypothesis
x<- seq(50,140,length=200)
y1<- dnorm(x,80, 10)
plot(x,y1,type='l',lwd=2,col='red')
y2<- dnorm(x,110, 10)
lines(x,y2,type='l',lwd=2,col='blue')
cord.x2<- c(0,seq((round(1-qnorm(0.025,110,10))),100,1),100)
cord.y2 <- c(0,dnorm(seq((round(1-qnorm(0.025, 110, 10))), 100, 1), 110, 10),0) 
polygon(cord.x2,cord.y2,col='red')
abline(v=round(qnorm(0.975, 80, 10, lower.tail=T)))
abline(v=round(qnorm(0.025, 80, 10, lower.tail=T)))
cord.x1 <- c(100,seq(round(qnorm(0.975, 80, 10, lower.tail=T)), 140,1),140) 
cord.y1 <- c(0,dnorm(seq(round(qnorm(0.975, 80, 10, lower.tail=T)),140, 1), 110, 10),0) 
polygon(cord.x1,cord.y1,col='6')
text(95,0.005, "β",xpd=5)
text(115,0.005, "1-β",xpd=5)




WEEK3-t-tests in R 视频笔记


1. 说明文档:Student's t-Test



## Default S3 method:
t.test(x, y = NULL,
       alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE,
       conf.level = 0.95, ...)

## S3 method for class 'formula'
t.test(formula, data, subset, na.action, ...)


2. 数据

AB
9776
9887
7889
8090
8194
8496
8598
8599
86100
94106
100109
102112
103113
104105


3. 一些 t检验

由于一直狂按快进看视频,今天突然才明白剪贴板读数据这件事:就是选中一部分数据,复制,然后读,这样就行了。

> weight <- read.table("clipboard", header=T)
> t.test(weight)

	One Sample t-test

data:  weight
t = 48.2568, df = 27, p-value < 2.2e-16   #degrees of freedom = 27 (28 measurements)
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 90.65293 98.70421
sample estimates:
mean of x 
 94.67857 

> t.test(weight$A, weight$B)

	Welch Two Sample t-test

data:  weight$A and weight$B
t = -1.8423, df = 25.684, p-value = 0.077   #compensated degrees of freedom = 25.684
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -14.6635326   0.8063898
sample estimates:
mean of x mean of y 
 91.21429  98.14286 

> t.test(weight$A, weight$B, var.equal=T)

	Two Sample t-test

data:  weight$A and weight$B
t = -1.8423, df = 26, p-value = 0.07686   #28 - 1(per group)
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -14.6589062   0.8017634
sample estimates:
mean of x mean of y 
 91.21429  98.14286 

> t.test(weight$A, weight$B, paired=T)

	Paired t-test

data:  weight$A and weight$B
t = -2.4884, df = 13, p-value = 0.02717
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.943697  -0.913446
sample estimates:
mean of the differences 
              -6.928571 


4. 例子


#A function for creating random examples of t-tests

ttest.for.examination <- function(x,y,z,k)
{
subjects <- x
mean1 <- y
mean2 <- z
standarddev <- k
print( c("Number of measurements:   ", x))
print( c("Mean of group 1: ", y))
print( c( "Mean of group 2: ",z))
print( c("Standard deviation:  ", k))
group1 <- round(rnorm(x, y, k))
group2 <- round(rnorm(x, z, k))
framedata <- cbind(group1, group2)
print(framedata)
print( list (t.test(group1, group2, var.equal = T), t.test(group1, group2, var.equal = T, paired = T)))
}


> ttest.for.examination(13,90,105,10)
[1] "Number of measurements:   " "13"                        
[1] "Mean of group 1: " "90"               
[1] "Mean of group 2: " "105"              
[1] "Standard deviation:  " "10"                   
      group1 group2
 [1,]     80    110
 [2,]    101    108
 [3,]     92    109
 [4,]    107    114
 [5,]     76    125
 [6,]     80    107
 [7,]     81    103
 [8,]     96     97
 [9,]     88    113
[10,]    107    118
[11,]    110    110
[12,]    101    107
[13,]     75     95
[[1]]

	Two Sample t-test

data:  group1 and group2
t = -4.115, df = 24, p-value = 0.0003939
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -25.641951  -8.511895
sample estimates:
mean of x mean of y 
 91.84615 108.92308 


[[2]]

	Paired t-test

data:  group1 and group2
t = -4.4543, df = 12, p-value = 0.000787
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -25.430113  -8.723733
sample estimates:
mean of the differences 
              -17.07692 

下面这个P值比较大。

> ttest.for.examination(13,92,100,10)
[1] "Number of measurements:   " "13"                        
[1] "Mean of group 1: " "92"               
[1] "Mean of group 2: " "100"              
[1] "Standard deviation:  " "10"                   
      group1 group2
 [1,]    102     95
 [2,]     99     97
 [3,]     76    107
 [4,]    102    113
 [5,]     97    111
 [6,]     88     93
 [7,]     95     85
 [8,]    103     80
 [9,]     86    102
[10,]     85    106
[11,]     96    112
[12,]    109    105
[13,]     94    103
[[1]]

	Two Sample t-test

data:  group1 and group2
t = -1.5669, df = 24, p-value = 0.1302
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -13.724765   1.878611
sample estimates:
mean of x mean of y 
 94.76923 100.69231 


[[2]]

	Paired t-test

data:  group1 and group2
t = -1.4568, df = 12, p-value = 0.1708
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -14.781916   2.935762
sample estimates:
mean of the differences 
              -5.923077 





  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值