DATA2002-WEEK6

Permutation test

Lady tasting tea (recall Fisher's exact test)

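library(arrangements) # assumed source of permutations(); the notes do not name the package
truth = c("milk","tea","tea","milk","tea","tea","milk","milk")
permute_guess = permutations(truth) # every possible ordering of the 8 cups, one per row

B = nrow(permute_guess)
check_correct = vector("numeric", length = B)
for(i in 1:B) {
  # 1 if this ordering matches the truth exactly, 0 otherwise
  check_correct[i] = identical(permute_guess[i,], truth)
}
mean(check_correct) # p-value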
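As a cross-check against Fisher's exact test, the sketch below builds the 2x2 table for a taster who gets all eight cups right and compares the one-sided p-value with a direct count. The table layout and the object names (`tea`, `p_fisher`, `p_count`) are illustrative assumptions, not part of the original notes.

```r
# 2x2 table of truth vs guess when every cup is identified correctly
tea = matrix(c(4, 0, 0, 4), nrow = 2,
             dimnames = list(truth = c("milk", "tea"),
                             guess = c("milk", "tea")))

# One-sided Fisher's exact test: probability of a table at least this extreme
p_fisher = fisher.test(tea, alternative = "greater")$p.value

# Direct count: only 1 of the choose(8, 4) ways to pick the 4 "milk" cups is fully correct
p_count = 1 / choose(8, 4)

p_fisher
p_count
```

Both values equal 1/70, which is also what the enumeration of all orderings gives.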

# dat: data frame with a numeric weight column and a two-level group column
tt = t.test(weight ~ group, data = dat, var.equal = TRUE) # observed t-test
B = 10000 # number of permuted samples we will consider
permuted_dat = dat # make a copy of the data
t_null = vector("numeric", B) # initialise outside loop
for(i in 1:B) {
  permuted_dat$group = sample(dat$group) # this does the permutation
  # use the same var.equal setting as the observed test
  t_null[i] = t.test(weight ~ group, data = permuted_dat, var.equal = TRUE)$statistic
}
mean(abs(t_null) >= abs(tt$statistic)) # two-sided permutation p-value

This is a two-sided test. The t statistic can be replaced by another test statistic, for example the Wilcoxon rank-sum statistic.
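A minimal sketch of the same permutation scheme with the Wilcoxon rank-sum statistic. It assumes a two-group subset of the built-in PlantGrowth data as a stand-in for the data in the notes; `plant_dat`, `centre`, `w_obs` and `w_null` are illustrative names. Because W is not centred on zero, "more extreme" is measured as distance from its null centre n1 * n2 / 2.

```r
set.seed(2002)
# Stand-in data: two of the three groups in the built-in PlantGrowth data
plant_dat = droplevels(subset(PlantGrowth, group != "trt1"))
n1 = sum(plant_dat$group == "ctrl")
n2 = sum(plant_dat$group == "trt2")
centre = n1 * n2 / 2 # value W is centred on under the null hypothesis

# Observed Wilcoxon rank-sum statistic (exact = FALSE only affects wilcox.test's own p-value)
w_obs = wilcox.test(weight ~ group, data = plant_dat, exact = FALSE)$statistic

B = 2000 # number of permuted samples
permuted_dat = plant_dat
w_null = vector("numeric", B)
for (i in 1:B) {
  permuted_dat$group = sample(plant_dat$group) # shuffle the group labels
  w_null[i] = wilcox.test(weight ~ group, data = permuted_dat, exact = FALSE)$statistic
}

# Two-sided permutation p-value
p_value = mean(abs(w_null - centre) >= abs(w_obs - centre))
p_value
```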

Robustly standardised difference in medians

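mad(): the median absolute deviation is the median of the absolute values of the data after subtracting the median. It is commonly used to estimate the standard deviation: estimated standard deviation = 1.4826 * median absolute deviation. By default, R's mad() returns this estimated standard deviation.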
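A short check of this definition on a small made-up vector (the values in `x` are hypothetical and chosen only to include one outlier):

```r
x = c(2, 4, 4, 5, 7, 9, 30) # hypothetical data with one outlier

raw_mad = median(abs(x - median(x))) # median of absolute deviations from the median
scaled_mad = 1.4826 * raw_mad        # scaled to estimate the standard deviation

raw_mad              # 2
scaled_mad           # 2.9652
mad(x)               # same as scaled_mad
mad(x, constant = 1) # same as raw_mad
sd(x)                # much larger, because of the outlier
```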

Paired sample test

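For paired samples, we resample the signs of the paired differences: under the null hypothesis, each difference is equally likely to be positive or negative.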

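Compare each permuted statistic with the observed statistic t0 from the t-test; the p-value reported by the t-test itself is not needed. The permutation test p-value is the proportion of permuted statistics that are as extreme as, or more extreme than, t0.

If you forget the details, see this link; it is very thorough: Permutation Test: Visual Explanation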
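A minimal sketch of sign resampling for paired data, assuming the built-in sleep dataset (the same 10 patients measured under two drugs) as stand-in data. The names `d`, `t0` and `signs` are illustrative.

```r
set.seed(2002)
# Paired differences: extra sleep under drug 2 minus drug 1 for the same 10 patients
d = sleep$extra[sleep$group == "2"] - sleep$extra[sleep$group == "1"]
n = length(d)
t0 = t.test(d)$statistic # observed one-sample t statistic on the differences

B = 10000
t_null = vector("numeric", B)
for (i in 1:B) {
  signs = sample(c(-1, 1), size = n, replace = TRUE) # randomly flip each sign
  t_null[i] = t.test(d * signs)$statistic
}

# Proportion of resampled statistics at least as extreme as t0
p_value = mean(abs(t_null) >= abs(t0))
p_value
```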

Estimation vs hypothesis testing

Estimation

A population parameter is unknown.

Use the sample statistics to generate estimates of the population parameter.

Hypothesis testing

Explicit statement (or hypothesis) regarding the population parameter.

A test statistic is calculated from the sample, and the data either provide evidence to reject the null hypothesis or fail to do so.

Confidence intervals

We should avoid reporting only a point estimate from a sample; always include a measure of variability. A confidence interval has the general form $\widehat{\Theta} \pm \text{critical value} \times \text{SE}(\widehat{\Theta})$.
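As an illustration of this form, the sketch below builds a 95% confidence interval for a mean by hand and compares it with t.test(). It assumes the built-in women$height data as a stand-in sample and uses a t critical value; `theta_hat`, `se` and `crit` are illustrative names.

```r
x = women$height # stand-in sample from a built-in dataset
n = length(x)

theta_hat = mean(x)          # point estimate
se = sd(x) / sqrt(n)         # standard error of the mean
crit = qt(0.975, df = n - 1) # critical value for a 95% interval

ci = theta_hat + c(-1, 1) * crit * se
ci
t.test(x)$conf.int # should agree with the manual calculation
```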

 

Bootstrapping

Bootstrapping is a computational procedure that lets us make inferences about a population when no information about its distribution is available. The classic approach is to repeatedly resample from the sample, with replacement.

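# speed: numeric vector holding the observed sample
set.seed(123)
B = 10000 # number of bootstrap resamples
result = vector("numeric", length = B)
for(i in 1:B){
  newData = sample(speed, replace = TRUE) # resample with replacement, same size as the sample
  result[i] = mean(newData) # statistic of interest on the resample
}

quantile(result, c(0.025, 0.975)) # 95% CI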

The bootstrap interval and the classical (t-based) confidence interval are now very similar.

Trimming the outliers makes the bootstrap confidence interval more symmetric.
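To see the two intervals side by side, here is a minimal self-contained sketch. It assumes the built-in cars$speed as a stand-in for `speed`, since the notes do not say which data they use; `boot_means`, `boot_ci` and `t_ci` are illustrative names.

```r
set.seed(123)
x = cars$speed # assumed stand-in for the speed data in the notes

B = 10000
boot_means = vector("numeric", length = B)
for (i in 1:B) {
  boot_means[i] = mean(sample(x, replace = TRUE)) # mean of one bootstrap resample
}

boot_ci = quantile(boot_means, c(0.025, 0.975)) # bootstrap percentile interval
t_ci = t.test(x)$conf.int                       # classical t-based interval

boot_ci
t_ci
```

For a roughly symmetric sample without outliers like this one, the two intervals should be close to each other.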

Summary

 
