Hypothesis Testing

We first set up a null hypothesis that describes the status quo, then
state an alternative hypothesis. In the end, we either:

  • reject the null hypothesis in favor of the alternative hypothesis, or
  • fail to reject the null hypothesis (often loosely described as accepting it).

Task

  • Determine whether a new weight loss pill helped people lose more weight:
    • null hypothesis: patients who went on the weight loss pill lost no more weight than those who didn’t.
    • alternative hypothesis: patients who went on the weight loss pill lost more weight than those who didn’t.

Research design

  • Group A was given a placebo, or fake, pill and instructed to consume
    it on a daily basis.
  • Group B was given the actual weight loss pill and instructed to
    consume it on a daily basis.

This type of study is called a blind experiment since the participants didn’t know which pill they were receiving. This helps us reduce the potential bias that is introduced when participants know which pill they were given.

Statistical significance

Statistics helps us determine if the difference in the weight lost between the 2 groups is because of random chance or because of an actual difference in the outcomes.

The lists weight_lost_a and weight_lost_b contain the amount of weight (in pounds) that the participants in each group lost.

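import numpy as np
import matplotlib.pyplot as plt

# weight_lost_a and weight_lost_b are assumed to be already loaded
# (pounds lost per participant in each group).
mean_group_a = np.mean(weight_lost_a)
print(mean_group_a)
mean_group_b = np.mean(weight_lost_b)
print(mean_group_b)

# Plot a histogram of the weight lost in each group.
plt.hist(weight_lost_a)
plt.show()
plt.hist(weight_lost_b)
plt.show()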

output:

2.82
5.34

[Figures: histograms of weight lost for Group A and Group B]

Test statistic

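The first step is to decide on a test statistic, which is a numerical value that summarizes the data and that we can use in statistical formulas. Here, the test statistic is the difference between the mean weight lost by group B and the mean weight lost by group A.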

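Now that we have decided on a test statistic, we can rewrite our
hypotheses to be more precise:

  • null hypothesis: the difference in mean weight lost (group B minus group A) is 0.
  • alternative hypothesis: the difference in mean weight lost (group B minus group A) is greater than 0.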

mean_difference = mean_group_b - mean_group_a
print(mean_difference)

output:

2.52

Permutation test

The permutation test is a statistical test that involves simulating rerunning the study many times and recalculating the test statistic for each iteration. The goal is to calculate a distribution of the test statistics over these many iterations. This distribution is called the sampling distribution and it approximates the full range of possible test statistics under the null hypothesis.

If the observed mean difference of 2.52 is quite common in the sampling distribution, the data are consistent with the null hypothesis, and we have no evidence that the weight loss pill helps people lose more weight. If it is rare, we reject the null hypothesis in favor of the alternative hypothesis.

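Ideally, the number of times we re-randomize the groups that each data point belongs to would match the total number of possible permutations. In practice that number is usually far too large, so the simulation below uses 1000 iterations.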
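
To get a feel for how large the number of possible permutations can be, the sketch below counts the ways to split a pooled set of values into two groups of fixed size. The helper name count_splits and the group sizes (50 and 50) are assumptions for illustration; the actual group sizes are not stated in this post.

```python
import math

def count_splits(n_a, n_b):
    """Number of distinct ways to split n_a + n_b values into groups of size n_a and n_b."""
    return math.comb(n_a + n_b, n_a)

# Hypothetical group sizes; the real study sizes are not given in this post.
hypothetical_total = count_splits(50, 50)
print(hypothetical_total)  # on the order of 10**29
```

Even at these modest sizes, enumerating every split is out of reach, which is why a random sample of re-assignments is used instead.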

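import numpy as np
import matplotlib.pyplot as plt

# all_values is assumed to hold the weight lost by every participant in both groups.
mean_difference = 2.52
mean_differences = []
for i in range(1000):
    group_a = []
    group_b = []
    # Randomly reassign each value to one of the two groups.
    for value in all_values:
        assignment_chance = np.random.rand()
        if assignment_chance >= 0.5:
            group_a.append(value)
        else:
            group_b.append(value)
    # Record the test statistic for this simulated study.
    iteration_mean_difference = np.mean(group_b) - np.mean(group_a)
    mean_differences.append(iteration_mean_difference)

# Plot the distribution of the simulated mean differences.
plt.hist(mean_differences)
plt.show()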

[Figure: histogram of the simulated mean differences]
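
The loop above assigns each value by a coin flip, so group sizes vary between iterations and a group can, in rare cases, end up empty. A common variant shuffles the pooled values and splits them back into groups of the original sizes. The following is a minimal sketch of that variant, not the method used in this post; the function name, the seed parameter, and the example numbers are all assumptions for illustration.

```python
import numpy as np

def permutation_mean_differences(group_a, group_b, n_iterations=1000, seed=None):
    """Simulate mean differences (B minus A) under the null hypothesis.

    Each iteration shuffles the pooled values and splits them back into
    two groups with the original sizes.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([np.asarray(group_a, dtype=float),
                             np.asarray(group_b, dtype=float)])
    n_a = len(group_a)
    differences = np.empty(n_iterations)
    for i in range(n_iterations):
        shuffled = rng.permutation(pooled)
        differences[i] = shuffled[n_a:].mean() - shuffled[:n_a].mean()
    return differences

# Made-up numbers purely for illustration, not the study data.
example_a = [1.0, 3.0, 2.0, 4.0]
example_b = [5.0, 6.0, 4.0, 7.0]
diffs = permutation_mean_differences(example_a, example_b, n_iterations=1000, seed=0)
print(len(diffs), diffs.mean())
```

Because the labels are shuffled at random, the simulated differences should center near zero, which is what the null hypothesis predicts.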

Sampling distribution

We store the sampling distribution in a dictionary. The keys should be the test statistic values and the values should be their frequencies, that is, the number of times each difference occurred:

sampling_distribution = {}
for difference in mean_differences:
    if difference not in sampling_distribution:
        sampling_distribution[difference] = 1
    else:
        sampling_distribution[difference] += 1
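
The same tally can be built with collections.Counter from the Python standard library. This is offered only as an equivalent shorthand; the list of differences below is made up for illustration and stands in for mean_differences.

```python
from collections import Counter

# Made-up differences for illustration; in the post these come from mean_differences.
example_differences = [0.5, -0.25, 0.5, 1.0, -0.25, 0.5]

# Counter maps each distinct value to the number of times it occurs.
example_distribution = Counter(example_differences)
print(example_distribution[0.5])  # 3
print(example_distribution[2.0])  # 0, missing keys count as zero
```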

P value

We can now use the sampling distribution to determine the number of times a value of 2.52 or higher appeared in our simulations. If we then divide that frequency by 1000, we’ll have the probability of observing a mean difference of 2.52 or higher purely due to random chance. This probability is called the p value.
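
The calculation described above can be sketched as follows. The function name and the example frequencies are assumptions for illustration; in the post, the dictionary would be sampling_distribution, the observed value 2.52, and the number of iterations 1000.

```python
def p_value_from_distribution(sampling_distribution, observed, n_iterations):
    """Fraction of simulated differences that are at least as large as `observed`."""
    count = sum(freq for diff, freq in sampling_distribution.items() if diff >= observed)
    return count / n_iterations

# Made-up frequencies for illustration only (10 simulated iterations).
example_distribution = {-1.0: 3, 0.0: 4, 1.5: 2, 2.6: 1}
print(p_value_from_distribution(example_distribution, 2.52, 10))  # 0.1
```

Only the key 2.6 is at least 2.52 in this toy dictionary, so one of the ten iterations counts toward the p value.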

In general, it’s good practice to set the p value threshold before conducting the study:

  • if the p value is less than the threshold, we:
    • reject the null hypothesis that there’s no difference in the mean amount of weight lost by participants in both groups,
    • accept the alternative hypothesis that the people who consumed the weight loss pill lost more weight,
    • conclude that the weight loss pill does affect the amount of weight people lost.
  • if the p value is greater than or equal to the threshold, we:
    • fail to reject the null hypothesis that there’s no difference in the mean amount of weight lost by participants in both groups,
    • do not accept the alternative hypothesis that the people who consumed the weight loss pill lost more weight,
    • conclude that we have no evidence that the weight loss pill helps people lose more weight.

The most common p value threshold is 0.05, or 5%.
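
As a small sketch of the decision rule, the threshold can be fixed as a default before any data is examined and then compared against the p value. The function name decide and the p values passed to it are hypothetical.

```python
def decide(p_value, threshold=0.05):
    """Return the decision for the test at the given p value threshold."""
    if p_value < threshold:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(0.0))  # reject the null hypothesis
print(decide(0.3))  # fail to reject the null hypothesis
```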

Caveats

Since our p value of 0 (none of the 1000 simulated mean differences reached 2.52) is less than the threshold we set of 0.05, we conclude that the difference in weight lost is unlikely to be due to random chance alone. We therefore reject the null hypothesis and accept the alternative hypothesis.

A few caveats:

  • Research design is incredibly important and can bias your results. For example, if the participants in group A realized they were given placebo sugar pills, they may modify their behavior and affect the outcome.
  • The p value threshold you set can also affect the conclusion you
    reach.
    • If you set too high of a p value threshold, you may reject the
      null hypothesis and accept the alternative hypothesis when the
      null hypothesis is actually true. This is known as a type I error.
    • If you set too low of a p value threshold, you may fail to reject
      the null hypothesis when the alternative hypothesis is actually
      true. This is known as a type II error.
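
To illustrate how the threshold drives the type I error rate (rejecting a null hypothesis that is actually true), the sketch below simulates many studies in which the pill has no effect and counts how often a permutation test rejects the null hypothesis anyway. Everything here is an assumption for illustration: the function names, the normal distribution used to generate the fake data, the group size, and the number of simulated studies.

```python
import numpy as np

def permutation_p_value(group_a, group_b, rng, n_iterations=200):
    """One-sided permutation p value for mean(B) - mean(A)."""
    observed = group_b.mean() - group_a.mean()
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    count = 0
    for _ in range(n_iterations):
        shuffled = rng.permutation(pooled)
        if shuffled[n_a:].mean() - shuffled[:n_a].mean() >= observed:
            count += 1
    return count / n_iterations

def false_positive_rate(threshold, n_studies=200, group_size=20, seed=0):
    """Share of simulated no-effect studies in which the null is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_studies):
        # Both groups come from the same distribution, so the null is true.
        group_a = rng.normal(loc=3.0, scale=2.0, size=group_size)
        group_b = rng.normal(loc=3.0, scale=2.0, size=group_size)
        if permutation_p_value(group_a, group_b, rng) < threshold:
            rejections += 1
    return rejections / n_studies

rate_strict = false_positive_rate(0.05)
rate_loose = false_positive_rate(0.20)
print(rate_strict, rate_loose)
```

Because both calls use the same seed, they evaluate the same simulated studies, so the looser threshold can only reject as often or more often than the stricter one.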