Hypothesis Test Overview

$H_0$

  • Null Hypothesis. Our default assumption about the value of the population parameter of interest. For example, in simple linear regression (SLR), the null hypothesis is that the true model's slope, $\beta_1$, equals $0$. Note that in this example $H_0$ is a statement about the true slope, not about its estimator $\hat{\beta_1}$, which is a random variable rather than a constant. Until your sample provides enough evidence to invalidate this hypothesis, it is assumed to be true.

$H_a$

  • Alternative Hypothesis, also a hypothesis regarding the population parameter. This is the hypothesis that we want to test. For example, in SLR, we want to test whether the slope is significant, i.e. non-zero, so our $H_a$ is $\beta_1 \neq 0$. Notice that $H_0$ and $H_a$ together do not have to enumerate all the possible behaviors of the parameter. For example, in SLR, you can have $H_0: \beta_1 = 0$; $H_a: \beta_1 > 0$, i.e. your test can be two-tailed or one-tailed. Once you reject the null hypothesis, you may conclude in favor of the alternative hypothesis, but this does not mean that your alternative hypothesis is TRUE, since we can never know the truth about the population.

Test statistic

  • A test statistic is a statistic (a quantity derived from the sample) used in statistical hypothesis testing (from Wikipedia).
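To make "a quantity derived from the sample" concrete, here is a minimal sketch of one common test statistic: the two-sample $t$ statistic built from the difference in sample means. It assumes the pooled-variance (equal variances) form, which the text above does not specify; the function name `pooled_t_statistic` and the data are made up for illustration.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(y1, y2):
    """Return (t0, df) for a two-sample t-test with pooled variance.

    Assumption: both populations share the same variance.
    """
    n1, n2 = len(y1), len(y2)
    # statistics.variance is the sample variance (divides by n - 1).
    s1_sq, s2_sq = variance(y1), variance(y2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t0 = (mean(y1) - mean(y2)) / std_err
    return t0, df


# Illustrative (made-up) samples.
group_1 = [5.1, 4.9, 5.6, 5.8, 6.0]
group_2 = [4.1, 4.5, 4.0, 4.8, 4.6]
t0, df = pooled_t_statistic(group_1, group_2)
print(f"t0 = {t0:.3f}, df = {df}")
```

The sign of `t0` depends on the order of the two samples, which matters later when the test is one-tailed.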

Significance Level $\alpha$

  • See this post for intuitive understanding: https://blog.csdn.net/Bill_Wang_01/article/details/115673440
  • This is the probability of committing a Type I error, i.e. the probability of rejecting your null hypothesis when it is actually true (whereas a Type II error is failing to reject the null hypothesis when it is actually false). This value is usually $5\%$ for t-tests or z-tests, meaning that you accept in advance a $5\%$ probability of rejecting the null hypothesis even though it is true. Keeping $\alpha$ small means you won't reject your default hypothesis until your sample has presented enough evidence against it. In other words, you will not be convinced that your default hypothesis is false until the estimate calculated from your sample is so extreme, compared to the value in the null hypothesis, that holding on to your default hypothesis obviously contradicts the evidence presented by the sample.
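One way to see what "a $5\%$ probability of rejecting a true null hypothesis" means is to simulate it. The sketch below assumes a one-sample, two-tailed z-test with known standard deviation 1 and a true mean of 0, so $H_0$ really holds in every trial; the function name `type_i_error_rate` and all parameter values are chosen only for illustration.

```python
import math
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=0):
    """Fraction of trials in which a TRUE null hypothesis (mean = 0) is rejected."""
    rng = random.Random(seed)
    # Two-tailed critical value of the standard normal distribution.
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / math.sqrt(n))  # known sigma = 1
        if abs(z) >= z_star:
            rejections += 1
    return rejections / trials


print(type_i_error_rate())
```

The printed fraction should land close to `alpha`: even though the null hypothesis is true in every trial, about that share of samples is extreme enough to trigger a (wrong) rejection.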

Critical value of the test statistic given the significance level, $t_{\alpha,n}$

  • This is the value of the test statistic corresponding to the given significance level. You often look it up in the $t$ or $z$ tables for t-tests or z-tests.
  • For example, for a two-tailed $t$-test with a significance level of $5\%$ and sample size $n$, the critical $t$ value is denoted as $t_{\alpha/2,n-1}$.
  • The $t$ distribution is a family of distributions indexed by degrees of freedom, which depend on the sample size. You must therefore specify the sample size so that you can pick the correct $t$ distribution to approximate the sampling distribution of your (standardized) statistic, usually the difference in sample means.
  • Notice that when you are using two samples, the
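As a rough illustration of how the critical value depends on the degrees of freedom, the sketch below computes it numerically instead of reading it from a table. It uses only the Python standard library: the $t$ density is integrated with Simpson's rule and the critical value is found by bisection. The helper names (`t_pdf`, `t_upper_tail`, `t_critical`) are made up for this example, and the sketch assumes $n - 1$ degrees of freedom, as in $t_{\alpha/2,n-1}$ above.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(x, df, steps=400):
    """P(T >= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df, two_tailed=True):
    """Positive critical value t* whose upper-tail area is alpha/2 (or alpha)."""
    tail = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, df) > tail:  # expand until t* is bracketed
        hi *= 2
    for _ in range(40):  # bisection
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for n in (5, 10, 30, 100):
    print(n, round(t_critical(0.05, n - 1), 3))
```

As the sample size grows, the two-tailed critical value shrinks toward the familiar $z$ value of about 1.96, which is why the sample size has to be stated before a critical $t$ value can be looked up.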

$p$-value

  • This is the probability of getting a sample that is at least as extreme as yours, assuming the null hypothesis is true. If this is no greater than the significance level, then you should reject the null hypothesis: the evidence shows that, if the null hypothesis were true, it would be very unlikely to draw an at-least-as-extreme sample, so your data are hard to reconcile with the default hypothesis (and we reject it). However, notice that even if we conclude the null hypothesis is implausible, we cannot know the truth. For the word "extreme": (a) if your test is two-tailed, it means "will produce a test statistic whose absolute value is larger than or equal to that of yours"; (b) if your test is one-tailed, it means "will produce a test statistic whose value is larger than or equal to that of your sample" (when $H_a$ points to the upper tail) or "smaller than or equal to that of your sample" (when $H_a$ points to the lower tail).
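The three meanings of "extreme" translate directly into three tail areas. The sketch below assumes, for simplicity, a z statistic (standard normal under $H_0$) so that it can rely on the standard library's `statistics.NormalDist`; for a $t$ statistic the same logic applies with the $t$ distribution's CDF. The function name `z_p_value` and the labels `"two-sided"`, `"greater"`, `"less"` are chosen for illustration.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """p-value of an observed z statistic under a standard normal null distribution."""
    phi = NormalDist().cdf
    if alternative == "two-sided":
        # At least as extreme in absolute value: both tails.
        return 2 * (1 - phi(abs(z)))
    if alternative == "greater":
        # H_a points to the upper tail.
        return 1 - phi(z)
    if alternative == "less":
        # H_a points to the lower tail.
        return phi(z)
    raise ValueError(f"unknown alternative: {alternative!r}")


alpha = 0.05
z0 = -2.0  # illustrative observed statistic
for alt in ("two-sided", "greater", "less"):
    p = z_p_value(z0, alt)
    print(alt, round(p, 4), "reject H0" if p <= alpha else "fail to reject H0")
```

Note that the same observed statistic can give very different p-values depending on the direction of $H_a$: a strongly negative statistic is strong evidence for a lower-tail alternative and no evidence at all for an upper-tail one.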

Two Ways to Make a Conclusion in a Two-Sample t-test

Using the $p$-value: Compare $p$ to $\alpha$

  • $p \le \alpha$: reject $H_0$
  • Otherwise: fail to reject $H_0$

Using the Critical Region: Compare the test statistic you got to the critical value

  • Critical Region is the set of values of the test statistic for which the null hypothesis will be rejected.

Caveat

  • For two-sample t-tests, when your test is one-tailed, you cannot simply compare $|t_0|$ with $t^*$. For example, suppose your alternative hypothesis is $\mu_1>\mu_2$ and your statistic is $\bar{y_1}-\bar{y_2}$, but you got an extremely negative $t_0$. In this case, you should compare $t_0$ to $t^*$ (positive), and you should not reject the null hypothesis. If you compared $|t_0|$ with $t^*$ instead, you might end up wrongly rejecting the null hypothesis.
  • Pay attention to where the “tail” is when the test is one tailed!
  • Two tailed
    • $|t_0| \ge t^*$: reject $H_0$
    • $|t_0| < t^*$: fail to reject $H_0$
  • One tailed
    • $H_a: \mu_1 < \mu_2$
      • $t_0 \le -t^*$: reject $H_0$
      • $t_0 > -t^*$: fail to reject $H_0$
    • $H_a: \mu_1 > \mu_2$
      • $t_0 \ge t^*$: reject $H_0$
      • $t_0 < t^*$: fail to reject $H_0$
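The decision rules above fit in a few lines of code. This is a minimal sketch, assuming the statistic is based on $\bar{y_1}-\bar{y_2}$ and that the caller supplies the positive critical value $t^*$ that matches the chosen test (for example $t_{\alpha/2}$ for two-tailed, $t_{\alpha}$ for one-tailed). The function name `reject_null` and the labels for the alternatives are made up for illustration.

```python
def reject_null(t0, t_star, alternative):
    """Return True if H0 is rejected under the critical-region rule.

    t0          observed test statistic (based on mean of sample 1 minus sample 2)
    t_star      positive critical value matching the test and significance level
    alternative "two-sided" (mu1 != mu2), "less" (mu1 < mu2) or "greater" (mu1 > mu2)
    """
    if t_star <= 0:
        raise ValueError("t_star must be the positive critical value")
    if alternative == "two-sided":
        return abs(t0) >= t_star
    if alternative == "less":
        return t0 <= -t_star
    if alternative == "greater":
        return t0 >= t_star
    raise ValueError(f"unknown alternative: {alternative!r}")


# The caveat case: H_a is mu1 > mu2, but the statistic is very negative.
t0, t_star = -3.0, 1.7  # illustrative numbers
print(reject_null(t0, t_star, "greater"))  # correct one-tailed rule
print(abs(t0) >= t_star)                   # the mistaken |t0| comparison
```

With these numbers the one-tailed rule does not reject $H_0$, while the mistaken absolute-value comparison would, which is exactly the trap described in the caveat.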