Hypothesis Test Overview

The Well-Built City

Category: Statistics Misc

Article link: https://blog.csdn.net/Bill_Wang_01/article/details/113812946

$H_0$

Null Hypothesis. Our default assumption about the value of the population parameter of interest. For example, in simple linear regression (SLR), the null hypothesis is that the true model’s slope, $\beta_1$, equals $0$. (In this example, $H_0$ says nothing about the estimator of the true slope, $\hat{\beta}_1$, which is a random variable, not a constant.) Until your sample provides enough evidence to invalidate this hypothesis, it is assumed to be true.

$H_a$

Alternative Hypothesis. This is also a hypothesis about the population parameter, and it is the one we are looking for evidence to support. For example, in SLR, we want to test whether the slope is significant, i.e. non-zero, so our $H_a$ is $\beta_1 \neq 0$. Notice that $H_0$ and $H_a$ together do not have to cover all possible values of the parameter. For example, in SLR, you can have $H_0: \beta_1 = 0; H_a: \beta_1 > 0$, i.e. your test can be two-tailed or one-tailed. Once you reject the null hypothesis, you conclude in favor of the alternative hypothesis, but this does not mean that the alternative hypothesis is TRUE, since we can never know the truth about the population.

A test statistic is a statistic (a quantity derived from the sample) used in statistical hypothesis testing (from Wikipedia).
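As a rough illustration of a test statistic in the SLR example above, the sketch below computes the $t$ statistic for $H_0: \beta_1 = 0$, i.e. the estimated slope divided by its standard error. The function name `slope_t_statistic` and the data are made up for illustration; they do not come from the original post.

```python
import math


def slope_t_statistic(x, y):
    """Return (estimated slope, t statistic for H0: beta_1 = 0) in SLR."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx                      # least-squares estimate of beta_1
    intercept = y_bar - slope * x_bar
    # Residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope, with n - 2 degrees of freedom.
    std_err = math.sqrt(sse / (n - 2) / s_xx)
    return slope, slope / std_err


# Hypothetical data, roughly y = 2x plus a little noise.
x = list(range(1, 11))
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.8, 16.1, 18.0, 20.2]
b1_hat, t0 = slope_t_statistic(x, y)
print(f"slope estimate = {b1_hat:.3f}, t statistic = {t0:.1f}")
```

Both values are computed from the sample only, which is what makes them statistics: a different sample would give a different slope estimate and a different $t_0$.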

See this post for intuitive understanding: https://blog.csdn.net/Bill_Wang_01/article/details/115673440
The significance level, $\alpha$, is the probability of committing a Type I error, that is, of rejecting your null hypothesis when it is actually true. (A Type II error, by contrast, is failing to reject the null hypothesis when it is actually false.) This value is usually $5\%$ for t-tests or z-tests, meaning that you accept in advance a $5\%$ probability of rejecting the null hypothesis even though it is true. To avoid committing a Type I error, you do not reject your default hypothesis until your sample has presented enough evidence. In other words, you will not be convinced that your default hypothesis is false until the estimate calculated from your sample is so extreme, compared to the value in the null hypothesis, that holding on to the default hypothesis obviously contradicts the evidence in the sample.
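The claim that a $5\%$ significance level means rejecting a true null hypothesis about $5\%$ of the time can be checked with a small simulation. This is only a sketch under stated assumptions: it uses a two-tailed z-test on the mean with known standard deviation $1$, draws every sample from a population where $H_0: \mu = 0$ really holds, and the function name `type_i_error_rate` is hypothetical.

```python
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=0):
    """Fraction of samples drawn under a true H0 that still get rejected."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the population mean really is 0 (sigma = 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z0 = (sum(sample) / n) / (1.0 / n ** 0.5)  # z statistic for the mean
        if abs(z0) >= z_star:
            rejections += 1                        # a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"empirical Type I error rate: {rate:.3f}")
```

The printed rate should land close to the preset $\alpha$, up to simulation noise.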

The critical value is the value of the test statistic corresponding to the given significance level. For z-tests and t-tests, you usually look it up in a z or t table.
For example, for a two-tailed $t$-test with a significance level of $\alpha = 5\%$ and sample size $n$, the critical $t$ value is denoted as $t_{\alpha/2,n-1}$.
The $t$ distribution is a family of distributions indexed by degrees of freedom, so you must specify the sample size in order to pick the correct $t$ distribution to approximate the sampling distribution of your statistic, usually the difference in sample means.
Notice that when you are using two samples, the

The p-value is the probability of getting a sample at least as extreme as yours, assuming the null hypothesis is true. If the p-value is smaller than the significance level, you should reject the null hypothesis: the evidence shows that, if the null hypothesis were true, drawing an at-least-as-extreme sample would be very unlikely, so the default hypothesis is implausible. However, notice that even if we conclude that the null hypothesis is implausible, we cannot know the truth. The word “extreme” means: (a) if your test is two-tailed, “produces a test statistic whose absolute value is larger than or equal to that of yours”; (b) if your test is one-tailed, “produces a test statistic whose value is larger than or equal to that of your sample” (when $H_a$ points to the upper tail) or “smaller than or equal to that of your sample” (when $H_a$ points to the lower tail).
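The two meanings of “extreme” can be sketched in code. The example below assumes a z-test, so that the standard library's `statistics.NormalDist` can stand in for the sampling distribution; for a t-test you would use a $t$ distribution with the right degrees of freedom instead. The function name `p_value` and the labels `"two-sided"`, `"greater"` and `"less"` are illustrative choices.

```python
from statistics import NormalDist


def p_value(z0, alternative="two-sided"):
    """p-value of an observed z statistic under a standard normal null."""
    std_normal = NormalDist()
    if alternative == "two-sided":
        # Both tails: statistics at least as large as |z0| in absolute value.
        return 2 * (1 - std_normal.cdf(abs(z0)))
    if alternative == "greater":
        # Upper tail only: statistics at least as large as z0.
        return 1 - std_normal.cdf(z0)
    if alternative == "less":
        # Lower tail only: statistics at least as small as z0.
        return std_normal.cdf(z0)
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("two-sided", "greater", "less"):
    print(f"z0 = 2.0, {alt}: p = {p_value(2.0, alt):.4f}")
```

With the same observed statistic, the p-value depends on which tail the alternative hypothesis points to, so the same sample can lead to different conclusions under different choices of $H_a$.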

Critical Region is the set of values of the test statistic for which the null hypothesis will be rejected.
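To make the critical value and the critical region concrete, here is a minimal sketch that replaces the table lookup with a computation. It assumes a z-test, since the standard library only provides the normal distribution; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha=0.05, two_tailed=True):
    """Positive critical z value for the given significance level."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


z_two = z_critical(0.05, two_tailed=True)
z_one = z_critical(0.05, two_tailed=False)
print(f"two-tailed: reject H0 when |z0| >= {z_two:.3f}")          # 1.960
print(f"one-tailed (upper): reject H0 when z0 >= {z_one:.3f}")    # 1.645
```

The critical region for the two-tailed test is everything at or beyond $\pm 1.960$, while the upper one-tailed test puts the whole $5\%$ in one tail and so starts its critical region at $1.645$.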

For two-sample t-tests, when your test is one-tailed, you cannot simply compare $|t_0|$ with $t^*$. For example, suppose your alternative hypothesis is $\mu_1>\mu_2$ and your statistic is $\bar{y}_1-\bar{y}_2$, but you get an extremely negative $t_0$. In this case, you should compare $t_0$ to $t^*$ (positive), and you should not reject the null hypothesis. If you compared $|t_0|$ with $t^*$ instead, you might end up wrongly rejecting the null hypothesis.
Pay attention to where the “tail” is when the test is one tailed!
Two tailed
- $|t_0|≥t^*$ : reject $H_0$
- $|t_0|<t^*$ : fail to reject $H_0$
One tailed
- $H_a:\mu_1<\mu_2$
  - $t_0≤-t^*$ : reject $H_0$
  - $t_0>-t^*$ : fail to reject $H_0$
- $H_a:\mu_1>\mu_2$
  - $t_0≥t^*$ : reject $H_0$
  - $t_0<t^*$ : fail to reject $H_0$
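The decision rules above can be sketched as a small function. It takes the positive critical value $t^*$ as an input (looked up in a table, for instance) rather than computing it; the function name `decide`, the labels for the alternatives, and the numbers in the example are hypothetical.

```python
def decide(t0, t_star, alternative="two-sided"):
    """Apply the rejection rule for an observed t0 and a positive t_star."""
    if t_star <= 0:
        raise ValueError("t_star is expected to be the positive critical value")
    if alternative == "two-sided":
        reject = abs(t0) >= t_star
    elif alternative == "less":       # H_a: mu_1 < mu_2, tail is on the left
        reject = t0 <= -t_star
    elif alternative == "greater":    # H_a: mu_1 > mu_2, tail is on the right
        reject = t0 >= t_star
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return "reject H0" if reject else "fail to reject H0"


# An extremely negative statistic with H_a: mu_1 > mu_2 is not evidence for H_a.
print(decide(-3.1, 1.8, "greater"))    # fail to reject H0
print(decide(-3.1, 1.8, "less"))       # reject H0
print(decide(-3.1, 1.8, "two-sided"))  # reject H0
```

The first call is the situation described earlier: comparing $|t_0|$ with $t^*$ would have rejected $H_0$, whereas the one-tailed rule for $\mu_1>\mu_2$ does not.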