Why Can We Compare Alpha and P-Value in Hypothesis Tests?
Reference: https://courses.washington.edu/p209s07/lecturenotes/Week%205_Monday%20overheads.pdf
The Motivating Question
- In hypothesis tests, one way to decide which conclusion to draw (whether to reject the null hypothesis) is to compare the significance level $\alpha$ with the $p$-value. Why can we make this comparison?
$\alpha$, significance level
- Suppose $\alpha \in [0, 1]$, written as a percentage $x\%$ (usually $\alpha = 5\%$). It simply means that, assuming the null hypothesis is true (we never know whether it is true), we are allowed to reject the null hypothesis iff we observe a sample so rare that a sample at least that extreme would have occurred by chance at most $x\%$ of the time.
- Thus, as $\alpha$ gets larger, the minimum standard for considering a random sample “extreme” gets looser (i.e. a sample does not have to be as rare to be considered unusual), which means it becomes easier to reject the null hypothesis, and also more likely that we reject it when it is in fact true.
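To make the "looser standard" concrete, here is a minimal sketch that assumes a two-sided test whose statistic follows a standard normal distribution under the null hypothesis (a z-test, chosen here only because it needs nothing beyond the Python standard library). The function name `two_sided_critical_value` is hypothetical and not part of the referenced lecture notes.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha: float) -> float:
    """Smallest |z| that counts as 'extreme enough' at significance level alpha.

    Assumes the test statistic is standard normal under the null hypothesis.
    """
    # Split alpha evenly between the two tails of the distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.01, 0.05, 0.10):
    threshold = two_sided_critical_value(alpha)
    print(f"alpha = {alpha:.2f} -> reject when |z| >= {threshold:.3f}")
```

Under these assumptions the threshold shrinks as $\alpha$ grows (roughly 2.576, 1.960 and 1.645 for the three levels above), so a less unusual sample is already enough to reject.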
$p$-value
- Once $\alpha$ has been set, a statistic (such as the difference in sample means), which is basically a numerical summary of the sample data, is computed from the sample(s) we obtained.
- Each observed statistic has an associated probability called the $p$-value: the probability, assuming the null hypothesis is true, of obtaining a statistic at least as extreme as the one observed. It is computed from the sampling distribution of the statistic (for example, the $t$ distribution with degrees of freedom determined by the sample size $n$).
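As an illustration of how a statistic is turned into a $p$-value, the sketch below uses a one-sample, two-sided z-test with a known population standard deviation instead of the $t$ distribution mentioned above, so that it runs with the standard library alone. The function name `two_sided_p_value` and all numbers are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(sample_mean: float, null_mean: float,
                      sigma: float, n: int) -> tuple[float, float]:
    """Return (z, p) for a two-sided z-test of H0: mean == null_mean.

    Assumes sigma (the population standard deviation) is known.
    """
    # Statistic: distance of the sample mean from the null mean, in standard errors.
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # p-value: probability, under H0, of a statistic at least this far from 0.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical data: 100 observations averaging 103, H0 mean 100, sigma 15.
z, p = two_sided_p_value(sample_mean=103.0, null_mean=100.0, sigma=15.0, n=100)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

With these invented numbers the statistic is $z = 2$ and the $p$-value is about 0.0455.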
Why Does the Comparison Make Sense?
- As we have observed, $\alpha$ determines how extreme our sample must be to reject the null hypothesis, and the $p$-value measures how extreme our sample actually is. Both are probabilities computed under the null hypothesis, so they live on the same scale: the more extreme the sample needs to be, the smaller $\alpha$ is, and the more extreme the sample turns out to be, the smaller $p$ is. Therefore, if $p \le \alpha$, we know our sample is extreme enough to allow us to reject $H_0$ and conclude that our result (an experimental result, a research result, etc.) is significantly different from $H_0$.
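The argument above says that comparing $p$ with $\alpha$ is the same decision as comparing the statistic with the threshold implied by $\alpha$. A small sketch can check this numerically, again assuming a two-sided test with a standard normal statistic; the helper names `p_value`, `critical_value` and `decisions_agree` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_value(z: float) -> float:
    """Two-sided p-value of a standard normal statistic z."""
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def critical_value(alpha: float) -> float:
    """Threshold on |z| implied by the significance level alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def decisions_agree(z: float, alpha: float) -> bool:
    """True if 'p <= alpha' and '|z| >= critical value' give the same decision."""
    reject_by_p = p_value(z) <= alpha
    reject_by_statistic = abs(z) >= critical_value(alpha)
    return reject_by_p == reject_by_statistic


alpha = 0.05
for z in (0.5, 1.0, 1.9, 2.0, 2.5, 3.0):
    verdict = "reject H0" if p_value(z) <= alpha else "fail to reject H0"
    print(f"z = {z:.1f}: p = {p_value(z):.4f} -> {verdict}; "
          f"rules agree: {decisions_agree(z, alpha)}")
```

The two rules agree because the $p$-value is a decreasing function of $|z|$: a statistic beyond the threshold necessarily has a tail probability below $\alpha$, and vice versa.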
What is “A (Statistically) Significant Result”?
- This statement is also equivalent to “a result that is (statistically) significantly different from $H_0$”.
- This just means a sample that presents enough evidence against the null hypothesis (here, “enough” is equivalent to “statistically significant”), i.e. the sample favors the alternative hypothesis.
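- Thus, “non-significant” just means the sample does not present enough evidence to reject the null hypothesis. This is weaker than saying the null hypothesis is true: we fail to reject $H_0$, but we have not shown it to be correct.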