To determine whether two groups of data truly differ, we need to run a t-test. Python can do this as well. The code and its explanation follow:
import numpy as np
group1 = np.array([12, 15])
group2 = np.array([15, 17])
#find variance for each group
print(np.var(group1), np.var(group2))
import scipy.stats as stats
#perform two sample t-test with equal variances
print(stats.ttest_ind(a=group1, b=group2, equal_var=True))
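The result is:
[image.png: screenshot of the output]
Because the p-value is greater than 0.05, we fail to reject the null hypothesis; that is, the test does not show a significant difference between the two groups.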
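To make clear what the equal-variance test computes, the statistic can also be worked out by hand. The following is a minimal sketch, assuming the standard pooled-variance formula for the two sample t-test; the variable names (`pooled_var`, `std_err`, `t_manual`, `p_manual`) are chosen here for illustration and do not come from the original post.

```python
import math

import numpy as np
from scipy import stats

group1 = np.array([12, 15])
group2 = np.array([15, 17])

n1, n2 = len(group1), len(group2)
# Sample variances (ddof=1), which the pooled t-test is built on
s1_sq = np.var(group1, ddof=1)
s2_sq = np.var(group2, ddof=1)

# Pooled variance and standard error of the difference in means
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# t statistic and degrees of freedom
t_manual = (group1.mean() - group2.mean()) / std_err
df = n1 + n2 - 2
# Two-sided p-value from the survival function of the t distribution
p_manual = 2 * stats.t.sf(abs(t_manual), df)

# Compare against SciPy's own result
result = stats.ttest_ind(a=group1, b=group2, equal_var=True)
print(t_manual, p_manual)
print(result.statistic, result.pvalue)
```

Both lines should print the same pair of numbers, and the p-value is above 0.05, consistent with the conclusion above.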
Now take a different data set:
import numpy as np
group1 = np.array([11, 8])
group2 = np.array([19, 17])
# Next, we'll use the ttest_ind() function from the scipy.stats library
# to conduct a two sample t-test, which uses the following syntax:
#   ttest_ind(a, b, equal_var=True)
# where:
#   a: an array of sample observations for group 1
#   b: an array of sample observations for group 2
#   equal_var: if True, perform a standard independent 2 sample t-test
#     that assumes equal population variances. If False, perform Welch's
#     t-test, which does not assume equal population variances.
#     This is True by default.
# Before we perform the test, we need to decide whether to assume
# the two populations have equal variances. As a rule of thumb, we can
# assume the populations have equal variances if the ratio of the larger
# sample variance to the smaller sample variance is less than 4:1.
# Find the variance of each group (np.var uses ddof=0 by default)
var1, var2 = np.var(group1), np.var(group2)
print(var1, var2)
# Ratio of the larger variance to the smaller one: 2.25 / 1.0 = 2.25
print(max(var1, var2) / min(var1, var2))
# The ratio is less than 4, so we can assume that the population variances are equal.
# Thus, we can proceed to perform the two sample t-test with equal variances:
import scipy.stats as stats
#perform two sample t-test with equal variances
print(stats.ttest_ind(a=group1, b=group2, equal_var=True))
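The result is:
[image.png: screenshot of the output]
That is, pvalue=0.042158511307681217 < 0.05, so the difference between the two groups is statistically significant.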
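The 4:1 rule of thumb from the code comments can be folded into a small wrapper that decides between the pooled test and Welch's test. This is only a sketch: `two_sample_ttest` and its `max_ratio` parameter are hypothetical names introduced here, and the helper assumes both groups have a nonzero variance.

```python
import numpy as np
from scipy import stats


def two_sample_ttest(a, b, max_ratio=4.0):
    """Hypothetical helper: choose the pooled or Welch t-test from the variance ratio."""
    var_a = np.var(a, ddof=1)
    var_b = np.var(b, ddof=1)
    # Assumes neither variance is zero; otherwise the ratio is undefined
    ratio = max(var_a, var_b) / min(var_a, var_b)
    # Rule of thumb: treat the variances as equal when the ratio is below 4
    equal_var = bool(ratio < max_ratio)
    result = stats.ttest_ind(a=a, b=b, equal_var=equal_var)
    return ratio, equal_var, result


group1 = np.array([11, 8])
group2 = np.array([19, 17])
ratio, equal_var, result = two_sample_ttest(group1, group2)
print(ratio, equal_var, result.pvalue)
```

For these data the ratio is 2.25, so the helper runs the equal-variance test and reproduces the p-value above. With a ratio of 4 or more it would call `ttest_ind` with `equal_var=False`, which is Welch's t-test.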
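The result is interpreted as follows:
·The two hypotheses for this particular two sample t-test are:
·H0: µ1 = µ2 (the two population means are equal)
·HA: µ1 ≠ µ2 (the two population means are not equal)
·Because the p-value of our test (0.042158511307681217) is less than alpha = 0.05, we reject the null hypothesis of the test.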
·We have sufficient evidence to say that the means of the two populations are different.
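The decision rule in the list above can be written out directly in code. This is a minimal sketch, assuming a significance level of 0.05 as in the text; the names `ALPHA`, `reject_h0`, and `conclusion` are illustrative.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05  # significance level used in the text

group1 = np.array([11, 8])
group2 = np.array([19, 17])

result = stats.ttest_ind(a=group1, b=group2, equal_var=True)
# Reject H0 (equal population means) when the p-value falls below alpha
reject_h0 = bool(result.pvalue < ALPHA)

if reject_h0:
    conclusion = "reject H0: the population means differ"
else:
    conclusion = "fail to reject H0: no significant difference detected"
print(f"p = {result.pvalue:.4f}, {conclusion}")
```

Note that a p-value above alpha only means the test failed to detect a difference; it does not prove that the two means are equal.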