定理一:设总体X服从正态分布
N
(
μ
1
,
σ
1
2
)
N(\mu_{1},\sigma_{1}^2)
N(μ1,σ12),设总体Y服从正态分布
N
(
μ
2
,
σ
2
2
)
N(\mu_{2},\sigma_{2}^2)
N(μ2,σ22),则统计量
U
=
(
X
ˉ
−
Y
ˉ
)
−
(
μ
1
−
μ
2
)
σ
1
n
1
+
σ
2
n
2
U=\frac{(\bar{X}-\bar{Y})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}}{n_{1}}+\frac{\sigma_{2}}{n_{2}}}}
U=n1σ1+n2σ2(Xˉ−Yˉ)−(μ1−μ2)
服从标准正态分布,即:
U
=
(
X
ˉ
−
Y
ˉ
)
−
(
μ
1
−
μ
2
)
σ
1
n
1
+
σ
2
n
2
∼
N
(
0
,
1
)
U=\frac{(\bar{X}-\bar{Y})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}}{n_{1}}+\frac{\sigma_{2}}{n_{2}}}}\sim N(0,1)
U=n1σ1+n2σ2(Xˉ−Yˉ)−(μ1−μ2)∼N(0,1)
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
from numpy import matlib
import math
plt.rcParams['font.sans-serif']=['SimHei'] #用来正常显示中文标签
plt.rcParams['axes.unicode_minus']=False #用来正常显示负号
n1 = 10 # 样本容量1
m1 = 10000 # 抽样次数1
n2=10#样本容量2
m2=10000#抽样次数2
#N(0,1)
mu1=0
sigma1=1
mu2=0
sigma2=1
X=sigma1 * matlib.randn((n1, m1)) + mu1#模拟抽样,每一列是一个样本容量为n的样本,共抽样m次
Y=sigma2 * matlib.randn((n2, m2)) + mu2
xmean=np.mean(X,axis=0)
ymean=np.mean(Y,axis=0)
Xmean=np.array(xmean).flatten()
Ymean=np.array(ymean).flatten()
U=(Xmean-Ymean-(mu1-mu2))/math.sqrt((sigma1*sigma1)/n1+(sigma2*sigma2)/n2)
c2=stats.norm.isf(0.001, 0,1)
x=np.linspace(-c2, c2, 100)
N, bins, patches =plt.hist(U, 50, density=1, facecolor="yellow", edgecolor="black", alpha=0.7)
plt.plot(x, stats.norm.pdf(x, 0,1),"b", label="N(0,1)")
plt.xlabel('x')
plt.ylabel('pdf')
plt.suptitle("已知方差σ2的两总体的样本均值的分布")
plt.legend()
plt.show()
定理二:设总体X服从正态分布
N
(
μ
1
,
σ
2
)
N(\mu_{1},\sigma^2)
N(μ1,σ2),设总体Y服从正态分布
N
(
μ
2
,
σ
2
)
N(\mu_{2},\sigma^2)
N(μ2,σ2),则
T
=
(
X
ˉ
−
Y
ˉ
)
−
(
μ
1
−
μ
2
)
S
ω
1
n
1
+
1
n
2
∼
t
(
n
1
+
n
2
−
2
)
T=\frac{(\bar{X}-\bar{Y})-(\mu_{1}-\mu_{2})}{S_{\omega}\sqrt{\frac{1}{n_{1}}+\frac{1}{n_{2}}}}\sim t(n_{1}+n_{2}-2)
T=Sωn11+n21(Xˉ−Yˉ)−(μ1−μ2)∼t(n1+n2−2)
其中
S
ω
=
(
n
1
−
1
)
S
1
2
+
(
n
2
−
1
)
S
2
2
n
1
+
n
2
−
2
S_{\omega}=\sqrt{\frac{(n_{1}-1)S_{1}^2+(n_{2}-1)S_{2}^2}{n_{1}+n_{2}-2}}
Sω=n1+n2−2(n1−1)S12+(n2−1)S22
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
from numpy import matlib
import math
plt.rcParams['font.sans-serif']=['SimHei'] #用来正常显示中文标签
plt.rcParams['axes.unicode_minus']=False #用来正常显示负号
n1 = 10 # 样本容量1
m1 = 10000 # 抽样次数1
n2=10#样本容量2
m2=10000#抽样次数2
#N(0,1)
mu1=0
mu2=0
sigma=2
X=sigma * matlib.randn((n1, m1)) + mu1#模拟抽样,每一列是一个样本容量为n的样本,共抽样m次
Y=sigma * matlib.randn((n2, m2)) + mu2
Xmean1=np.mean(X,axis=0)
Xmean2=np.mean(Y,axis=0)
Xmean=np.array(Xmean1).flatten()
Ymean=np.array(Xmean2).flatten()
zhen1=np.var(X, axis=0)#样本方差
zhen2=np.var(Y,axis=0)
S1=np.array(zhen1).flatten()
S2=np.array(zhen2).flatten()
S=np.sqrt(((n1-1)*S1+(n2-1)*S2)/(n1+n2-2))
t=(Xmean-Ymean-(mu1-mu2))/(S*(math.sqrt(1/n1+1/n2)))
c2=stats.t.isf(0.00001, n1+n2-2)
x=np.linspace(-c2, c2, 100)
N, bins, patches =plt.hist(t, 50, density=1, facecolor="yellow", edgecolor="black", alpha=0.7)
plt.plot(x, stats.t.pdf(x, n1+n2-2),"b", label="t(n1+n2-2)")
plt.xlabel('x')
plt.ylabel('pdf')
plt.suptitle("方差σ2相等的两总体的样本均值方差比的分布")
plt.legend()
plt.show()
定理三:设总体X服从正态分布
N
(
μ
1
,
σ
1
2
)
N(\mu_{1},\sigma_{1}^2)
N(μ1,σ12),设总体Y服从正态分布
N
(
μ
2
,
σ
2
2
)
N(\mu_{2},\sigma_{2}^2)
N(μ2,σ22),则统计量
F=
S
1
2
/
σ
1
2
S
2
2
/
σ
2
2
\frac{{S_{1}^2/\sigma_{1}^2}}{{S_{2}^2/\sigma_{2}^2}}
S22/σ22S12/σ12
服从自由度为
(
n
1
−
1
,
n
2
−
1
)
(n_{1}-1,n_{2}-1)
(n1−1,n2−1)的F分布,即
F=
S
1
2
/
σ
1
2
S
2
2
/
σ
2
2
∼
F
(
n
1
−
1
,
n
2
−
1
)
\frac{{S_{1}^2/\sigma_{1}^2}}{{S_{2}^2/\sigma_{2}^2}}\sim F(n_{1}-1,n_{2}-1)
S22/σ22S12/σ12∼F(n1−1,n2−1)
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
from numpy import matlib
import math
plt.rcParams['font.sans-serif']=['SimHei'] #用来正常显示中文标签
plt.rcParams['axes.unicode_minus']=False #用来正常显示负号
n1 = 10 # 样本容量1
m1 = 10000 # 抽样次数1
n2=10#样本容量2
m2=10000#抽样次数2
#N(0,1)
mu1=0
sigma1=1
mu2=0
sigma2=1
X=sigma1 * matlib.randn((n1, m1)) + mu1#模拟抽样,每一列是一个样本容量为n的样本,共抽样m次
Y=sigma2 * matlib.randn((n2, m2)) + mu2
zhen1=np.var(X, axis=0)#样本方差
zhen2=np.var(Y,axis=0)
S1=np.array(zhen1).flatten()
S2=np.array(zhen2).flatten()
F=(S1*sigma2*sigma2)/(S2*sigma1*sigma1)
c2=stats.f.isf(0.001, n1-1,n2-1)
x=np.linspace(0, c2, 100)
N, bins, patches =plt.hist(F, 50, density=1, facecolor="yellow", edgecolor="black", alpha=0.7)
plt.plot(x, stats.f.pdf(x, n1-1,n2-1),"b", label="F(n1-1,n2-1)")
plt.xlabel('x')
plt.ylabel('pdf')
plt.suptitle("已知方差σ2的两总体的样本方差比的分布")
plt.legend()
plt.show()