Some Statistics
Population $X$
Simple random sample $X_i\ (i=1,2,\dots,n)$, i.e. independent and identically distributed (i.i.d.):
- each individual has the same distribution as the population
- the individuals are mutually independent
Gamma function $\Gamma(\gamma)$:
$$\Gamma(\gamma)=\int_0^{+\infty}x^{\gamma-1}\mathrm{e}^{-x}\,dx$$
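As a quick sanity check of the integral definition of the Gamma function, the sketch below compares a plain midpoint-rule approximation with Python's built-in `math.gamma`. The helper name `gamma_numeric`, the cutoff `upper`, and the step count are illustrative assumptions, not part of the original notes; the approximation is only meant for arguments of at least 1, where the integrand stays bounded near 0.

```python
import math


def gamma_numeric(g, upper=60.0, steps=200_000):
    """Approximate Gamma(g) with the midpoint rule on [0, upper].

    Hypothetical helper: assumes g >= 1 so the integrand is bounded near 0.
    The tail beyond `upper` is dropped because e^{-x} makes it negligible.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h          # midpoint of the k-th sub-interval
        total += x ** (g - 1) * math.exp(-x)
    return total * h


approx = gamma_numeric(5.0)
exact = math.gamma(5.0)            # Gamma(5) = 4! = 24
print(approx, exact)
```

For positive integers the Gamma function reduces to a factorial, $\Gamma(n)=(n-1)!$, and $\Gamma(1/2)=\sqrt{\pi}$; both identities are used later when the densities are simplified for small degrees of freedom.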
1. Sample Numerical Characteristics and Sample Moments
Sample mean:
$$\overline{X}=\frac{1}{n}\sum_{i=1}^nX_i$$
Sample variance:
$$S^2=\frac{1}{n-1}\sum_{i=1}^n(X_i-\overline{X})^2$$
Sample standard deviation:
$$S=\sqrt{S^2}=\sqrt{\frac{1}{n-1}\sum_{i=1}^n(X_i-\overline{X})^2}$$
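The three formulas above can be sketched directly in Python. The function names and the data values are made up for illustration; the standard-library `statistics` module is used only as a cross-check, since its `variance` and `stdev` also divide by $n-1$.

```python
import math
import statistics


def sample_mean(xs):
    """X-bar: the arithmetic mean of the observations."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """S^2: squared deviations from the mean, divided by n - 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def sample_std(xs):
    """S: the square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values only

print(sample_mean(data), statistics.mean(data))
print(sample_variance(data), statistics.variance(data))
print(sample_std(data), statistics.stdev(data))
```

Dividing by $n-1$ rather than $n$ is what makes $S^2$ an unbiased estimator of the population variance.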
Sample $k$-th origin (raw) moment:
$$A_k=\frac{1}{n}\sum_{i=1}^nX_i^k$$
Sample $k$-th central moment:
$$B_k=\frac{1}{n}\sum_{i=1}^n(X_i-\overline{X})^k$$
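A minimal sketch of the two kinds of sample moment, taking the origin (raw) moment as $A_k=\frac{1}{n}\sum X_i^k$ and the central moment as $B_k=\frac{1}{n}\sum (X_i-\overline{X})^k$. The function names and the data are hypothetical and chosen only for illustration.

```python
def origin_moment(xs, k):
    """A_k: the mean of the k-th powers of the observations."""
    return sum(x ** k for x in xs) / len(xs)


def central_moment(xs, k):
    """B_k: the mean of the k-th powers of the deviations from X-bar."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values only

a1 = origin_moment(data, 1)   # equals the sample mean
a2 = origin_moment(data, 2)
b1 = central_moment(data, 1)  # zero, up to floating-point rounding
b2 = central_moment(data, 2)
print(a1, a2, b1, b2)
```

Two relations are worth noting: $B_2=A_2-A_1^2$, and $B_2=\frac{n-1}{n}S^2$, so the second central moment differs from the sample variance only in its divisor.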
2. Sampling Distributions
- $\chi^2$ distribution (chi-squared distribution)
Definition:
$$\chi^2=\sum_{i=1}^nX_i^2,\quad X_1,\dots,X_n \text{ mutually independent},\ X_i\sim N(0,1)$$
The sum then follows $\chi^2(n)$, the chi-squared distribution with $n$ degrees of freedom.
Probability density function (pdf):
$$f(x)=\begin{cases}\dfrac{1}{2^{n/2}\,\Gamma(n/2)}\,x^{(n/2)-1}\,\mathrm{e}^{-x/2}, & x>0,\\[2ex] 0, & x\le 0.\end{cases}$$
- $t$ distribution (Student's t-distribution)
Definition:
$$T=\frac{X}{\sqrt{Y/n}},\quad X\sim N(0,1),\ Y\sim \chi^2(n),\ X \text{ and } Y \text{ independent}$$
Probability density function:
$$f(t)=\frac{\Gamma((n+1)/2)}{\sqrt{n\pi}\,\Gamma(n/2)}\left(1+\frac{t^2}{n}\right)^{-(n+1)/2}.$$
- $F$ distribution (F-distribution, named after Ronald A. Fisher)
Definition:
$$F=\frac{U/n_1}{V/n_2},\quad U\sim \chi^2(n_1),\ V\sim \chi^2(n_2),\ U \text{ and } V \text{ independent}$$
Probability density function:
$$f(y)=\begin{cases}\dfrac{\Gamma((n_1+n_2)/2)}{\Gamma(n_1/2)\,\Gamma(n_2/2)}\left(\dfrac{n_1}{n_2}\right)^{n_1/2}y^{(n_1/2)-1}\left(1+\dfrac{n_1y}{n_2}\right)^{-(n_1+n_2)/2}, & y>0,\\[2ex] 0, & y\le 0.\end{cases}$$
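To make the three densities concrete, the sketch below writes them out with `math.gamma`, using the standard forms: the factor $\mathrm{e}^{-x/2}$ in the $\chi^2$ density and the exponent $-(n+1)/2$ in the $t$ density. The function names are hypothetical. Each formula is compared with a closed form that holds for small degrees of freedom: $\chi^2(2)$ has density $\mathrm{e}^{-x/2}/2$, $t(1)$ is the Cauchy density $1/(\pi(1+t^2))$, and $F(2,2)$ has density $1/(1+y)^2$.

```python
import math


def chi2_pdf(x, n):
    """Density of chi-squared with n degrees of freedom."""
    if x <= 0:
        return 0.0
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))


def t_pdf(t, n):
    """Density of Student's t with n degrees of freedom."""
    c = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return c * (1 + t * t / n) ** (-(n + 1) / 2)


def f_pdf(y, n1, n2):
    """Density of F with (n1, n2) degrees of freedom."""
    if y <= 0:
        return 0.0
    c = math.gamma((n1 + n2) / 2) / (math.gamma(n1 / 2) * math.gamma(n2 / 2))
    return (c * (n1 / n2) ** (n1 / 2) * y ** (n1 / 2 - 1)
            * (1 + n1 * y / n2) ** (-(n1 + n2) / 2))


x = 1.7  # arbitrary evaluation point
print(chi2_pdf(x, 2), math.exp(-x / 2) / 2)
print(t_pdf(x, 1), 1 / (math.pi * (1 + x * x)))
print(f_pdf(x, 2, 2), 1 / (1 + x) ** 2)
```

The same values can be obtained from `scipy.stats.chi2.pdf`, `scipy.stats.t.pdf` and `scipy.stats.f.pdf`, which is what the plotting script relies on.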
Code for plotting the probability density functions
# -*- coding: utf-8 -*-
"""
Created on Sat Nov 14 19:28:14 2020
@author : MrX_OvO
@email : 1176471624@qq.com
@copyright: MrX_OvO Sat Nov 14 19:28:14 2020
To:
One is never too old to learn.
"""
import numpy as np


def colors(plot_nums=7):
    """Return `plot_nums` random RGB colors, one per curve."""
    color = []
    for i in range(plot_nums):
        color.append(np.random.rand(1, 3).tolist()[0])
    return color


def labels(plot_nums=7, types='chi2'):
    """Return legend labels giving each curve's degrees of freedom."""
    names = []
    for i in range(plot_nums):
        if types == 'F':
            names.append(r'n1=%i,n2=%i' % (i ** 4 + 10, (i + 2) * 20))
        else:
            names.append(r'n=%i' % (i + 1))
    return names


def plot(color, labels, types='chi2', plot_nums=7, p=10e-8):
    """Plot `plot_nums` pdf curves of the chi2, Student or F distribution."""
    import matplotlib.pyplot as plt
    from scipy import stats as st

    size, fsize = 3000, (16, 9)
    # Spacing, in grid points, between the annotated points of the chi2 curves.
    t0 = int(size / plot_nums * .164)
    fig, ax = plt.subplots(figsize=fsize)
    # The x grid runs from the p quantile to the 1 - p quantile.
    x = np.linspace(st.chi2.ppf(p, plot_nums), st.chi2.ppf(1 - p, plot_nums), size)
    if types == 'Student':
        x = np.linspace(st.t.ppf(p, plot_nums), st.t.ppf(1 - p, plot_nums), size)
    elif types == 'F':
        x = np.linspace(
            st.f.ppf(p, (plot_nums - 1) ** 4 + 10, (plot_nums + 1) * 20),
            st.f.ppf(1 - p, (plot_nums - 1) ** 4 + 10, (plot_nums + 1) * 20),
            size)
    # One arrow curvature per curve; %i truncates the ratio to an integer digit.
    connstyle = []
    for i in range(plot_nums):
        connstyle.append(r'arc3, rad = .%i' % ((plot_nums - i) / 5))
    for i in range(plot_nums):
        # Default case: chi2 with i + 1 degrees of freedom.
        y = st.chi2.pdf(x, i + 1)
        text = r'n=%i' % (i + 1)
        x_, y_ = x[(i + 1) * t0], y[(i + 1) * t0]
        if types == 'Student':
            t = int(size / 2)  # annotate at the middle of the grid (the peak)
            y = st.t.pdf(x, i + 1)
            x_, y_ = x[t], y[t]
        elif types == 'F':
            t = int(size / 6)
            y = st.f.pdf(x, i ** 4 + 10, (i + 2) * 20)
            x_, y_ = x[t], y[t]
            text = r'n1=%i,n2=%i' % (i ** 4 + 10, (i + 2) * 20)
        ax.plot(x, y, color=color[i])
        ax.annotate(text, xy=(x_, y_),
                    xytext=((plot_nums - i + 1) / plot_nums * 80,
                            (plot_nums - i + 1) / plot_nums * 10),
                    textcoords='offset points',
                    arrowprops=dict(arrowstyle='->',
                                    connectionstyle=connstyle[i],
                                    color=color[i]))
    ax.set_xlabel('x')
    ax.set_ylabel('y=f(x)')
    ax.set_xlim(-0.1, 20)
    ax.set_ylim(-0.001, 0.5)
    ax.legend(labels)
    ax.set_title(r'pdf of %s' % types)
    if types == 'Student':
        # 'data' positions the axis by value: pin the x-axis at y = 0.
        ax.spines['bottom'].set_position(('data', 0))
        # 'axes' positions the axis by fraction: put the y-axis at 50% of the
        # x-axis, i.e. at its midpoint.
        ax.spines['left'].set_position(('axes', 0.5))
        # Hide the right and top borders of the figure.
        ax.spines['right'].set_color('none')
        ax.spines['top'].set_color('none')
        ax.set_xlim(-5, 5)
        ax.set_ylim(-0.01, 0.4)
    elif types == 'F':
        ax.set_xlim(-0.01, 5)
        ax.set_ylim(-0.01, 2)
    # Save the figure in the current working directory.
    plt.savefig(r'%s.png' % types)
    plt.show()


def main():
    type1, type2 = 'Student', 'F'
    plot_nums1, plot_nums2 = 5, 3
    color0, color1, color2 = colors(), colors(plot_nums1), colors(plot_nums2)
    labels0, labels1, labels2 = labels(), labels(plot_nums1), labels(plot_nums2, type2)
    # chi2
    plot(color0, labels0)
    # Student
    plot(color1, labels1, type1, plot_nums1)
    # F
    plot(color2, labels2, type2, plot_nums2)


if __name__ == '__main__':
    main()