Statistics Series (3): Parameter Estimation with Python

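Point Estimation

The sample mean is used as the estimate of the population mean, and the sample proportion as the estimate of the population proportion.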

import numpy as np

x = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0] # 1 = success, 0 = failure; estimate the success-to-failure ratio (the odds)

theta = np.mean(x)
h = theta/(1-theta)
print(h)
1.2500000000000002
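
The same plug-in idea can be wrapped in a small helper. This is only a sketch: the function name `estimate_proportion_and_odds` is hypothetical, and it assumes the input is a list of 0/1 outcomes like `x` above.

```python
import numpy as np

def estimate_proportion_and_odds(outcomes):
    """Plug-in point estimates for a 0/1 sample: success proportion and odds."""
    outcomes = np.asarray(outcomes, dtype=float)
    p_hat = outcomes.mean()          # sample proportion estimates the population proportion
    if p_hat == 1.0:
        raise ValueError("odds are undefined when every outcome is a success")
    return p_hat, p_hat / (1.0 - p_hat)

x = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0]
p_hat, odds = estimate_proportion_and_odds(x)
print(round(p_hat, 4), round(odds, 4))
```

Here 10 of the 18 trials are successes, so the estimated proportion is 10/18 and the estimated odds are 10/8.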

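Interval Estimation

Confidence interval for the mean of a single normal population

  • Confidence interval for $\mu$ when the variance $\sigma^{2}$ is known

    Example: $X \sim N(\mu, 0.6)$, with sample x=[14.6, 15.1, 14.9, 14.8, 15.2, 15.1]. Find the 95% confidence interval for the population mean.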

import numpy as np
import scipy.stats as ss

n = 6; p = 0.025; sigma = np.sqrt(0.6)  # p is the tail probability alpha/2; sigma is the known standard deviation
x = [14.6, 15.1, 14.9, 14.8, 15.2, 15.1]
xbar = np.mean(x)
low = xbar-ss.norm.ppf(q=1-p)*(sigma/np.sqrt(n))
up = xbar+ss.norm.ppf(q=1-p)*(sigma/np.sqrt(n))

print([low,up])
[14.330204967695439, 15.569795032304564]
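
As a cross-check, SciPy can produce the same interval directly through the `interval` method of `ss.norm`. The wrapper name `z_interval` below is hypothetical, and the sketch assumes the population variance is supplied by the caller.

```python
import numpy as np
import scipy.stats as ss

def z_interval(sample, sigma2, conf=0.95):
    """Two-sided confidence interval for the mean when the variance sigma2 is known."""
    sample = np.asarray(sample, dtype=float)
    se = np.sqrt(sigma2 / sample.size)   # standard error of the sample mean
    # The confidence level is passed positionally because its keyword name
    # differs between SciPy versions.
    return ss.norm.interval(conf, loc=sample.mean(), scale=se)

low, up = z_interval([14.6, 15.1, 14.9, 14.8, 15.2, 15.1], sigma2=0.6)
print([low, up])
```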
  • Confidence interval for $\mu$ when the variance $\sigma^{2}$ is unknown

    Example: $X \sim N(\mu, \sigma^{2})$, with sample x=[99.3, 98.7, 100.5, 101.2, 98.3, 99.7, 99.5, 102.1, 100.5]. Find the 95% confidence interval for the population mean.

import numpy as np
import scipy.stats as ss

n = 9; p = 0.025
x = [99.3, 98.7, 100.5, 101.2, 98.3, 99.7, 99.5, 102.1, 100.5]
xbar = np.mean(x)
s2 = np.var(x, ddof=1) # sample variance: divide by n-1
s = np.sqrt(s2)
low = xbar-ss.t.ppf(1-p, n-1)*(s/np.sqrt(n))
up = xbar+ss.t.ppf(1-p, n-1)*(s/np.sqrt(n))

print([low,up])
[99.04599342616191, 100.90956212939363]
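
The same t-based interval can also be obtained from `ss.t.interval` together with `ss.sem`, which computes s/sqrt(n) with the n-1 divisor by default. The helper name `t_interval` is an illustrative assumption, not part of the original post.

```python
import numpy as np
import scipy.stats as ss

def t_interval(sample, conf=0.95):
    """Two-sided confidence interval for the mean when the variance is unknown."""
    sample = np.asarray(sample, dtype=float)
    df = sample.size - 1                       # degrees of freedom of the t distribution
    return ss.t.interval(conf, df, loc=sample.mean(), scale=ss.sem(sample))

x = [99.3, 98.7, 100.5, 101.2, 98.3, 99.7, 99.5, 102.1, 100.5]
low, up = t_interval(x)
print([low, up])
```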

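Confidence interval for the variance of a single normal population

  • Confidence interval for the variance $\sigma^{2}$ when $\mu$ is unknown

    In practice $\mu$ is rarely known, so only the unknown-$\mu$ case is shown here. Example: the sample size is 16, the sample mean is 12.8, and the sample variance is 0.0023. Find the 95% confidence interval for the population variance.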

from scipy.stats import chi2

n=16; s2=0.0023; p=0.025
low = ((n-1)*s2)/chi2.ppf(1-p,n-1)
up = ((n-1)*s2)/chi2.ppf(p,n-1)
print([low,up])
[0.0012550751937877684, 0.005509300678006194]
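
For reference, the interval computed above follows from the usual chi-square pivot for a normal sample. In the sketch below, $S^{2}$ is the unbiased sample variance, $\alpha = 0.05$ (so `p` in the code is $\alpha/2$), and $\chi^{2}_{q}(n-1)$ denotes the $q$-quantile, which is what `chi2.ppf(q, n-1)` returns.

```latex
\frac{(n-1)S^{2}}{\sigma^{2}} \sim \chi^{2}(n-1)
\quad\Longrightarrow\quad
P\!\left(\chi^{2}_{\alpha/2}(n-1) \le \frac{(n-1)S^{2}}{\sigma^{2}} \le \chi^{2}_{1-\alpha/2}(n-1)\right) = 1-\alpha
\quad\Longrightarrow\quad
\frac{(n-1)S^{2}}{\chi^{2}_{1-\alpha/2}(n-1)} \;\le\; \sigma^{2} \;\le\; \frac{(n-1)S^{2}}{\chi^{2}_{\alpha/2}(n-1)}
```

Note that the larger quantile ends up in the lower bound, because $\sigma^{2}$ sits in the denominator of the pivot.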

Confidence interval for the difference of means of two normal populations

  • Confidence interval for $\mu_{1}-\mu_{2}$ when $\sigma_{1}^{2}$ and $\sigma_{2}^{2}$ are known

    Example: sample x=[628, 583, 510, 554, 612, 523, 530, 615] with $x \sim N(\mu_{1}, 2140)$, and sample y=[535, 433, 398, 470, 567, 480, 498, 560, 503, 426] with $y \sim N(\mu_{2}, 3250)$. Find the 95% confidence interval for the difference of the two population means.

import numpy as np
import scipy.stats as ss

x = [628, 583, 510, 554, 612, 523, 530, 615]
y = [535, 433, 398, 470, 567, 480, 498, 560, 503, 426]
n1 = len(x); n2 = len(y)
xbar = np.mean(x); ybar = np.mean(y)
x_s2 = 2140; y_s2=3250; p=0.025
low = xbar-ybar-ss.norm.ppf(q=1-p)*np.sqrt(x_s2/n1+y_s2/n2)
up = xbar-ybar+ss.norm.ppf(q=1-p)*np.sqrt(x_s2/n1+y_s2/n2)
print([low,up])
[34.66688380095825, 130.08311619904174]
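
The formula behind this calculation, stated as a short reference sketch: with independent samples and known variances, the difference of sample means is itself normal, which gives the interval directly. Here $z_{1-\alpha/2}$ is the standard normal quantile returned by `ss.norm.ppf(1-p)`.

```latex
\bar{X}-\bar{Y} \sim N\!\left(\mu_{1}-\mu_{2},\; \frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}\right)
\quad\Longrightarrow\quad
(\bar{x}-\bar{y}) \pm z_{1-\alpha/2}\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}
```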
  • Confidence interval for $\mu_{1}-\mu_{2}$ when the two population variances are unknown but known to be equal, $\sigma_{1}^{2}=\sigma_{2}^{2}=\sigma^{2}$

    Example: sample x=[628, 583, 510, 554, 612, 523, 530, 615] with $x \sim N(\mu_{1}, \sigma^{2})$, and sample y=[535, 433, 398, 470, 567, 480, 498, 560, 503, 426] with $y \sim N(\mu_{2}, \sigma^{2})$. Find the 95% confidence interval for the difference of the two population means.

import numpy as np
import scipy.stats as ss

x = [628, 583, 510, 554, 612, 523, 530, 615]
y = [535, 433, 398, 470, 567, 480, 498, 560, 503, 426]
n1 = len(x); n2 = len(y)
xbar = np.mean(x); ybar = np.mean(y)
x_s2 = np.var(x, ddof=1); y_s2 = np.var(y, ddof=1); p = 0.025  # unbiased sample variances (divide by n-1)
s2 = ((n1-1)*x_s2+(n2-1)*y_s2)/(n1+n2-2)  # pooled variance
low = xbar-ybar-ss.t.ppf(1-p, n1+n2-2)*np.sqrt(s2*(1/n1+1/n2))  # the t statistic has n1+n2-2 degrees of freedom
up = xbar-ybar+ss.t.ppf(1-p, n1+n2-2)*np.sqrt(s2*(1/n1+1/n2))
print([low,up])
# prints approximately [29.47, 135.28]
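
A reusable version of the pooled-variance interval might look like the sketch below. The function name `pooled_t_interval` is hypothetical; the sketch assumes equal population variances, uses unbiased sample variances (`ddof=1`), and uses `n1 + n2 - 2` degrees of freedom, as the pooled-variance formula requires.

```python
import numpy as np
import scipy.stats as ss

def pooled_t_interval(x, y, conf=0.95):
    """CI for mu1 - mu2 with unknown but equal variances (pooled t interval)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two unbiased sample variances.
    s2_pooled = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / df
    se = np.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    t_crit = ss.t.ppf(1 - (1 - conf) / 2, df)
    diff = x.mean() - y.mean()
    return diff - t_crit * se, diff + t_crit * se

x = [628, 583, 510, 554, 612, 523, 530, 615]
y = [535, 433, 398, 470, 567, 480, 498, 560, 503, 426]
low, up = pooled_t_interval(x, y)
print([low, up])
```

The interval is centered on the difference of sample means, 569.375 - 487 = 82.375.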

Confidence interval for the ratio of variances of two normal populations

  • Confidence interval for $\sigma_{1}^{2} / \sigma_{2}^{2}$ when $\mu_{1}$ and $\mu_{2}$ are unknown

    Example: x=[20.5, 19.8, 19.7, 20.4, 20.1, 20.0, 19.0, 19.9] with $x \sim N(\mu_{1}, \sigma_{1}^{2})$, and y=[20.7, 19.8, 19.5, 20.8, 20.4, 19.6, 20.2] with $y \sim N(\mu_{2}, \sigma_{2}^{2})$. Find the 95% confidence interval for the ratio of the two variances.

import numpy as np
from scipy.stats import f

x = [20.5, 19.8, 19.7, 20.4, 20.1, 20.0, 19.0, 19.9]
y = [20.7, 19.8, 19.5, 20.8, 20.4, 19.6, 20.2]

x_s2 = np.var(x, ddof=1); y_s2 = np.var(y, ddof=1)  # unbiased sample variances (divide by n-1)
n1 = len(x); n2 = len(y); p = 0.025
low = x_s2/y_s2/f.ppf(1-p, n1-1, n2-1)
up = x_s2/y_s2/f.ppf(p, n1-1, n2-1)

print([low,up])
# prints approximately [0.1393, 4.0600]
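
The variance-ratio interval can be packaged the same way. This is a sketch under stated assumptions: the name `variance_ratio_interval` is hypothetical, both samples are taken to be independent and normal, and the sample variances use `ddof=1`, since the F distribution with `(n1-1, n2-1)` degrees of freedom applies to the ratio of unbiased variances.

```python
import numpy as np
from scipy.stats import f

def variance_ratio_interval(x, y, conf=0.95):
    """CI for sigma1^2 / sigma2^2 based on the F distribution."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    ratio = x.var(ddof=1) / y.var(ddof=1)     # ratio of unbiased sample variances
    alpha = 1 - conf
    low = ratio / f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)
    up = ratio / f.ppf(alpha / 2, n1 - 1, n2 - 1)
    return low, up

x = [20.5, 19.8, 19.7, 20.4, 20.1, 20.0, 19.0, 19.9]
y = [20.7, 19.8, 19.5, 20.8, 20.4, 19.6, 20.2]
low, up = variance_ratio_interval(x, y)
print([low, up])
```

If the resulting interval contains 1, the data give no evidence at this confidence level that the two variances differ.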

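Confidence interval for a single population proportion

Example: the observed conversion rate of the random variable x is 15% with a sample size of 60. Find the 95% confidence interval for the population conversion rate.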

import numpy as np
import scipy.stats as ss

n = 60; p = 0.025; tr = 0.15 
low = tr-ss.norm.ppf(q=1-p)*np.sqrt(tr*(1-tr)/n)
up = tr+ss.norm.ppf(q=1-p)*np.sqrt(tr*(1-tr)/n)

print([low,up])
[0.05965012454920031, 0.24034987545079967]
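
A small helper for the normal-approximation interval, working from counts rather than a precomputed rate. The name `proportion_interval` is hypothetical, and 9 conversions out of 60 is an assumed count chosen only because it matches the 15% rate and n = 60 used in the code above. The bounds are clipped to [0, 1], which is an added safeguard rather than part of the original calculation.

```python
import numpy as np
import scipy.stats as ss

def proportion_interval(successes, n, conf=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z = ss.norm.ppf(1 - (1 - conf) / 2)
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / n)
    # Clip to the valid range of a proportion.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)

low, up = proportion_interval(9, 60)   # 9/60 = 15% conversion rate
print([low, up])
```

The approximation relies on n being large enough for the sample proportion to be roughly normal; with very small samples or rates near 0 or 1 it becomes unreliable.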

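Summary

The two most common cases are estimating the population mean $\mu$ when the population variance is unknown, and estimating the population proportion p when the population follows a binomial distribution. Parameter estimation in other settings works the same way: apply the formula that matches the case.

Let's keep learning together~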
