Tests for normality in Python (with code)

WeChat official account: pythonEducation

Modeling and statistics projects, QQ: 231469242

Contents:

1. Shapiro-Wilk test

Sample size below 50.

2. normaltest

Sample size below 50. normaltest uses the D’Agostino–Pearson omnibus test, and each group should contain more than 20 samples.

3. Lilliefors test

- For intermediate sample sizes, the Lilliefors test is a good choice, since the original Kolmogorov-Smirnov test is unreliable when the mean and standard deviation of the distribution are not known.

4. Kolmogorov-Smirnov test

- The Kolmogorov-Smirnov test should only be used for large sample sizes (> 300).
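The remark about unknown mean and standard deviation can be made concrete with a small experiment. The following is only a sketch on simulated data: the seed, the sample size and the variable names are assumptions chosen for illustration, not part of the original post.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated normal data whose mean and std are NOT 0 and 1.
data = rng.normal(loc=20, scale=5, size=500)

# kstest(data, "norm") compares against the standard normal N(0, 1),
# so it rejects these data although they are normally distributed.
p_standard = stats.kstest(data, "norm").pvalue

# Plugging in the sample mean and std avoids that, but the p-value then
# tends to be too large, because the parameters were estimated from the
# same data. The Lilliefors test is the variant that corrects for this.
p_fitted = stats.kstest(
    data, "norm", args=(data.mean(), data.std(ddof=1))
).pvalue

print(f"kstest against N(0, 1):       p = {p_standard:.3g}")
print(f"kstest against N(mean, std):  p = {p_fitted:.3g}")
```

The first p-value is practically zero, the second is not, which is why a plain Kolmogorov-Smirnov test needs either known distribution parameters or the Lilliefors correction.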

Latest version of the code

# -*- coding: utf-8 -*-
'''
Author: Toby
QQ: 231469242, all rights reserved, no commercial use
WeChat official account: pythonEducation
'''
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
# additional packages
# (the function is spelled "lilliefors" in current statsmodels releases)
from statsmodels.stats.diagnostic import lilliefors

group1 = [2, 3, 7, 2, 6]
group2 = [10, 8, 7, 5, 10]
group3 = [10, 13, 14, 13, 15]
list_groups = [group1, group2, group3]
list_total = group1 + group2 + group3

# Normality test: pick the test according to the sample size
def check_normality(testData):
    # 20 < sample size < 50: use normaltest (D'Agostino-Pearson)
    if 20 < len(testData) < 50:
        p_value = stats.normaltest(testData)[1]
        if p_value < 0.05:
            print("use normaltest")
            print("data are not normally distributed")
            return False
        else:
            print("use normaltest")
            print("data are normally distributed")
            return True

    # Remaining samples below 50 (at most 20 values here): use Shapiro-Wilk
    if len(testData) < 50:
        p_value = stats.shapiro(testData)[1]
        if p_value < 0.05:
            print("use shapiro:")
            print("data are not normally distributed")
            return False
        else:
            print("use shapiro:")
            print("data are normally distributed")
            return True

    # 50 <= sample size <= 300: use the Lilliefors test
    if 300 >= len(testData) >= 50:
        p_value = lilliefors(testData)[1]
        if p_value < 0.05:
            print("use lilliefors:")
            print("data are not normally distributed")
            return False
        else:
            print("use lilliefors:")
            print("data are normally distributed")
            return True

    # Sample size > 300: use the Kolmogorov-Smirnov test.
    # kstest(testData, 'norm') alone compares against N(0, 1), so the
    # sample mean and standard deviation are passed in explicitly.
    if len(testData) > 300:
        params = (np.mean(testData), np.std(testData, ddof=1))
        p_value = stats.kstest(testData, 'norm', args=params)[1]
        if p_value < 0.05:
            print("use kstest:")
            print("data are not normally distributed")
            return False
        else:
            print("use kstest:")
            print("data are normally distributed")
            return True

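# Run the normality test on every sample group
def NormalTest(list_groups):
    for group in list_groups:
        # normality test for the current group
        status = check_normality(group)
        if not status:
            return False
    # every group passed
    return True

# Run the normality test on all sample groups
NormalTest(list_groups)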
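The three example groups contain only five values each, so they all end up in the Shapiro-Wilk branch. To see the actual numbers behind the True/False answer, a small report helper can be used. This is a sketch only: `shapiro_report` and the `alpha` parameter are hypothetical names introduced here, not part of the original code.

```python
from scipy import stats

groups = {
    "group1": [2, 3, 7, 2, 6],
    "group2": [10, 8, 7, 5, 10],
    "group3": [10, 13, 14, 13, 15],
}

def shapiro_report(named_groups, alpha=0.05):
    """Return {name: (W statistic, p-value, not rejected at alpha)}."""
    report = {}
    for name, values in named_groups.items():
        statistic, p_value = stats.shapiro(values)
        report[name] = (statistic, p_value, bool(p_value >= alpha))
    return report

report = shapiro_report(groups)
for name, (w, p, not_rejected) in report.items():
    print(f"{name}: W = {w:.3f}, p = {p:.3f}, normality not rejected: {not_rejected}")
```

With so few observations the test has little power, so a result of "not rejected" should be read as "no evidence against normality" rather than as proof of normality.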

P-P plots and Q-Q plots lead to very similar conclusions. If the data follow a normal distribution, the points will lie close to the line y = x.

In all three cases the results are similar: if the two distributions being compared are similar, the points will approximately lie on the line y = x. If the distributions are linearly related, the points will approximately lie on a line, but not necessarily on the line y = x (Fig. 7.1).

In Python, a probability plot can be generated with the command

stats.probplot(data, plot=plt)

https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.probplot.html
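Called without the plot argument, stats.probplot returns the numbers behind the plot, which makes the "linearly related" case easy to inspect. The sketch below uses simulated data; the seed, sample size and variable names are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Normal data with mean 20 and std 5, i.e. linearly related to N(0, 1).
data = rng.normal(loc=20, scale=5, size=200)

# Without plot=..., probplot returns the theoretical quantiles, the ordered
# sample values, and a least-squares line fitted through the points.
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(
    data, dist="norm"
)

print(f"slope     = {slope:.2f}  (roughly the standard deviation)")
print(f"intercept = {intercept:.2f}  (roughly the mean)")
print(f"r         = {r:.4f}  (close to 1 for normal data)")
```

The points fall on a straight line, but its slope and intercept reflect the scale and location of the data, so the line is y = x only when the data are already standardized.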

a) Probability-Plots

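Probability plots are used to assess distributions visually: quantiles are plotted against each other to compare probability distributions.

The sample quantiles are simply your original sample data, in sorted order.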

sample distribution

In statistics different tools are available for the visual assessments of distributions.

A number of graphical methods exist for comparing two probability distributions by plotting their quantiles, or closely related parameters, against each other:

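# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

# 100 random draws from a normal distribution with mean 20 and std 5
measurements = np.random.normal(loc=20, scale=5, size=100)
# Plot the ordered sample values against theoretical normal quantiles
stats.probplot(measurements, dist="norm", plot=plt)
plt.show()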

Fig. 7.1 Probability-plot, to check for normality of a

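Here 100 randomly generated, normally distributed points are tested for normality. The probability plot shows that the 100 points lie close to the straight reference line (the least-squares fit drawn by probplot, which coincides with y = x only for standardized data), so the data are well described by a normal distribution.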

Q-Q plot (quantile-quantile plot)

http://baike.baidu.com/link?url=o9Z7vr6VdvGAtTRO3RYxQbVu56U_XDaSdibPeVcidMJQ7B6LcAUBHcIro4tLf5BSI5Pu-59W4SPNZ-zRFJ8_FgL3dxJLaUdY0JiB2xUmqie

A Q-Q plot is used to check visually whether a set of data comes from a given distribution, or to check whether two sets of data ...
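The first use, checking one data set against a given distribution, can be sketched by hand to show what a Q-Q plot actually compares. This is a minimal illustration on simulated data; the plotting positions (i - 0.5) / n are one common convention among several, and all variable names are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sample = rng.normal(loc=20, scale=5, size=100)

n = len(sample)
# Sample quantiles: the data in sorted order.
sample_q = np.sort(sample)
# Theoretical quantiles of N(0, 1) at the plotting positions (i - 0.5) / n.
probs = (np.arange(1, n + 1) - 0.5) / n
theoretical_q = stats.norm.ppf(probs)

# Standardizing the sample puts both axes on the same scale,
# so that y = x becomes the reference line.
standardized_q = (sample_q - sample.mean()) / sample.std(ddof=1)

# A correlation close to 1 means the points lie close to a straight line.
corr = np.corrcoef(theoretical_q, sample_q)[0, 1]
print(f"correlation between theoretical and sample quantiles: {corr:.4f}")
```

Plotting theoretical_q against standardized_q, for example with matplotlib's scatter, gives the Q-Q plot itself; strong curvature or a low correlation would suggest that the data do not follow the assumed distribution.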
