Calculating the probability of a random variable in a distribution with Python

Given a mean and standard-deviation defining a normal distribution, how would you calculate the following probabilities in pure-Python (i.e. no Numpy/Scipy or other packages not in the standard library)?

The probability of a random variable r where r < x or r <= x.

The probability of a random variable r where r > x or r >= x.

The probability of a random variable r where x > r > y.

I've found some libraries, like Pgnumerics, that provide functions for calculating these, but the underlying math is unclear to me.

Edit: To show this isn't homework, here is my working code for Python <= 2.6, although I'm not sure it handles the boundary conditions correctly.

from math import exp, pi, sqrt

import unittest

def erfcc(x):
    """
    Complementary error function.
    """
    z = abs(x)
    t = 1. / (1. + 0.5*z)
    r = t * exp(-z*z-1.26551223+t*(1.00002368+t*(.37409196+
        t*(.09678418+t*(-.18628806+t*(.27886807+
        t*(-1.13520398+t*(1.48851587+t*(-.82215223+
        t*.17087277)))))))))
    if x >= 0.:
        return r
    else:
        return 2. - r

def normcdf(x, mu, sigma):
    t = x - mu
    y = 0.5*erfcc(-t/(sigma*sqrt(2.0)))
    if y > 1.0:
        y = 1.0
    return y

def normpdf(x, mu, sigma):
    u = (x-mu)/abs(sigma)
    y = (1/(sqrt(2*pi)*abs(sigma)))*exp(-u*u/2)
    return y

def normdist(x, mu, sigma, f):
    # f selects the cumulative distribution (True) or the density (False).
    if f:
        y = normcdf(x, mu, sigma)
    else:
        y = normpdf(x, mu, sigma)
    return y

def normrange(x1, x2, mu, sigma, f=True):
    """
    Calculates probability of random variable falling between two points.
    """
    p1 = normdist(x1, mu, sigma, f)
    p2 = normdist(x2, mu, sigma, f)
    return abs(p1-p2)

Solution

All these are very similar: If you can compute #1 using a function cdf(x), then the solution to #2 is simply 1 - cdf(x), and for #3 it's cdf(x) - cdf(y).
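As a rough sketch of these three relationships, the snippet below uses `statistics.NormalDist` from the standard library as the `cdf`. This assumes Python 3.8 or later and is not part of the original answer; the mean, standard deviation, and bounds are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist  # standard library, Python 3.8+

# Hypothetical example values, chosen only for illustration.
mean, standard_dev = 100.0, 15.0
x, y = 115.0, 85.0  # x is the upper bound, y is the lower bound

dist = NormalDist(mu=mean, sigma=standard_dev)

p_below = dist.cdf(x)                  # #1: P(r < x)
p_above = 1.0 - dist.cdf(x)            # #2: P(r > x)
p_between = dist.cdf(x) - dist.cdf(y)  # #3: P(y < r < x)

print(p_below, p_above, p_between)
```

With these values x and y sit one standard deviation above and below the mean, so `p_between` should come out near the familiar 68%.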

Python has included the (Gauss) error function in the math module since version 2.7, so you can compute the CDF of the normal distribution using the equation from the article you linked to:

import math

print(0.5 * (1 + math.erf((x - mean) / math.sqrt(2 * standard_dev**2))))

where x is the value at which the CDF is evaluated, mean is the mean, and standard_dev is the standard deviation.
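The one-line expression can be wrapped in a reusable function. This is a minimal sketch: the name `normal_cdf` and the check that `standard_dev` is positive are my own additions, and `standard_dev * math.sqrt(2.0)` is just `math.sqrt(2 * standard_dev**2)` rewritten for a positive standard deviation.

```python
import math

def normal_cdf(x, mean, standard_dev):
    """Return P(X <= x) for a normal distribution, using math.erf."""
    if standard_dev <= 0:
        raise ValueError("standard_dev must be positive")
    z = (x - mean) / (standard_dev * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

print(normal_cdf(0.0, 0.0, 1.0))   # at the mean: 0.5
print(normal_cdf(1.96, 0.0, 1.0))  # roughly 0.975
```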

Some notes since what you asked seemed relatively straightforward given the information in the article:

The CDF of a random variable (say X) is the probability that X lies between -infinity and some limit, say x (lower case). For continuous distributions, the CDF is the integral of the PDF. The CDF is exactly what you described for #1: you want some normally distributed random variable to be between -infinity and x (<= x).
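To illustrate that the CDF is the integral of the PDF, here is a sketch that integrates the normal density numerically and compares the result with the `math.erf` formula. The function names, the trapezoidal rule, the step count, and the choice of 10 standard deviations below the mean as a stand-in for -infinity are all assumptions made for this example; it expects x to lie above that lower limit.

```python
import math

def normal_pdf(x, mean, standard_dev):
    """Density of the normal distribution at x."""
    u = (x - mean) / standard_dev
    return math.exp(-0.5 * u * u) / (standard_dev * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mean, standard_dev):
    """Closed-form CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (standard_dev * math.sqrt(2.0))))

def integrate_pdf(x, mean, standard_dev, steps=20000):
    """Trapezoidal rule from 10 standard deviations below the mean up to x."""
    lower = mean - 10.0 * standard_dev  # stand-in for -infinity
    h = (x - lower) / steps
    total = 0.5 * (normal_pdf(lower, mean, standard_dev)
                   + normal_pdf(x, mean, standard_dev))
    for i in range(1, steps):
        total += normal_pdf(lower + i * h, mean, standard_dev)
    return total * h

# The two numbers should agree to several decimal places.
print(integrate_pdf(1.0, 0.0, 1.0), normal_cdf(1.0, 0.0, 1.0))
```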

< and <= (as well as > and >=) are the same for continuous random variables, because the probability that the random variable equals any single point is 0. So whether or not x itself is included doesn't matter when calculating probabilities for continuous distributions.

The probabilities sum to 1: if X is not < x, then it is >= x. So if you have cdf(x), then 1 - cdf(x) is the probability that the random variable X >= x. Since >= is equivalent to > for continuous random variables, this is also the probability that X > x.
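One practical caveat, which goes beyond the original answer: far out in the upper tail, `1 - cdf(x)` can round to exactly 0 in floating point. A possible workaround, sketched here with the hypothetical helper name `normal_sf`, is to compute the upper-tail probability directly from `math.erfc`, which has been in the math module alongside `math.erf` since Python 2.7.

```python
import math

def normal_cdf(x, mean, standard_dev):
    """P(X <= x) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (standard_dev * math.sqrt(2.0))))

def normal_sf(x, mean, standard_dev):
    """P(X > x), computed directly from the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (standard_dev * math.sqrt(2.0)))

# Near the centre the two approaches agree closely.
print(1.0 - normal_cdf(1.0, 0.0, 1.0), normal_sf(1.0, 0.0, 1.0))

# Ten standard deviations out, the subtraction underflows to 0.0,
# while the erfc-based version still returns a tiny positive number.
print(1.0 - normal_cdf(10.0, 0.0, 1.0), normal_sf(10.0, 0.0, 1.0))
```

For everyday values of x either form is fine; the difference only matters when very small tail probabilities are needed.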
