Python lognorm.cdf vs. formula-based implementation not matching

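This article looks at a problem encountered while converting scipy.stats.lognorm.cdf to a Cython function: a formula-based implementation produced results that did not match SciPy. After the fix, the summed difference between the two functions' outputs is only 1.2011928779531548e-15, which is on the order of floating-point rounding error, so the two are essentially identical.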


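Okay, I am converting the scipy.stats.lognorm.cdf function to a Cython function, using the formula from http://www.cs.unitn.it/~taufer/SR/P-LN.pdf, namely 1/2 + 1/2 * erf((ln(x) - mu) / (sigma * sqrt(2))). The results don't match, even though many other references online give the same formula. EDIT: just fixed it; I only had to apply np.log(mu) twice (once inside the function and once when passing the argument). Fixed code: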

import numpy as np
from scipy.stats import lognorm
from scipy.special import erf

def lognormcdf(x, mu, sigma):
    return 0.5 + 0.5*erf((np.log(x)-np.log(mu))/(np.sqrt(2.0)*sigma))

mu = 3.85
sigma = 0.346

x = [-9.997137267734412802e-01,-9.984919506395958377e-01,-9.962951347331251428e-01,-9.931249370374434227e-01,
     -9.889843952429917540e-01,-9.838775407060570410e-01,-9.778093584869183008e-01,-9.707857757637063933e-01,
     -9.628136542558155542e-01,-9.539007829254917414e-01,-9.440558701362560257e-01,-9.332885350430795146e-01,
     -9.216092981453339883e-01,-9.090295709825296777e-01,-8.955616449707269888e-01,-8.812186793850184108e-01,
     -8.660146884971646752e-01,-8.499645278795913139e-01,-8.330838798884008245e-01,-8.153892383391762033e-01,
     -7.968978923903144995e-01,-7.776279096494954635e-01,-7.575981185197071532e-01,-7.368280898020207470e-01,
     -7.153381175730564312e-01,-6.931491993558019926e-01,-6.702830156031409636e-01,-6.467619085141292912e-01,
     -6.226088602037077591e-01,-5.978474702471787694e-01,-5.725019326213811599e-01,-5.465970120650941455e-01,
     -5.201580198817630230e-01,-4.932107892081909473e-01,-4.657816497733580086e-01,-4.378974021720314913e-01,
     -4.095852916783015440e-01,-3.808729816246299582e-01,-3.517885263724216949e-01,-3.223603439005291449e-01,
     -2.926171880384719759e-01,-2.625881203715034751e-01,-2.323024818449739570e-01,-2.017898640957360157e-01,
     -1.710800805386032686e-01,-1.402031372361139672e-01,-1.091892035800611088e-01,-7.806858281343663497e-02,
     -4.687168242159163445e-02,-1.562898442154308370e-02,1.562898442154308370e-02,4.687168242159163445e-02,
     7.806858281343663497e-02,1.091892035800611088e-01,1.402031372361139672e-01,1.710800805386032686e-01,
     2.017898640957360157e-01,2.323024818449739570e-01,2.625881203715034751e-01,2.926171880384719759e-01,
     3.223603439005291449e-01,3.517885263724216949e-01,3.808729816246299582e-01,4.095852916783015440e-01,
     4.378974021720314913e-01,4.657816497733580086e-01,4.932107892081909473e-01,5.201580198817630230e-01,
     5.465970120650941455e-01,5.725019326213811599e-01,5.978474702471787694e-01,6.226088602037077591e-01,
     6.467619085141292912e-01,6.702830156031409636e-01,6.931491993558019926e-01,7.153381175730564312e-01,
     7.368280898020207470e-01,7.575981185197071532e-01,7.776279096494954635e-01,7.968978923903144995e-01,
     8.153892383391762033e-01,8.330838798884008245e-01,8.499645278795913139e-01,8.660146884971646752e-01,
     8.812186793850184108e-01,8.955616449707269888e-01,9.090295709825296777e-01,9.216092981453339883e-01,
     9.332885350430795146e-01,9.440558701362560257e-01,9.539007829254917414e-01,9.628136542558155542e-01,
     9.707857757637063933e-01,9.778093584869183008e-01,9.838775407060570410e-01,9.889843952429917540e-01,
     9.931249370374434227e-01,9.962951347331251428e-01,9.984919506395958377e-01,9.997137267734412802e-01]

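mycdf = lognormcdf(x, np.log(mu), sigma)
scipycdf = lognorm.cdf(x, scale=np.log(mu), s=sigma)

# This line comparing the Scipy function and mine displays the results below.
# np.log of the negative x values gives nan; nan_to_num maps those to 0,
# which matches SciPy's cdf of 0 for x <= 0.
np.sum(np.nan_to_num(mycdf)-scipycdf)

Results:

1.2011928779531548e-15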
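
A single np.sum of the differences can hide per-element errors, because positive and negative deviations cancel. A minimal sketch of an element-wise check, assuming a handful of made-up sample points instead of the x array above (the names scale and max_abs_diff are illustrative, not from the post):

```python
import numpy as np
from scipy.special import erf
from scipy.stats import lognorm


def lognormcdf(x, mu, sigma):
    # Same formula as in the post; note that mu is log-transformed inside.
    return 0.5 + 0.5 * erf((np.log(x) - np.log(mu)) / (np.sqrt(2.0) * sigma))


# Hypothetical sample points, not the x array from the post.
x = np.array([-0.5, 0.25, 0.5, 1.0, 1.5, 3.0])
scale = np.log(3.85)  # the value the post passes as both mu and scale
sigma = 0.346

# The log of a negative number yields nan; silence that warning and map nan to 0.
with np.errstate(invalid="ignore"):
    mycdf = np.nan_to_num(lognormcdf(x, scale, sigma))
scipycdf = lognorm.cdf(x, s=sigma, scale=scale)

# Largest absolute per-element deviation, which cannot cancel out.
max_abs_diff = np.max(np.abs(mycdf - scipycdf))
print(max_abs_diff)
print(np.allclose(mycdf, scipycdf, rtol=0.0, atol=1e-12))
```

If the largest absolute difference is near machine epsilon, the two implementations agree at every point rather than only on average.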

Original: https://stackoverflow.com/questions/37597348

Updated: 2019-12-13 10:00

Best answer

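The original post was edited to reflect the correct formula.

def lognormcdf(x, mu, sigma):
    return 0.5 + 0.5*erf((np.log(x)-np.log(mu))/(np.sqrt(2.0)*sigma))

Pass np.log(mu) in for mu and it works (the question's code passes the same value to lognorm.cdf as scale).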
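
For reference, SciPy documents lognorm with s = sigma and scale = exp(mu), where mu and sigma are the mean and standard deviation of ln(X). The sketch below is one way to write the formula under that convention using only the standard math module, which may be easier to port to Cython. The name lognormcdf_scalar and the choice to treat 3.85 as the median are assumptions for illustration, not part of the original answer.

```python
import math

import numpy as np
from scipy.stats import lognorm


def lognormcdf_scalar(x, mu, sigma):
    """Log-normal CDF where mu and sigma describe ln(X) (hypothetical helper)."""
    if x <= 0.0:
        return 0.0  # the distribution has no probability mass at or below zero
    return 0.5 + 0.5 * math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0)))


# Assumption: 3.85 is taken as the median, so the log-space mean is ln(3.85).
mu = math.log(3.85)
sigma = 0.346

xs = [-1.0, 0.5, 2.0, 3.85, 10.0]  # made-up evaluation points
mine = np.array([lognormcdf_scalar(v, mu, sigma) for v in xs])
ref = lognorm.cdf(xs, s=sigma, scale=np.exp(mu))  # SciPy: scale = exp(mu)

print(np.max(np.abs(mine - ref)))
```

With this convention the function takes the log-space mean directly, so no second np.log call is needed inside it.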

2016-06-06
