Python normal distribution fitting: how to fit a log-normal distribution with SciPy?

Latest recommended article published on 2021-12-30 17:30:08

weixin_39665847

Article tags: python normal distribution fitting

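This post discusses how to fit a log-normal distribution in Python with the `scipy.stats.lognorm.fit` function. The author found that calling the function directly gave results far from the original data, and that rescaling the data and using `curve_fit` gives a more accurate fit. The code example shows how to fit with `curve_fit` and a custom function `log_fit`, and compares the fitted results.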


I want to fit the log-normal parameters mu and sigma to an existing (measured) log-normal distribution.

The measured log-normal distribution is defined by the following x and y arrays:

x:

4.870000000000000760e-09

5.620000000000000859e-09

6.490000000000000543e-09

7.500000000000000984e-09

8.660000000000001114e-09

1.000000000000000021e-08

1.155000000000000085e-08

1.334000000000000067e-08

1.540000000000000224e-08

1.778000000000000105e-08

2.054000000000000062e-08

2.371000000000000188e-08

2.738000000000000099e-08

3.162000000000000124e-08

3.652000000000000541e-08

4.217000000000000637e-08

4.870000000000000595e-08

5.623000000000000125e-08

6.493999999999999784e-08

7.498999999999999850e-08

8.659999999999999460e-08

1.000000000000000087e-07

1.154800000000000123e-07

1.333500000000000129e-07

1.539900000000000177e-07

1.778300000000000247e-07

2.053499999999999958e-07

2.371399999999999913e-07

2.738399999999999692e-07

3.162300000000000199e-07

3.651700000000000333e-07

4.217000000000000240e-07

4.869700000000000784e-07

8.659600000000001124e-07

1.000000000000000167e-06

y:

1.883186407957446899e+11

3.609524622222222290e+11

7.508596384507042236e+11

2.226776878843930664e+12

4.845941940346821289e+12

7.979258430057803711e+12

1.101088735028901758e+13

1.346205871213872852e+13

1.509035024739884375e+13

1.599175638381502930e+13

1.668097844161849805e+13

1.786208191445086719e+13

2.007139089017341016e+13

2.346096336416185156e+13

2.763042850867051953e+13

3.177726578034682031e+13

3.552045143352600781e+13

3.858765218497110156e+13

4.051697248554913281e+13

4.132681209248554688e+13

4.112713068208092188e+13

4.003871248554913281e+13

3.797625966473988281e+13

3.472541513294797656e+13

3.017757826589595312e+13

2.454670317919075000e+13

1.840085110982658984e+13

1.250047161156069336e+13

7.540309609248554688e+12

3.912091102658959473e+12

1.632974141040462402e+12

4.585002890867052002e+11

1.260128910303030243e+11

7.276263267445255280e+09

1.120399584203921509e+10

Plotted, the data looks like this (plot not reproduced here):

When I now use scipy.stats.lognorm.fit like this:

shape, loc, scale = stats.lognorm.fit(y, floc=0)

mu = np.log(scale)

sigma = shape

y_fit = 1 / x * 1 / (sigma * np.sqrt(2*np.pi)) * np.exp(-(np.log(x)-mu)**2/(2*sigma**2))

The resulting y_fit looks like this:

2.774453764650559735e-92

9.215468156399056736e-92

3.066511893903929907e-91

1.022335884325557513e-90

3.371353425505715432e-90

1.107869289600567113e-89

3.632923945686527959e-89

1.186352074527947499e-88

3.843439346384186221e-88

1.241282395050092616e-87

4.012158206798217088e-87

1.283531486148302474e-86

4.102813367932395623e-86

1.306865297124819703e-85

4.149188517768147925e-85

1.309743071360157226e-84

4.121819150664498056e-84

1.289935574540856462e-83

4.028475776631639341e-83

1.251854680594688466e-82

3.876254948575364474e-82

1.194751160823721531e-81

3.669411018320463915e-81

1.122061051084741563e-80

3.418224619543735425e-80

1.037398725542414359e-79

3.134554301786779178e-79

9.436770981828214504e-79

2.828745744939237710e-78

8.447588129217592353e-78

2.512030904806250195e-77

7.442222461482558402e-77

2.195666296758331429e-76

1.598228276801569301e-74

4.622033883255558750e-74

This is obviously very far away from the original y values. I do realize that I haven't used the initial x values at all, so I assume I need to shift (and maybe also scale) the resulting distribution somehow.

However, I can't wrap my head around how to do this. How do I correctly fit a log-normal distribution in Python?
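One likely reason for the mismatch, offered as a hedged note: `scipy.stats.lognorm.fit` estimates parameters from raw samples drawn from the distribution, not from the heights of a curve, so passing the `y` array treats those heights as if they were observations. The sketch below uses synthetic samples (the values of `mu_true` and `sigma_true`, the seed, and the sample size are assumptions chosen for illustration, not taken from the measured data) to show the intended usage. It also checks that the hand-written density from the question agrees with `lognorm.pdf`.

```python
import numpy as np
from scipy import stats

# Assumed illustration values; not derived from the measured data.
mu_true, sigma_true = 0.5, 0.8
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=mu_true, sigma=sigma_true, size=10_000)

# fit() expects raw samples; floc=0 pins the location parameter at zero.
shape, loc, scale = stats.lognorm.fit(samples, floc=0)
mu_hat = np.log(scale)   # scale = exp(mu)
sigma_hat = shape        # shape = sigma

# The explicit formula from the question matches scipy's parameterization.
x_grid = np.linspace(0.1, 10.0, 50)
pdf_manual = (1 / x_grid * 1 / (sigma_hat * np.sqrt(2 * np.pi))
              * np.exp(-(np.log(x_grid) - mu_hat) ** 2 / (2 * sigma_hat ** 2)))
pdf_scipy = stats.lognorm.pdf(x_grid, s=sigma_hat, loc=0, scale=scale)

print(mu_hat, sigma_hat)
print(np.allclose(pdf_manual, pdf_scipy))
```

For data given as (x, y) points on a curve, as in the question, a least-squares fit of the density to those points is one alternative to `lognorm.fit`.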

Solution

It works out of the box with curve_fit if you rescale the data first. I am not sure whether scaling and re-scaling makes sense, though. (This seems to confirm the ansatz.)

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

# x and y are the measured arrays from the question, as NumPy arrays.

def log_fit(x, a, mu, sigma):
    # Log-normal density with an additional amplitude factor a.
    return a / x * 1. / (sigma * np.sqrt(2. * np.pi)) * np.exp(-(np.log(x) - mu)**2 / (2. * sigma**2))

# Rescale the data so that the peak sits at (1, 1).
pp = np.argmax(y)
yM = y[pp]
xM = x[pp]
xR = x / xM
yR = y / yM
print(xM, yM)

# err is the covariance matrix of the fitted parameters.
sol, err = curve_fit(log_fit, xR, yR)
print(sol)

# Map the fitted parameters back to the original units.
scaledSol = [yM * sol[0] * xM, sol[1] + np.log(xM), sol[2]]
print(scaledSol)

# Plain float replaces np.float, which newer NumPy versions no longer provide.
yF = np.fromiter((log_fit(xx, *sol) for xx in xR), float)
yFIR = np.fromiter((log_fit(xx, *scaledSol) for xx in x), float)

fig = plt.figure()
ax = fig.add_subplot(2, 1, 1)
bx = fig.add_subplot(2, 1, 2)
ax.plot(x, y)
ax.plot(x, yFIR)
bx.plot(xR, yR)
bx.plot(xR, yF)
plt.show()

This prints:

>> 7.499e-08 41326812092485.55

>> [2.93003525 0.68436895 0.87481153]

>> [9080465.32138486, -15.72154211628693, 0.8748115349982701]

and a plot of the data with the fitted curves (not reproduced here).

Anyhow, the data does not really look like it follows that fit function.
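The back-transformation used in `scaledSol` follows from substituting x/xM into the scaled model: the amplitude picks up a factor yM * xM, mu shifts by log(xM), and sigma is unchanged. A minimal sketch on synthetic, noise-free data (the `x` grid and the values of `a_true`, `mu_true`, and `sigma_true` are assumptions for illustration, not the measured data) that checks this identity and the recovered parameters:

```python
import numpy as np
from scipy.optimize import curve_fit

def log_fit(x, a, mu, sigma):
    # Log-normal density multiplied by an amplitude a.
    return (a / x / (sigma * np.sqrt(2.0 * np.pi))
            * np.exp(-(np.log(x) - mu) ** 2 / (2.0 * sigma ** 2)))

# Synthetic stand-in for the measured curve (assumed parameter values).
a_true, mu_true, sigma_true = 9.0e6, -15.7, 0.87
x = np.logspace(-8.3, -6.0, 35)
y = log_fit(x, a_true, mu_true, sigma_true)

# Rescale so that the peak sits at (1, 1), then fit with default start values.
pp = np.argmax(y)
xM, yM = x[pp], y[pp]
sol, cov = curve_fit(log_fit, x / xM, y / yM)

# Map the parameters back to the original units.
a_fit = yM * sol[0] * xM
mu_fit = sol[1] + np.log(xM)
sigma_fit = sol[2]

# Scaled model times yM equals the unscaled model with mapped parameters.
identity_ok = np.allclose(yM * log_fit(x / xM, *sol),
                          log_fit(x, a_fit, mu_fit, sigma_fit))
print(a_fit, mu_fit, sigma_fit, identity_ok)
```

Rescaling brings the parameters close to `curve_fit`'s default starting values of 1. That is presumably why the fit converges on the scaled data without an explicit initial guess.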
