Fitting a normal distribution in Python: how do I fit a log-normal distribution with SciPy?

I want to fit the log-normal parameters mu and sigma to an existing (measured) log-normal distribution.

The measured log-normal distribution is defined by the following x and y arrays:

x:

4.870000000000000760e-09

5.620000000000000859e-09

6.490000000000000543e-09

7.500000000000000984e-09

8.660000000000001114e-09

1.000000000000000021e-08

1.155000000000000085e-08

1.334000000000000067e-08

1.540000000000000224e-08

1.778000000000000105e-08

2.054000000000000062e-08

2.371000000000000188e-08

2.738000000000000099e-08

3.162000000000000124e-08

3.652000000000000541e-08

4.217000000000000637e-08

4.870000000000000595e-08

5.623000000000000125e-08

6.493999999999999784e-08

7.498999999999999850e-08

8.659999999999999460e-08

1.000000000000000087e-07

1.154800000000000123e-07

1.333500000000000129e-07

1.539900000000000177e-07

1.778300000000000247e-07

2.053499999999999958e-07

2.371399999999999913e-07

2.738399999999999692e-07

3.162300000000000199e-07

3.651700000000000333e-07

4.217000000000000240e-07

4.869700000000000784e-07

8.659600000000001124e-07

1.000000000000000167e-06

y:

1.883186407957446899e+11

3.609524622222222290e+11

7.508596384507042236e+11

2.226776878843930664e+12

4.845941940346821289e+12

7.979258430057803711e+12

1.101088735028901758e+13

1.346205871213872852e+13

1.509035024739884375e+13

1.599175638381502930e+13

1.668097844161849805e+13

1.786208191445086719e+13

2.007139089017341016e+13

2.346096336416185156e+13

2.763042850867051953e+13

3.177726578034682031e+13

3.552045143352600781e+13

3.858765218497110156e+13

4.051697248554913281e+13

4.132681209248554688e+13

4.112713068208092188e+13

4.003871248554913281e+13

3.797625966473988281e+13

3.472541513294797656e+13

3.017757826589595312e+13

2.454670317919075000e+13

1.840085110982658984e+13

1.250047161156069336e+13

7.540309609248554688e+12

3.912091102658959473e+12

1.632974141040462402e+12

4.585002890867052002e+11

1.260128910303030243e+11

7.276263267445255280e+09

1.120399584203921509e+10

Plotted, the data looks like this (figure not included):

When I now use scipy.stats.lognorm.fit like this:

import numpy as np
from scipy import stats

shape, loc, scale = stats.lognorm.fit(y, floc=0)
mu = np.log(scale)
sigma = shape
y_fit = 1 / x * 1 / (sigma * np.sqrt(2*np.pi)) * np.exp(-(np.log(x)-mu)**2/(2*sigma**2))

The resulting y_fit looks like this:

2.774453764650559735e-92

9.215468156399056736e-92

3.066511893903929907e-91

1.022335884325557513e-90

3.371353425505715432e-90

1.107869289600567113e-89

3.632923945686527959e-89

1.186352074527947499e-88

3.843439346384186221e-88

1.241282395050092616e-87

4.012158206798217088e-87

1.283531486148302474e-86

4.102813367932395623e-86

1.306865297124819703e-85

4.149188517768147925e-85

1.309743071360157226e-84

4.121819150664498056e-84

1.289935574540856462e-83

4.028475776631639341e-83

1.251854680594688466e-82

3.876254948575364474e-82

1.194751160823721531e-81

3.669411018320463915e-81

1.122061051084741563e-80

3.418224619543735425e-80

1.037398725542414359e-79

3.134554301786779178e-79

9.436770981828214504e-79

2.828745744939237710e-78

8.447588129217592353e-78

2.512030904806250195e-77

7.442222461482558402e-77

2.195666296758331429e-76

1.598228276801569301e-74

4.622033883255558750e-74

This is obviously very far away from the original y values. I do realize that I haven't used the initial x values at all, so I assume I need to shift (and maybe also scale) the resulting distribution somehow.

However, I can't wrap my head around how to do this. How do I correctly fit a log-normal distribution in Python?
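For reference, `scipy.stats.lognorm.fit` performs a maximum-likelihood fit on a sample of draws from the distribution, not on (x, y) points of a density curve, which is why passing the y values gives meaningless parameters. A minimal sketch of the intended use follows. The parameters `mu_true` and `sigma_true` and the synthetic samples are made up for illustration and are not the measured data:

```python
import numpy as np
from scipy import stats

# Hypothetical parameters, chosen only for illustration
mu_true, sigma_true = 1.0, 0.5

# Draw samples from a log-normal distribution with those parameters
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=mu_true, sigma=sigma_true, size=20000)

# Fit with the location fixed at 0; SciPy's shape is sigma, scale is exp(mu)
shape, loc, scale = stats.lognorm.fit(samples, floc=0)
mu_hat = np.log(scale)
sigma_hat = shape

print(mu_hat, sigma_hat)
```

With a measured curve rather than raw samples, a least-squares fit of the density to the (x, y) points is the more natural route.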

Solution

It works out of the box with curve_fit if you scale the data. I am not sure if scaling and re-scaling makes sense, though. (this seems to confirm the ansatz)
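The re-scaling can be checked by hand. Here is a short derivation, assuming the fit model is a log-normal density with an amplitude a (the log_fit function in the code that follows). The symbols are introduced here for illustration: x_M and y_M are the coordinates of the peak, and a_R, mu_R, sigma_R are the parameters fitted to the rescaled data.

```latex
\begin{align}
f(x; a, \mu, \sigma) &= \frac{a}{x\,\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right) \\
\frac{y}{y_M} &= f\!\left(\frac{x}{x_M};\, a_R, \mu_R, \sigma_R\right) \\
y &= \frac{y_M\, a_R\, x_M}{x\,\sigma_R\sqrt{2\pi}}
    \exp\!\left(-\frac{\bigl(\ln x - (\mu_R + \ln x_M)\bigr)^2}{2\sigma_R^2}\right) \\
  &= f\!\left(x;\; y_M\, a_R\, x_M,\; \mu_R + \ln x_M,\; \sigma_R\right)
\end{align}
```

So a fit of the rescaled data corresponds exactly to a fit in the original units with a = y_M a_R x_M, mu = mu_R + ln(x_M), and sigma unchanged. The rescaling presumably helps only because it brings the numbers to order one, closer to the default starting values of curve_fit.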

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

# x and y are the measured arrays from the question, as NumPy arrays

def log_fit( x, a, mu, sigma ):
    # log-normal density with an extra amplitude factor a
    return a / x * 1. / (sigma * np.sqrt( 2. * np.pi ) ) * np.exp( -( np.log( x ) - mu )**2 / ( 2. * sigma**2 ) )

# rescale so that the peak of the data sits at (1, 1)
pp = np.argmax( y )
yM = y[ pp ]
xM = x[ pp ]
xR = x/xM
yR = y/yM
print( xM, yM )

sol, err = curve_fit( log_fit, xR, yR )
print( sol )

# map the fitted parameters back to the original units
scaledSol = [ yM * sol[0] * xM , sol[1] + np.log(xM), sol[2] ]
print( scaledSol )

yF = np.fromiter( ( log_fit( xx, *sol ) for xx in xR ), float )
yFIR = np.fromiter( ( log_fit( xx, *scaledSol ) for xx in x ), float )

# top panel: original data and fit; bottom panel: rescaled data and fit
fig = plt.figure()
ax = fig.add_subplot( 2,1, 1)
bx = fig.add_subplot( 2,1, 2)
ax.plot( x, y )
ax.plot( x, yFIR )
bx.plot( xR, yR )
bx.plot( xR, yF )
plt.show()

Providing

>> 7.499e-08 41326812092485.55

>> [2.93003525 0.68436895 0.87481153]

>> [9080465.32138486, -15.72154211628693, 0.8748115349982701]

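and a plot of the data together with the fitted curves (figure not included).

Anyhow, the log-normal does not really look like the right fit function for this data.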
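To check that the rescale, fit, and scale-back procedure recovers known parameters, here is a minimal sketch on synthetic, noise-free data. The "true" values `a_true`, `mu_true`, `sigma_true` and the x grid are made up for illustration. They are chosen to be of the same order of magnitude as the fit above and are not the measured data:

```python
import numpy as np
from scipy.optimize import curve_fit

def log_fit(x, a, mu, sigma):
    """Log-normal density scaled by an amplitude a (same model as above)."""
    return (a / (x * sigma * np.sqrt(2.0 * np.pi))
            * np.exp(-(np.log(x) - mu) ** 2 / (2.0 * sigma ** 2)))

# Hypothetical "true" parameters and grid, for illustration only
a_true, mu_true, sigma_true = 9.0e6, -15.7, 0.87
x = np.logspace(np.log10(5e-9), -6, 35)
y = log_fit(x, a_true, mu_true, sigma_true)

# Rescale so that the peak sits at (1, 1), then fit in rescaled units
pp = np.argmax(y)
xM, yM = x[pp], y[pp]
sol, cov = curve_fit(log_fit, x / xM, y / yM, p0=(1.0, 1.0, 1.0))

# Map the fitted parameters back to the original units
recovered = np.array([yM * sol[0] * xM, sol[1] + np.log(xM), sol[2]])
print(recovered)
```

If the recovered values match the true ones, the procedure itself is sound, and any remaining mismatch on the measured data would point to the model rather than to the rescaling.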
