python 趋势线计算式_Python-计算有错误的趋势线

So I've got some data stored as two lists, and plotted them using

plot(datasetx, datasety)

Then I set a trendline

trend = polyfit(datasetx, datasety)

trendx = []

trendy = []

for a in range(datasetx[0], (datasetx[-1]+1)):

trendx.append(a)

trendy.append(trend[0]*a**2 + trend[1]*a + trend[2])

plot(trendx, trendy)

But I have a third list of data, which is the error in the original datasety. I'm fine with plotting the errorbars, but what I don't know is using this, how to find the error in the coefficients of the polynomial trendline.

So say my trendline came out to be 5x^2 + 3x + 4 = y, there needs to be some sort of error on the 5, 3 and 4 values.

Is there a tool using NumPy that will calculate this for me?

解决方案

I think you can use the function curve_fit of scipy.optimize (documentation). A basic example of the usage:

import numpy as np

from scipy.optimize import curve_fit

def func(x, a, b, c):

return a*x**2 + b*x + c

x = np.linspace(0,4,50)

y = func(x, 5, 3, 4)

yn = y + 0.2*np.random.normal(size=len(x))

popt, pcov = curve_fit(func, x, yn)

Following the documentation, pcov gives:

The estimated covariance of popt. The diagonals provide the variance

of the parameter estimate.

So in this way you can calculate an error estimate on the coefficients. To have the standard deviation you can take the square root of the variance.

Now you have an error on the coefficients, but it is only based on the deviation between the ydata and the fit. In case you also want to account for an error on the ydata itself, the curve_fit function provides the sigma argument:

sigma : None or N-length sequence

If not None, it represents the standard-deviation of ydata. This

vector, if given, will be used as weights in the least-squares

problem.

A complete example:

import numpy as np

from scipy.optimize import curve_fit

def func(x, a, b, c):

return a*x**2 + b*x + c

x = np.linspace(0,4,20)

y = func(x, 5, 3, 4)

# generate noisy ydata

yn = y + 0.2 * y * np.random.normal(size=len(x))

# generate error on ydata

y_sigma = 0.2 * y * np.random.normal(size=len(x))

popt, pcov = curve_fit(func, x, yn, sigma = y_sigma)

# plot

import matplotlib.pyplot as plt

fig = plt.figure()

ax = fig.add_subplot(111)

ax.errorbar(x, yn, yerr = y_sigma, fmt = 'o')

ax.plot(x, np.polyval(popt, x), '-')

ax.text(0.5, 100, r"a = {0:.3f} +/- {1:.3f}".format(popt[0], pcov[0,0]**0.5))

ax.text(0.5, 90, r"b = {0:.3f} +/- {1:.3f}".format(popt[1], pcov[1,1]**0.5))

ax.text(0.5, 80, r"c = {0:.3f} +/- {1:.3f}".format(popt[2], pcov[2,2]**0.5))

ax.grid()

plt.show()

Then something else, about using numpy arrays. One of the main advantages of using numpy is that you can avoid for loops because operations on arrays apply elementwise. So the for-loop in your example can also be done as following:

trendx = arange(datasetx[0], (datasetx[-1]+1))

trendy = trend[0]*trendx**2 + trend[1]*trendx + trend[2]

Where I use arange instead of range as it returns a numpy array instead of a list.

In this case you can also use the numpy function polyval:

trendy = polyval(trend, trendx)

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值