I have the following data for x and y:
x y
1.71 0.0
1.76 5.0
1.81 10.0
1.86 15.0
1.93 20.0
2.01 25.0
2.09 30.0
2.20 35.0
2.32 40.0
2.47 45.0
2.65 50.0
2.87 55.0
3.16 60.0
3.53 65.0
4.02 70.0
4.69 75.0
5.64 80.0
7.07 85.0
9.35 90.0
13.34 95.0
21.43 100.0
For the above data, I am trying to fit the data in the form:

y = a / x^n + b
However, there are certain uncertainties associated with x and y: x has an uncertainty of 50% of x, and y has a fixed uncertainty. I am trying to determine the uncertainty in the fit parameters with the uncertainties package, but I am having issues with curve fitting using scipy.optimize's curve_fit function. I get the following error:
minpack.error: Result from function call is not a proper array of
floats.
How do I fix this error and determine the uncertainty of the fit parameters (a, b and n)?
MWE
from __future__ import division
import numpy as np
import re
from scipy import optimize, interpolate, spatial
from scipy.interpolate import UnivariateSpline
from uncertainties import unumpy
def linear_fit(x, a, b):
    return a * x + b
uncertainty = 0.5
y_error = 1.2
x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65, 2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])
x_uncertainty = x * uncertainty
x = unumpy.uarray(x, x_uncertainty)
y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])
y = unumpy.uarray(y, y_error)
n = np.arange(0, 5, 0.005)
coefficient_determination_on = np.empty(shape = (len(n),))
for j in range(len(n)):
    n_correlation = n[j]
    x_fit = 1 / ((x) ** n_correlation)
    y_fit = y
    fit_a_raw, fit_b_raw = optimize.curve_fit(linear_fit, x_fit, y_fit)[0]
    x_prediction = (fit_a_raw / ((x) ** n_correlation)) + fit_b_raw
    y_residual_squares = np.sum((x_prediction - y) ** 2)
    y_total_squares = np.sum((y - np.mean(y)) ** 2)
    coefficient_determination_on[j] = 1 - (y_residual_squares / y_total_squares)
Solution
Let me first preface this by saying that this problem is impossible to solve "nicely" if you want to solve for a, b and n at once. This is because for a fixed n your problem admits a closed-form solution, while if you let n be free it does not, and in fact the problem may have multiple solutions. Hence classical error analysis (such as that used by uncertainties) breaks down, and you have to resort to other methods.
The case n fixed
If n is fixed, your problem is that the libraries you call do not support uarray, so you have to make a workaround. Thankfully, linear fitting (under the l2-distance) is simply linear least squares, which admits a closed-form solution: pad the values with ones and solve the normal equations

(a, b)ᵀ = (XᵀX)⁻¹ Xᵀ y

where X is the design matrix whose rows are (x_i^n, 1).
You could do it like this:
import numpy as np
from uncertainties import unumpy
uncertainty = 0.5
y_error = 1.2
n = 1.0
# Define x and y
x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65,
2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])
# Take power of x values according to n
x_pow = x ** n
x_uncertainty = x_pow * uncertainty
x_fit = unumpy.uarray(np.c_[x_pow, np.ones_like(x)],
np.c_[x_uncertainty, np.zeros_like(x_uncertainty)])
y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0,
55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])
y_fit = unumpy.uarray(y, y_error)
# Use normal equations to find coefficients
inv_mat = unumpy.ulinalg.pinv(x_fit.T.dot(x_fit))
fit_a, fit_b = inv_mat.dot(x_fit.T.dot(y_fit))
print('fit_a={}, fit_b={}'.format(fit_a, fit_b))
Result:
fit_a=4.8+/-2.6, fit_b=28+/-10
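If you want a cross-check on the linear error propagation above without relying on uarray at all, a minimal Monte Carlo sketch is to resample the data within its stated uncertainties and refit each draw with plain least squares. This assumes Gaussian errors and the same fixed n = 1.0; the spread will not match the propagated values exactly (the x errors enter nonlinearly), but it should be the same order of magnitude:

```python
import numpy as np

rng = np.random.default_rng(0)

uncertainty = 0.5   # 50% relative error on x
y_error = 1.2       # fixed absolute error on y
n = 1.0             # fixed exponent, as above

x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65,
              2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])
y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0,
              55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])

n_samples = 5000
params = np.empty((n_samples, 2))
for i in range(n_samples):
    # Perturb each data set within its (assumed Gaussian) uncertainties
    x_s = x + rng.normal(0.0, uncertainty * x)
    y_s = y + rng.normal(0.0, y_error, size=y.shape)
    # Ordinary least squares on the padded design matrix [x^n | 1]
    X = np.c_[x_s ** n, np.ones_like(x_s)]
    params[i] = np.linalg.lstsq(X, y_s, rcond=None)[0]

a_mean, b_mean = params.mean(axis=0)
a_std, b_std = params.std(axis=0)
print('fit_a={:.1f}+/-{:.1f}, fit_b={:.0f}+/-{:.0f}'.format(a_mean, a_std, b_mean, b_std))
```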
The case n unknown
With n unknown, you really are in some trouble since the problem is non-convex. Here, linear error analysis (as performed by uncertainties) will not work.
One solution is to perform Bayesian inference, using some package like pymc. If you are interested in this, I could try to make a writeup, but it would not be as clean as above.
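Short of full Bayesian inference, the Monte Carlo idea also extends to the free-n case: resample the data within its uncertainties and refit all three parameters with curve_fit on each draw, then look at the spread. This is only a sketch, not a posterior: it assumes Gaussian errors, clips resampled x to stay positive (50% relative error makes negative draws possible), uses the answer's x**n convention for the model, and the starting guess p0 below is my own rough estimate from the data, not something given in the question:

```python
import numpy as np
from scipy import optimize

def model(x, a, b, n):
    return a * x ** n + b

x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65,
              2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])
y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0,
              55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])

rng = np.random.default_rng(1)
fits = []
for _ in range(1000):
    # Clip perturbed x to stay positive (a crude truncation of the Gaussian)
    x_s = np.clip(x + rng.normal(0.0, 0.5 * x), 1e-6, None)
    y_s = y + rng.normal(0.0, 1.2, size=y.shape)
    try:
        # p0 is a rough guess: a*sqrt(x)+b roughly spans the data range
        popt, _ = optimize.curve_fit(model, x_s, y_s,
                                     p0=(30.0, -40.0, 0.5), maxfev=10000)
        fits.append(popt)
    except (RuntimeError, ValueError):
        continue  # skip draws where the fit fails to converge
fits = np.array(fits)

a_std, b_std, n_std = fits.std(axis=0)
print('spread: a+/-{:.1f}, b+/-{:.1f}, n+/-{:.2f}'.format(a_std, b_std, n_std))
```

Note that the distribution of the fitted n can be strongly non-Gaussian (and multi-modal, per the preface above), so report the spread with care; a full Bayesian treatment would make this explicit.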