numpy np.polyfit() (least-squares polynomial curve fitting) (needs further study)

Dontla

Article link: https://blog.csdn.net/Dontla/article/details/104477531

from numpy\lib\polynomial.py

@array_function_dispatch(_polyfit_dispatcher)
def polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False):
    """
    Least squares polynomial fit.

    Fit a polynomial ``p(x) = p[0] * x**deg + ... + p[deg]`` of degree `deg` to points `(x, y)`. Returns a vector of coefficients `p` that minimises the squared error in the order `deg`, `deg-1`, ... `0`.

    The `Polynomial.fit <numpy.polynomial.polynomial.Polynomial.fit>` class method is recommended for new code as it is more stable numerically. See the documentation of the method for more information.

    Parameters
    ----------
    x : array_like, shape (M,)
        x-coordinates of the M sample points ``(x[i], y[i])``.
    y : array_like, shape (M,) or (M, K)
        y-coordinates of the sample points. Several data sets of sample points sharing the same x-coordinates can be fitted at once by passing in a 2D-array that contains one dataset per column.
    deg : int
        Degree of the fitting polynomial
    rcond : float, optional
        Relative condition number of the fit. Singular values smaller than this relative to the largest singular value will be ignored. The default value is len(x)*eps, where eps is the relative precision of the float type, about 2e-16 in most cases.
    full : bool, optional
        Switch determining nature of return value. When it is False (the default) just the coefficients are returned, when True diagnostic information from the singular value decomposition is also returned.
    w : array_like, shape (M,), optional
        Weights to apply to the y-coordinates of the sample points. For gaussian uncertainties, use 1/sigma (not 1/sigma**2).
    cov : bool or str, optional
        If given and not `False`, return not just the estimate but also its covariance matrix. By default, the covariance is scaled by chi2/dof, where dof = M - (deg + 1), i.e., the weights are presumed to be unreliable except in a relative sense and everything is scaled such that the reduced chi2 is unity. This scaling is omitted if ``cov='unscaled'``, as is relevant for the case that the weights are w = 1/sigma, with sigma known to be a reliable estimate of the uncertainty.

    Returns
    -------
    p : ndarray, shape (deg + 1,) or (deg + 1, K)
        Polynomial coefficients, highest power first.  If `y` was 2-D, the
        coefficients for `k`-th data set are in ``p[:,k]``.

    residuals, rank, singular_values, rcond
        Present only if `full` = True.  Residuals is sum of squared residuals
        of the least-squares fit, the effective rank of the scaled Vandermonde
        coefficient matrix, its singular values, and the specified value of
        `rcond`. For more details, see `linalg.lstsq`.

    V : ndarray, shape (deg + 1, deg + 1) or (deg + 1, deg + 1, K)
        Present only if `full` = False and `cov` = True.  The covariance
        matrix of the polynomial coefficient estimates.  The diagonal of
        this matrix holds the variance estimates for each coefficient.  If y
        is a 2-D array, then the covariance matrix for the `k`-th data set
        is in ``V[:,:,k]``


    Warns
    -----
    RankWarning
        The rank of the coefficient matrix in the least-squares fit is
        deficient. The warning is only raised if `full` = False.

        The warnings can be turned off by

        >>> import warnings
        >>> warnings.simplefilter('ignore', np.RankWarning)

    See Also
    --------
    polyval : Compute polynomial values.
    linalg.lstsq : Computes a least-squares fit.
    scipy.interpolate.UnivariateSpline : Computes spline fits.

    Notes
    -----
    The solution minimizes the squared error

    .. math ::
        E = \\sum_{j=0}^k |p(x_j) - y_j|^2

    in the equations::

        x[0]**n * p[0] + ... + x[0] * p[n-1] + p[n] = y[0]
        x[1]**n * p[0] + ... + x[1] * p[n-1] + p[n] = y[1]
        ...
        x[k]**n * p[0] + ... + x[k] * p[n-1] + p[n] = y[k]

    The coefficient matrix of the coefficients `p` is a Vandermonde matrix.

    `polyfit` issues a `RankWarning` when the least-squares fit is badly conditioned. This implies that the best fit is not well-defined due to numerical error. The results may be improved by lowering the polynomial degree or by replacing `x` by `x` - `x`.mean(). The `rcond` parameter can also be set to a value smaller than its default, but the resulting fit may be spurious: including contributions from the small singular values can add numerical noise to the result.

    Note that fitting polynomial coefficients is inherently badly conditioned when the degree of the polynomial is large or the interval of sample points is badly centered. The quality of the fit should always be checked in these cases. When polynomial fits are not satisfactory, splines may be a good alternative.

    References
    ----------
    .. [1] Wikipedia, "Curve fitting",
           https://en.wikipedia.org/wiki/Curve_fitting
    .. [2] Wikipedia, "Polynomial interpolation",
           https://en.wikipedia.org/wiki/Polynomial_interpolation

    Examples
    --------
    >>> import warnings
    >>> x = np.array([0.0, 1.0, 2.0, 3.0,  4.0,  5.0])
    >>> y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0])
    >>> z = np.polyfit(x, y, 3)
    >>> z
    array([ 0.08703704, -0.81349206,  1.69312169, -0.03968254]) # may vary

    It is convenient to use `poly1d` objects for dealing with polynomials:

    >>> p = np.poly1d(z)
    >>> p(0.5)
    0.6143849206349179 # may vary
    >>> p(3.5)
    -0.34732142857143039 # may vary
    >>> p(10)
    22.579365079365115 # may vary

    High-order polynomials may oscillate wildly:

    >>> with warnings.catch_warnings():
    ...     warnings.simplefilter('ignore', np.RankWarning)
    ...     p30 = np.poly1d(np.polyfit(x, y, 30))
    ...
    >>> p30(4)
    -0.80000000000000204 # may vary
    >>> p30(5)
    -0.99999999999999445 # may vary
    >>> p30(4.5)
    -0.10547061179440398 # may vary

    Illustration:

    >>> import matplotlib.pyplot as plt
    >>> xp = np.linspace(-2, 6, 100)
    >>> _ = plt.plot(x, y, '.', xp, p(xp), '-', xp, p30(xp), '--')
    >>> plt.ylim(-2,2)
    (-2, 2)
    >>> plt.show()

    """

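The `w`, `cov`, and `V` entries in the docstring above are easier to follow with a concrete call. The following is a minimal sketch, assuming synthetic data (a noisy straight line with a made-up, known noise level `sigma`); the variable names are illustrative and not part of the original post.

```python
import numpy as np

# Assumption: synthetic samples of y = 2x + 1 with Gaussian noise of known sigma.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
sigma = 0.5
y = 2.0 * x + 1.0 + rng.normal(scale=sigma, size=x.size)

# Weights are 1/sigma (not 1/sigma**2), as the docstring advises.
w = np.full(x.size, 1.0 / sigma)

# cov=True returns the coefficients together with their covariance matrix.
coef, V = np.polyfit(x, y, 1, w=w, cov=True)

print(coef)                 # slope and intercept, highest power first
print(V.shape)              # (deg + 1, deg + 1), here (2, 2)
print(np.sqrt(np.diag(V)))  # standard errors of slope and intercept
```

Passing `cov='unscaled'` instead skips the rescaling step, which is appropriate only when `sigma` is trusted in absolute terms.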
Example

# -*- coding: utf-8 -*-
"""
@File    : plot.py
@Time    : 2020/2/24 8:55
@Author  : Dontla
@Email   : sxana@qq.com
@Software: PyCharm
"""

import matplotlib.pyplot as plt

import numpy as np


# If the formatting is off, fix it with a bulk find-and-replace in Notepad or Notepad++
keyword = {'11:30.0': (50000, 13.96), '12:16.0': (54500, 13.20), '13:15.0': (47500, 12.48),
           '14:22.0': (55450, 12.44), '14:35.0': (55430, 13.72), '17:03.0': (13990, 11.00),
           '17:38.0': (9058, 11.60), '17:57.0': (5044, 12.46), '18:20.0': (1300, 13.80),
           '18:25.0': (900, 13.90), '18:28.0': (700, 13.96), '18:40.0': (200, 13.34),
           '18:42.0': (150, 13.10), '18:44.0': (100, 11.80), '18:44.2': (90, 11.34),
           '18:44.4': (80, 11.38), '18:44.8': (70, 9.50), '18:45.0': (60, 9.20),
           '18:46.0': (50, 11.9), '18:46.3': (40, 10.8), '18:46.6': (30, 9.20),
           '18:49.0': (20, 9.70), '18:49.6': (15, 6.90), '18:50.3': (13, 4.70),
           '18:50.9': (12, 3.80), '18:51.5': (11, 2.60), '18:52.2': (10, 1.70),
           '18:52.9': (9, 1.00), '18:53.6': (8, 0.2), '18:54.3': (7, 0.06),
           '18:55.0': (6, 0.02)}

# Collect the (x, y) pairs into an (N, 2) array and split the columns
data = np.array(list(keyword.values()))

x = data[:, 0]
y = data[:, 1]

# Fit a degree-10 polynomial (change the third argument to try another degree);
# polyfit returns the coefficients, highest power first
z1 = np.polyfit(x, y, 10)
p1 = np.poly1d(z1)

# Print the fitted polynomial to the console
print(p1)

yvals = p1(x)  # equivalently: yvals = np.polyval(z1, x)

plot1 = plt.plot(x, y, '*', label='original values')
plot2 = plt.plot(x, yvals, 'r', label='polyfit values')

plt.xlabel('xaxis')

plt.ylabel('yaxis')

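plt.legend(loc=4)  # legend position (4 = lower right); see help(plt.legend) for the options

plt.title('polyfitting')

# Save before show(): once the window is closed the saved figure may be empty
plt.savefig('p1.png')

plt.show()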

Output:

D:\Yolov3_Tensorflow\python\python.exe C:/Users/HuaWei/Desktop/绘制不同光照条件下识别率曲线图/plot.py
            10             9             8             7             6
-3.045e-40 x  + 7.957e-35 x - 8.616e-30 x + 4.993e-25 x - 1.672e-20 x
              5             4             3             2
 + 3.273e-16 x - 3.641e-12 x + 2.149e-08 x - 5.822e-05 x + 0.05093 x + 4.692

Process finished with exit code 0
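
Coefficients as small as 1e-40 next to x values in the tens of thousands suggest the fit may be poorly conditioned. One way to check, sketched below, is to call `np.polyfit` with `full=True` and compare the reported rank with `deg + 1`. The x values here are a hypothetical stand-in spanning a similar range, and `y` is a made-up response; neither is the data from this post.

```python
import numpy as np

# Assumption: illustrative x values spanning several orders of magnitude,
# with a made-up response; this is not the data from the post.
x = np.array([6.0, 10.0, 20.0, 50.0, 100.0, 500.0,
              1000.0, 5000.0, 10000.0, 50000.0, 55000.0, 60000.0])
y = np.log10(x)

deg = 8
# full=True returns diagnostics from the SVD instead of issuing a RankWarning.
coef, residuals, rank, singular_values, rcond = np.polyfit(x, y, deg, full=True)

print("rank:", rank, "of", deg + 1)
print("singular value ratio:", singular_values.min() / singular_values.max())
if rank < deg + 1:
    print("rank deficient: lower the degree or rescale x")
```

A rank below `deg + 1`, or a singular value ratio close to `rcond`, is a hint that some coefficients are not well determined by the data.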

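[Image: the resulting plot]
Why does this plot look so strange?

Oh, it turns out the spacing between the x values was too large: the fitted curve was only evaluated at the sample points themselves, which are unsorted and far apart at the high end, so the straight segments drawn between them hide the real shape of the polynomial.
[Image]
Let's fix it: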

# -*- coding: utf-8 -*-
"""
@File    : plot.py
@Time    : 2020/2/24 8:55
@Author  : Dontla
@Email   : sxana@qq.com
@Software: PyCharm
"""

import matplotlib.pyplot as plt

import numpy as np

# If the formatting is off, fix it with a bulk find-and-replace in Notepad or Notepad++
keyword = {'11:30.0': (50000, 13.96), '12:16.0': (54500, 13.20), '13:15.0': (47500, 12.48),
           '14:22.0': (55450, 12.44), '14:35.0': (55430, 13.72), '17:03.0': (13990, 11.00),
           '17:38.0': (9058, 11.60), '17:57.0': (5044, 12.46), '18:20.0': (1300, 13.80),
           '18:25.0': (900, 13.90), '18:28.0': (700, 13.96), '18:40.0': (200, 13.34),
           '18:42.0': (150, 13.10), '18:44.0': (100, 11.80), '18:44.2': (90, 11.34),
           '18:44.4': (80, 11.38), '18:44.8': (70, 9.50), '18:45.0': (60, 9.20),
           '18:46.0': (50, 11.9), '18:46.3': (40, 10.8), '18:46.6': (30, 9.20),
           '18:49.0': (20, 9.70), '18:49.6': (15, 6.90), '18:50.3': (13, 4.70),
           '18:50.9': (12, 3.80), '18:51.5': (11, 2.60), '18:52.2': (10, 1.70),
           '18:52.9': (9, 1.00), '18:53.6': (8, 0.2), '18:54.3': (7, 0.06),
           '18:55.0': (6, 0.02)}

data = []

for key in keyword:
    data.append(keyword[key])

data = np.array(data)
# print(data)
# [[5.000e+04 1.396e+01]
#  [5.450e+04 1.320e+01]
#  [4.750e+04 1.248e+01]
#  [5.545e+04 1.244e+01]
#  [5.543e+04 1.372e+01]
#  [1.399e+04 1.100e+01]
#  [9.058e+03 1.160e+01]
#  [5.044e+03 1.246e+01]
#  [1.300e+03 1.380e+01]
#  [9.000e+02 1.390e+01]
#  [7.000e+02 1.396e+01]
#  [2.000e+02 1.334e+01]
#  [1.500e+02 1.310e+01]
#  [1.000e+02 1.180e+01]
#  [9.000e+01 1.134e+01]
#  [8.000e+01 1.138e+01]
#  [7.000e+01 9.500e+00]
#  [6.000e+01 9.200e+00]
#  [5.000e+01 1.190e+01]
#  [4.000e+01 1.080e+01]
#  [3.000e+01 9.200e+00]
#  [2.000e+01 9.700e+00]
#  [1.500e+01 6.900e+00]
#  [1.300e+01 4.700e+00]
#  [1.200e+01 3.800e+00]
#  [1.100e+01 2.600e+00]
#  [1.000e+01 1.700e+00]
#  [9.000e+00 1.000e+00]
#  [8.000e+00 2.000e-01]
#  [7.000e+00 6.000e-02]
#  [6.000e+00 2.000e-02]]


x = data[:, 0]
# print(x)
# [5.000e+04 5.450e+04 4.750e+04 5.545e+04 5.543e+04 1.399e+04 9.058e+03
#  5.044e+03 1.300e+03 9.000e+02 7.000e+02 2.000e+02 1.500e+02 1.000e+02
#  9.000e+01 8.000e+01 7.000e+01 6.000e+01 5.000e+01 4.000e+01 3.000e+01
#  2.000e+01 1.500e+01 1.300e+01 1.200e+01 1.100e+01 1.000e+01 9.000e+00
#  8.000e+00 7.000e+00 6.000e+00]
y = data[:, 1]
# print(y)
# [13.96 13.2  12.48 12.44 13.72 11.   11.6  12.46 13.8  13.9  13.96 13.34
#  13.1  11.8  11.34 11.38  9.5   9.2  11.9  10.8   9.2   9.7   6.9   4.7
#   3.8   2.6   1.7   1.    0.2   0.06  0.02]

ind = np.lexsort((x,))
# print(ind)
# [30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7
#   6  5  2  0  1  4  3]

data_sort = [(x[i], y[i]) for i in ind]

# print(data_sort)
# [(6.0, 0.02), (7.0, 0.06), (8.0, 0.2), (9.0, 1.0), (10.0, 1.7), (11.0, 2.6), (12.0, 3.8), (13.0, 4.7), (15.0, 6.9), (20.0, 9.7), (30.0, 9.2), (40.0, 10.8), (50.0, 11.9), (60.0, 9.2), (70.0, 9.5), (80.0, 11.38), (90.0, 11.34), (100.0, 11.8), (150.0, 13.1), (200.0, 13.34), (700.0, 13.96), (900.0, 13.9), (1300.0, 13.8), (5044.0, 12.46), (9058.0, 11.6), (13990.0, 11.0), (47500.0, 12.48), (50000.0, 13.96), (54500.0, 13.2), (55430.0, 13.72), (55450.0, 12.44)]

x_sort, y_sort = np.array(data_sort)[:, 0], np.array(data_sort)[:, 1]

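# Fit a degree-5 polynomial (change the third argument to try another degree);
# polyfit returns the coefficients, highest power first
z1 = np.polyfit(x_sort, y_sort, 5)
p1 = np.poly1d(z1)

# Print the fitted polynomial to the console
print(p1)

# Dense, evenly spaced x grid for drawing the fitted curve
x_lin = np.arange(0, 60000, 5)

yvals = p1(x_lin)  # equivalently: yvals = np.polyval(z1, x_lin)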

plot1 = plt.plot(x_sort, y_sort, '*', label='original values')
plot2 = plt.plot(x_lin, yvals, 'r', label='polyfit values')

# Limit the y-axis range
plt.ylim(0, 16)

plt.xlabel('Illumination/lm')

plt.ylabel('Detect num/pcs')

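plt.legend(loc=4)  # legend position (4 = lower right); see help(plt.legend) for the options

plt.title('polyfitting')

# Save before show(): once the window is closed the saved figure may be empty
plt.savefig('p1.png')

plt.show()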
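
The docstring quoted earlier recommends `Polynomial.fit` for new code. A minimal sketch of that route follows, assuming illustrative x values over a similar range and a made-up response (not the measurements from this post). `Polynomial.fit` maps x onto the window [-1, 1] before fitting, which tends to help when x spans a wide range.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Assumption: illustrative data, not the measurements from the post.
x = np.array([6.0, 10.0, 20.0, 50.0, 100.0, 500.0,
              1000.0, 5000.0, 10000.0, 50000.0])
y = np.log10(x)

# Fit in the scaled domain; p remembers the mapping from x to [-1, 1].
p = Polynomial.fit(x, y, deg=3)
print(p.domain)  # the x range the fit was mapped from
print(p.window)  # the range it was mapped to

# Evaluate on a dense grid for plotting, as in the script above.
x_lin = np.linspace(x.min(), x.max(), 200)
yvals = p(x_lin)

# Coefficients in terms of the original x, lowest power first
# (the reverse of np.polyfit's ordering).
print(p.convert().coef)
```

The returned object can be called directly on the plotting grid, so it can stand in for `np.poly1d` in the script above; only the coefficient ordering differs.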