The least squares method fits data by minimizing the sum of squared differences between the fitted values and the sample values.

Linear fitting
Suppose the samples are $\{x_n\}=x_1,x_2...x_n$ and $\{y_n\}=y_1,y_2...y_n$, and the fitted equation is $y=ax+b$. The difference between a fitted value and a sample value is the error, and the sum of squared errors measures the total error:
$$J(a,b)=\sum_{i=1}^{n}{(y_i-ax_i-b)^2}$$
The error function has an extremum where its derivatives vanish, so we take its partial derivatives and set them to zero:
$$\begin{aligned} \frac{\partial J}{\partial a}&=\sum_{i=1}^{n}-2x_i{(y_i-ax_i-b)}&=0\\ \frac{\partial J}{\partial b}&=\sum_{i=1}^{n}-2{(y_i-ax_i-b)}&=0 \end{aligned}$$

which gives
$$\begin{aligned} \sum_{i=1}^{n}x_i{(y_i-ax_i-b)}&=0\\ \sum_{i=1}^{n}{(y_i-ax_i-b)}&=0 \end{aligned}$$
Adopting the notation $S(xy)=\sum{x_iy_i}$, $S(x^2)=\sum{x_i^2}$, $S(x)=\sum{x_i}$, $E(x)=\frac{1}{n}\sum{x_i}$, $E(y)=\frac{1}{n}\sum{y_i}$, the system above becomes
$$\begin{aligned} S(xy)-aS(x^2)-bS(x)&=0\\ E(y)-aE(x)-b&=0 \end{aligned}$$
It follows that

$$\begin{aligned} a &= \frac{S(xy)-E(y)S(x)}{S(x^2)-E(x)S(x)}\\ b &= \frac{E(y)S(x^2)-S(xy)E(x)}{S(x^2)-E(x)S(x)} \end{aligned}$$
These expressions are quite simple.
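As a quick sanity check of the formulas (a minimal sketch, assuming only NumPy), take three exact points on the line $y=2x+1$ and evaluate $a$ and $b$ by hand:

```python
import numpy as np

# three exact points on y = 2x + 1
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

sxy = np.sum(x * y)   # S(xy) = 13
sx2 = np.sum(x**2)    # S(x^2) = 5
sx  = np.sum(x)       # S(x)  = 3
ex  = np.mean(x)      # E(x)  = 1
ey  = np.mean(y)      # E(y)  = 3

a = (sxy - ey * sx) / (sx2 - ex * sx)        # (13 - 9) / (5 - 3) -> 2.0
b = (ey * sx2 - sxy * ex) / (sx2 - ex * sx)  # (15 - 13) / (5 - 3) -> 1.0
```

As expected, the original slope and intercept are recovered exactly.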
In some cases the independent variable is chosen to be a natural (consecutive integer) sequence, and the sums over it can then be evaluated with elementary identities. For a natural sequence with $x\in [m,n]$,

$$\begin{aligned} S(x) &= \frac{(m+n)(n-m+1)}{2}\\ S(x^2)&= \frac{n(n+1)(2n+1)-m(m-1)(2m-1)}{6} \end{aligned}$$

The code is:
# file: core.py
import numpy as np

def leastSquare(x, y):
    if len(x) == 2:
        # x is given as [m, n]: the samples are the natural sequence m, m+1, ..., n
        sx = 0.5 * (x[1] - x[0] + 1) * (x[1] + x[0])
        ex = sx / (x[1] - x[0] + 1)
        sx2 = ((x[1] * (x[1] + 1) * (2 * x[1] + 1))
               - (x[0] * (x[0] - 1) * (2 * x[0] - 1))) / 6
        x = np.arange(x[0], x[1] + 1)
    else:
        sx = sum(x)
        ex = sx / len(x)
        sx2 = sum(x**2)
    sxy = sum(x * y)
    ey = np.mean(y)
    a = (sxy - ey * sx) / (sx2 - ex * sx)
    b = (ey * sx2 - sxy * ex) / (sx2 - ex * sx)
    return a, b
Testing it:

>>> import core
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> x = np.arange(25)
>>> y = x*15+20+np.random.randn(len(x))*5   # randn adds Gaussian noise
>>> a,b = core.leastSquare(x,y)
>>> plt.scatter(x,y)                        # scatter plot of the raw data
<matplotlib.collections.PathCollection object at 0x00000218DEBBEDC8>
>>> plt.plot(x,a*x+b)                       # fitted line
[<matplotlib.lines.Line2D object at 0x00000218E0314FC8>]
>>> plt.show()
The resulting figure shows the noisy sample points scattered around the fitted line.
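As a cross-check (an aside, not part of the original code), NumPy's own `np.polyfit` solves the same problem; on noise-free data the closed-form coefficients and polyfit's agree:

```python
import numpy as np

x = np.arange(25.0)
y = x * 15 + 20   # noise-free line, so the comparison is deterministic

# closed-form least squares, same formulas as in the derivation above
sx, sx2, sxy = np.sum(x), np.sum(x**2), np.sum(x * y)
ex, ey = np.mean(x), np.mean(y)
a = (sxy - ey * sx) / (sx2 - ex * sx)        # ≈ 15.0
b = (ey * sx2 - sxy * ex) / (sx2 - ex * sx)  # ≈ 20.0

slope, intercept = np.polyfit(x, y, 1)  # degree-1 polynomial fit
```

Both routes recover slope 15 and intercept 20 to floating-point precision.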
Higher-order polynomials
For higher-order polynomial fitting, the idea is exactly the same as for linear fitting. For samples $\{x_n\}=x_1,x_2...x_n$, $\{y_n\}=y_1,y_2...y_n$, suppose the fitted equation is
$$y=\sum_{j=0}^{m}{a_jx^j}$$
Then the error function can be written as
$$J(a_j)=\sum^n_{i=0}{(y_i-\sum_{j=0}^{m}{a_jx_i^j})^2}$$

and the partial derivative with respect to each parameter is
$$\frac{\partial J}{\partial a_k}=\sum^n_{i=0}{ -2\cdot x_i^k(y_i-\sum_{j=0}^{m}{a_jx_i^j})}=0$$

i.e.
$$\sum^n_{i=0}{ x_i^{k}y_i}-\sum^n_{i=0}{\sum_{j=0}^{m}{a_j x_i^{j+k}}}=0$$
As before, adopt the notation
$$S_k=\sum^n_{i=0}{ x_i^{k}y_i},\qquad S_{kj}=\sum^n_{i=0}{x_i^{j+k}}$$
Then for each $k$ the equation above becomes
$$S_k-\sum_{j=0}^{m}a_jS_{kj}=0$$
In matrix form:
$$\left[\begin{matrix} S_{00}&S_{01}&...&S_{0m}\\ S_{10}&S_{11}&...&S_{1m}\\ ...&...&...&...\\ S_{m0}&S_{m1}&...&S_{mm} \end{matrix}\right]\cdot \left[\begin{matrix}a_0\\a_1\\...\\a_m \end{matrix}\right] =\left[\begin{matrix}S_0\\S_1\\...\\S_m \end{matrix}\right]$$
The code is as follows:
# x and y are np.array; n is the polynomial order
def leastSquareMulti(x, y, n):
    X = [np.sum(x**i) for i in range(2*n+1)]             # S(x^0), S(x^1), ..., S(x^{2n})
    Y = np.array([[np.sum(y*x**i)] for i in range(n+1)]) # right-hand side S_k
    S = np.array([X[i:i+n+1] for i in range(n+1)])       # coefficient matrix S_{kj}
    return np.linalg.solve(S, Y)
Testing it:

>>> x = np.arange(25)
>>> y = x**3+3*x**2+2*x+12
>>> import core
>>> core.leastSquareMulti(x,y,3)
array([[12.],   # constant term
       [ 2.],
       [ 3.],
       [ 1.]])
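For comparison (again an aside), `np.polyfit` handles the same polynomial fit, but note that it returns the coefficients highest degree first, i.e. in the reverse of leastSquareMulti's ordering:

```python
import numpy as np

x = np.arange(25.0)
y = x**3 + 3*x**2 + 2*x + 12

coef = np.polyfit(x, y, 3)   # highest-degree coefficient first
# coef ≈ [1., 3., 2., 12.]
```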
Multiple independent variables
For samples

$$\begin{aligned} \{x_{1n}\}&=x_{11},x_{12}&...&x_{1n},\\ \{x_{2n}\}&=x_{21},x_{22}&...&x_{2n},\\ &....&...&\\ \{x_{mn}\}&=x_{m1},x_{m2}&...&x_{mn},\\ \{y_n\}&=y_1,y_2&...&y_n \end{aligned}$$
suppose the fitted equation is
$$y=\sum_{j=1}^{m}{a_jx_j}$$
Then the error function can be written as
$$J(a_j)=\sum^n_{i=1}{(y_i-\sum_{j=1}^{m}{a_jx_{ji}})^2}$$

and the partial derivative with respect to each parameter is
$$\frac{\partial J}{\partial a_k}=\sum^n_{i=1}{ -2\cdot x_{ki}(y_i-\sum_{j=1}^{m}{a_jx_{ji}})}=0$$

i.e.
$$\sum^n_{i=1}{ x_{ki}y_i}-\sum^n_{i=1}{\sum_{j=1}^{m}{a_j x_{ji}x_{ki}}}=0$$
Adopting $Y_k=\sum^n_{i=1}{ x_{ki}y_i}$ and $X_{jk}=\sum^n_{i=1}{x_{ji}x_{ki}}$, the matrix form is
$$\left[\begin{matrix} X_{11}&X_{12}&...&X_{1m}\\ X_{21}&X_{22}&...&X_{2m}\\ ...&...&...&...\\ X_{m1}&X_{m2}&...&X_{mm} \end{matrix}\right]\cdot \left[\begin{matrix}a_1\\a_2\\...\\a_m \end{matrix}\right] =\left[\begin{matrix}Y_1\\Y_2\\...\\Y_m \end{matrix}\right]$$
If the fitted equation needs a constant term, simply append to $x$ a row of samples whose values are all 1; the corresponding $a_{m+1}$ is the constant term.
In the implementation, suppose the input independent variables form a matrix $X$, where each row holds the different values of one variable and each column holds one sample point. Then the coefficient matrix on the left-hand side above can be written as $X\cdot X^T$.
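A minimal sketch of this approach (the function name `least_square_mv` and the example data are mine, not from the original):

```python
import numpy as np

def least_square_mv(X, y):
    """Multi-variable least squares.
    Rows of X are variables, columns are sample points.
    Solves (X X^T) a = X y for the coefficient vector a."""
    S = X @ X.T   # coefficient matrix: S[j,k] = sum_i x_ji * x_ki
    Y = X @ y     # right-hand side:   Y[k]   = sum_i x_ki * y_i
    return np.linalg.solve(S, Y)

# example: y = 2*x1 + 3*x2 + 5, with a row of ones for the constant term
n = 50
rng = np.random.default_rng(0)
x1 = rng.random(n)
x2 = rng.random(n)
X = np.vstack([x1, x2, np.ones(n)])
y = 2 * x1 + 3 * x2 + 5
a = least_square_mv(X, y)   # ≈ [2., 3., 5.]
```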
Exponential functions
In general, for a function of the form $y=ae^{bx}$, taking the logarithm of both sides yields the linear form $\ln y=bx+\ln a$, and a simple change of coordinates gives the values of $b$ and $\ln a$.
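This log-transform trick can be sketched in a few lines (the example parameters $a=2$, $b=0.5$ are mine):

```python
import numpy as np

x = np.linspace(0, 5, 50)
y = 2.0 * np.exp(0.5 * x)   # exact data generated with a = 2, b = 0.5

# fit ln y = b*x + ln a as an ordinary linear least squares problem
b, ln_a = np.polyfit(x, np.log(y), 1)
a = np.exp(ln_a)   # a ≈ 2.0, b ≈ 0.5
```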
However, this fails for functions of the form $y=a_1e^{b_1x}+a_2e^{b_2x}$.
In that case, if $x$ is a natural sequence, or more generally equally spaced, we can still build a linear relation from the expression above. Let the spacing of $x$ be $\delta$ and define
$$\begin{aligned} y_0&=a_1e^{b_1x}+a_2e^{b_2x}=y\\ y_1&=a_1e^{b_1(x+\delta)}+a_2e^{b_2(x+\delta)}\\ y_2&=a_1e^{b_1(x+2\delta)}+a_2e^{b_2(x+2\delta)} \end{aligned}$$
These satisfy $y_2=y_1\cdot(e^{b_1\delta}+e^{b_2\delta})-y_0\cdot e^{b_1\delta}\cdot e^{b_2\delta}$, so least squares can recover $(e^{b_1\delta}+e^{b_2\delta})$ and $e^{b_1\delta}\cdot e^{b_2\delta}$.
Concretely, fitting $y_2/y_0$ against $y_1/y_0$ yields a slope $B$ and an intercept $C$, and the roots of the quadratic $t^2-B\cdot t-C=0$ are $e^{b_1\delta}$ and $e^{b_2\delta}$. Since $\delta$ is a fixed value, this gives the values of $b_1$ and $b_2$.
At this point there are two possible routes. One is to convert the problem into a multi-variable least squares fit with $x_1=e^{b_1x},x_2=e^{b_2x}$. The other is to set $X = e^{(b_1-b_2)x}$ and $Y=\frac{y}{e^{b_2x}}$, so that the fitted equation becomes $Y=a_1X+a_2$.
Taking $\delta$ to be 1, the code is:
def expFit(x, y):
    # consecutive triples (y_i, y_{i+1}, y_{i+2})
    y0 = y[:-2]
    y1 = y[1:-1]
    y2 = y[2:]
    # fit y2/y0 = B*(y1/y0) + C
    B, C = leastSquare(y1/y0, y2/y0)
    # e^{b1} and e^{b2} are the roots of t^2 - B*t - C = 0
    b1 = np.log((B - np.sqrt(B**2 + 4*C)) / 2)
    b2 = np.log((B + np.sqrt(B**2 + 4*C)) / 2)
    # fit Y = a1*X + a2 with X = e^{(b1-b2)x}, Y = y/e^{b2*x}
    X = np.exp((b1 - b2) * x)
    Y = y / np.exp(b2 * x)
    a1, a2 = leastSquare(X, Y)
    return a1, a2, b1, b2
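Putting the pieces together, here is a self-contained end-to-end check of the approach (with condensed re-implementations of the two helpers; note that since $b_1$ is taken from the smaller root, the returned pair $(a_1,b_1)$ corresponds to the smaller exponent):

```python
import numpy as np

def least_square(x, y):
    # plain linear least squares for y = a*x + b (no natural-sequence shortcut)
    ex, ey = np.mean(x), np.mean(y)
    sx, sx2, sxy = np.sum(x), np.sum(x**2), np.sum(x * y)
    a = (sxy - ey * sx) / (sx2 - ex * sx)
    b = (ey * sx2 - sxy * ex) / (sx2 - ex * sx)
    return a, b

def exp_fit(x, y):
    # fit y = a1*exp(b1*x) + a2*exp(b2*x) for unit-spaced x
    y0, y1, y2 = y[:-2], y[1:-1], y[2:]
    B, C = least_square(y1 / y0, y2 / y0)        # y2/y0 = B*(y1/y0) + C
    b1 = np.log((B - np.sqrt(B**2 + 4*C)) / 2)   # smaller root of t^2 - B*t - C = 0
    b2 = np.log((B + np.sqrt(B**2 + 4*C)) / 2)   # larger root
    X = np.exp((b1 - b2) * x)
    Y = y / np.exp(b2 * x)                       # Y = a1*X + a2
    a1, a2 = least_square(X, Y)
    return a1, a2, b1, b2

x = np.arange(20.0)
y = 5.0 * np.exp(0.1 * x) + 2.0 * np.exp(0.3 * x)
a1, a2, b1, b2 = exp_fit(x, y)   # ≈ (5.0, 2.0, 0.1, 0.3)
```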