现实世界中存在大量的非线性模型,并且我们常常需要对模型中的未知参数进行估计。当模型的非线性强度较弱、精度要求合适时,我们可将非线性模型线性化获取其近似解。本文讨论以最小二乘法估计线性化后的模型参数。
1 非线性误差方程
非线性的观测方程可表示为
L
n
,
1
=
f
n
,
1
(
X
t
,
1
)
+
Δ
n
,
1
(1)
\mathop{L}\limits_{n,1}=\mathop{f}\limits_{n,1}(\mathop{X}\limits_{t,1})+\mathop{\Delta}\limits_{n,1} \tag{1}
n,1L=n,1f(t,1X)+n,1Δ(1)
式中,
f
(
X
)
=
(
f
1
(
X
)
f
2
(
X
)
⋯
f
n
(
X
)
)
T
f(X)=\left( f_{1}(X)f_{2}(X)\cdots f_{n}(X) \right)^{T}
f(X)=(f1(X)f2(X)⋯fn(X))T是由n个X的非线性函数组成的
n
×
1
n\times1
n×1向量。
当未知参数(X)、真误差(
Δ
\Delta
Δ)分别用其估值
X
^
\hat{X}
X^、
V
V
V代替时,则可得非线性误差方程
V
n
,
1
=
f
n
,
1
(
X
^
)
−
L
n
,
1
(2)
\mathop{V}\limits_{n,1}=\mathop{f}\limits_{n,1}(\hat{X})-\mathop{L}\limits_{n,1} \tag{2}
n,1V=n,1f(X^)−n,1L(2)
式中,
V
V
V也称为残差向量。
2 非线性模型的LS解
将非线性模型式
(
2
)
(2)
(2)在
X
0
X^{0}
X0处泰勒展开并取至一次项,得
V
=
∂
f
(
X
^
)
∂
X
^
∣
X
^
=
X
0
∙
δ
X
^
−
(
L
−
f
(
X
0
)
)
(3)
V={\frac{\partial f(\hat{X})}{\partial \hat{X}}\mid_{\hat{X}=X^{0}}}\bullet {\delta\hat{X}}-(L-f(X^{0})) \tag{3}
V=∂X^∂f(X^)∣X^=X0∙δX^−(L−f(X0))(3)
若令
l
=
L
−
f
(
X
0
)
l=L-f(X^{0})
l=L−f(X0),则
V
=
B
∙
δ
X
^
−
l
(4)
V=B\bullet\delta\hat{X}-l \tag{4}
V=B∙δX^−l(4)
式中,
B
=
(
∂
f
1
(
X
^
)
∂
X
1
^
∣
X
^
=
X
0
∂
f
1
(
X
^
)
∂
X
2
^
∣
X
^
=
X
0
⋯
∂
f
1
(
X
^
)
∂
X
t
^
∣
X
^
=
X
0
∂
f
2
(
X
^
)
∂
X
1
^
∣
X
^
=
X
0
∂
f
2
(
X
^
)
∂
X
2
^
∣
X
^
=
X
0
⋯
∂
f
2
(
X
^
)
∂
X
t
^
∣
X
^
=
X
0
⋮
⋮
⋮
⋮
∂
f
n
(
X
^
)
∂
X
1
^
∣
X
^
=
X
0
∂
f
n
(
X
^
)
∂
X
2
^
∣
X
^
=
X
0
⋯
∂
f
n
(
X
^
)
∂
X
t
^
∣
X
^
=
X
0
)
(5)
B=\begin{pmatrix} {\frac{\partial f_1(\hat{X})}{\partial \hat{X_1}}\mid_{\hat{X}=X^{0}}} & {\frac{\partial f_1(\hat{X})}{\partial \hat{X_2}}\mid_{\hat{X}=X^{0}}} & \cdots & {\frac{\partial f_1(\hat{X})}{\partial \hat{X_t}}\mid_{\hat{X}=X^{0}}} \\ {\frac{\partial f_2(\hat{X})}{\partial \hat{X_1}}\mid_{\hat{X}=X^{0}}} & {\frac{\partial f_2(\hat{X})}{\partial \hat{X_2}}\mid_{\hat{X}=X^{0}}} & \cdots & {\frac{\partial f_2(\hat{X})}{\partial \hat{X_t}}\mid_{\hat{X}=X^{0}}}\\ \vdots & \vdots & \vdots & \vdots \\ {\frac{\partial f_n(\hat{X})}{\partial \hat{X_1}}\mid_{\hat{X}=X^{0}}} & {\frac{\partial f_n(\hat{X})}{\partial \hat{X_2}}\mid_{\hat{X}=X^{0}}} & \cdots & {\frac{\partial f_n(\hat{X})}{\partial \hat{X_t}}\mid_{\hat{X}=X^{0}}}\\ \end{pmatrix} \tag{5}
B=⎝⎜⎜⎜⎜⎜⎛∂X1^∂f1(X^)∣X^=X0∂X1^∂f2(X^)∣X^=X0⋮∂X1^∂fn(X^)∣X^=X0∂X2^∂f1(X^)∣X^=X0∂X2^∂f2(X^)∣X^=X0⋮∂X2^∂fn(X^)∣X^=X0⋯⋯⋮⋯∂Xt^∂f1(X^)∣X^=X0∂Xt^∂f2(X^)∣X^=X0⋮∂Xt^∂fn(X^)∣X^=X0⎠⎟⎟⎟⎟⎟⎞(5)
称为系数矩阵(设计矩阵)。
根据最小二乘原理,
δ
X
^
=
(
B
T
P
B
)
−
1
B
T
P
l
(6)
\delta\hat{X}=(B^TPB)^{-1}B^{T}Pl \tag{6}
δX^=(BTPB)−1BTPl(6)
式中,
P
P
P为权阵(衡量各个分量对总体的重要程度)且
P
=
Q
−
1
P=Q^{-1}
P=Q−1。这样,待估参数
X
X
X的最终估计结果为
X
^
=
X
0
+
δ
X
^
(7)
\hat{X}=X^{0}+\delta \hat{X} \tag{7}
X^=X0+δX^(7)
3 实例分析
3.1 问题
已知非线性模型 f ( x ) = a e b x f(x)=ae^{bx} f(x)=aebx(其中 a , b a,b a,b为待估参数),通过观测我们获得了如下表所示的观测结果。现我们用LS估计未知参数 a , b a,b a,b。
x | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
f(x) | 4.20 | 3.25 | 2.52 | 1.95 | 1.51 |
3.2 解答
(1)列立观测方程
依题意可列如下观测方程(对应式
(
1
)
(1)
(1))
{
f
1
=
a
e
b
+
Δ
1
f
2
=
a
e
2
b
+
Δ
2
f
3
=
a
e
3
b
+
Δ
3
f
4
=
a
e
4
b
+
Δ
4
f
5
=
a
e
5
b
+
Δ
5
(8)
\begin{cases} f_1=ae^b+\Delta_1 \\ f_2=ae^{2b}+\Delta_2\\ f_3=ae^{3b}+\Delta_3\\ f_4=ae^{4b}+\Delta_4\\ f_5=ae^{5b}+\Delta_5\\ \end{cases} \tag{8}
⎩⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎧f1=aeb+Δ1f2=ae2b+Δ2f3=ae3b+Δ3f4=ae4b+Δ4f5=ae5b+Δ5(8)
此时,我们可随机取两组数据(因为未知参数仅有
a
,
b
a,b
a,b)带入
f
(
x
)
=
a
e
b
x
f(x)=ae^{bx}
f(x)=aebx中解得
(
a
b
)
\begin{pmatrix} a \\ b \\ \end{pmatrix}
(ab)的初始值为
(
a
0
b
0
)
=
(
5.4
−
0.3
)
\begin{pmatrix} a^0 \\ b^0 \\ \end{pmatrix}=\begin{pmatrix} 5.4 \\ -0.3 \\ \end{pmatrix}
(a0b0)=(5.4−0.3)。
(2)估值代替得误差方程(对应式
(
2
)
(2)
(2))
{
v
1
=
a
^
e
b
^
−
f
1
v
2
=
a
^
e
2
b
^
−
f
2
v
3
=
a
^
e
3
b
^
−
f
3
v
4
=
a
^
e
4
b
^
−
f
4
v
5
=
a
^
e
5
b
^
−
f
5
(9)
\begin{cases} v_1=\hat{a}e^{\hat{b}}-f_1 \\ v_2=\hat{a}e^{2\hat{b}}-f_2\\ v_3=\hat{a}e^{3\hat{b}}-f_3\\ v_4=\hat{a}e^{4\hat{b}}-f_4\\ v_5=\hat{a}e^{5\hat{b}}-f_5\\ \end{cases} \tag{9}
⎩⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎧v1=a^eb^−f1v2=a^e2b^−f2v3=a^e3b^−f3v4=a^e4b^−f4v5=a^e5b^−f5(9)
(3)将误差方程于
X
0
X^0
X0处泰勒展开(对应式
(
4
)
(4)
(4))
{
v
1
=
e
b
0
⋅
δ
a
+
a
0
e
b
0
⋅
δ
b
−
(
f
1
−
a
0
e
b
0
)
v
2
=
e
2
b
0
⋅
δ
a
+
2
a
0
e
2
b
0
⋅
δ
b
−
(
f
2
−
a
0
e
2
b
0
)
v
3
=
e
3
b
0
⋅
δ
a
+
3
a
0
e
3
b
0
⋅
δ
b
−
(
f
3
−
a
0
e
3
b
0
)
v
4
=
e
4
b
0
⋅
δ
a
+
4
a
0
e
4
b
0
⋅
δ
b
−
(
f
4
−
a
0
e
4
b
0
)
v
5
=
e
5
b
0
⋅
δ
a
+
5
a
0
e
5
b
0
⋅
δ
b
−
(
f
5
−
a
0
e
5
b
0
)
(10)
\begin{cases} v_1=e^{b^0}\cdot\delta a+a^0e^{b^0}\cdot\delta b-(f_1-a^0e^{b^0}) \\ v_2=e^{2b^0}\cdot\delta a+2a^0e^{2b^0}\cdot\delta b-(f_2-a^0e^{2b^0})\\ v_3=e^{3b^0}\cdot\delta a+3a^0e^{3b^0}\cdot\delta b-(f_3-a^0e^{3b^0})\\ v_4=e^{4b^0}\cdot\delta a+4a^0e^{4b^0}\cdot\delta b-(f_4-a^0e^{4b^0})\\ v_5=e^{5b^0}\cdot\delta a+5a^0e^{5b^0}\cdot\delta b-(f_5-a^0e^{5b^0})\\ \end{cases} \tag{10}
⎩⎪⎪⎪⎪⎪⎪⎨⎪⎪⎪⎪⎪⎪⎧v1=eb0⋅δa+a0eb0⋅δb−(f1−a0eb0)v2=e2b0⋅δa+2a0e2b0⋅δb−(f2−a0e2b0)v3=e3b0⋅δa+3a0e3b0⋅δb−(f3−a0e3b0)v4=e4b0⋅δa+4a0e4b0⋅δb−(f4−a0e4b0)v5=e5b0⋅δa+5a0e5b0⋅δb−(f5−a0e5b0)(10)
带入具体数据,则式
(
10
)
(10)
(10)可化为
V
=
(
0.7408
4.0004
0.5488
5.9272
0.4066
6.5864
0.3012
6.5058
0.2231
6.0245
)
(
δ
a
δ
b
)
−
(
0.1996
0.2864
0.3245
0.3236
0.3051
)
V=\begin{pmatrix} 0.7408 & 4.0004 \\ 0.5488 & 5.9272 \\ 0.4066 & 6.5864\\ 0.3012 & 6.5058\\ 0.2231 & 6.0245\\ \end{pmatrix} \begin{pmatrix} \delta a\\ \delta b\\ \end{pmatrix}- \begin{pmatrix} 0.1996\\ 0.2864\\ 0.3245\\ 0.3236\\ 0.3051\\ \end{pmatrix}
V=⎝⎜⎜⎜⎜⎛0.74080.54880.40660.30120.22314.00045.92726.58646.50586.0245⎠⎟⎟⎟⎟⎞(δaδb)−⎝⎜⎜⎜⎜⎛0.19960.28640.32450.32360.3051⎠⎟⎟⎟⎟⎞
那么,依式
(
6
)
(6)
(6)可解得
(
δ
a
δ
b
)
=
(
−
0.005858021
0.049953787
)
\begin{pmatrix} \delta a\\ \delta b\\ \end{pmatrix}= \begin{pmatrix} -0.005858021\\ 0.049953787\\ \end{pmatrix}
(δaδb)=(−0.0058580210.049953787)
则未知参数最终估值为
(
a
^
b
^
)
=
(
a
0
b
0
)
+
(
δ
a
δ
b
)
=
(
5.394141979
−
0.250246213
)
\begin{pmatrix} \hat{a}\\ \hat{b}\\ \end{pmatrix}= \begin{pmatrix} a^0\\ b^0\\ \end{pmatrix}+ \begin{pmatrix} \delta a\\ \delta b\\ \end{pmatrix}= \begin{pmatrix} 5.394141979\\ -0.250246213\\ \end{pmatrix}
(a^b^)=(a0b0)+(δaδb)=(5.394141979−0.250246213)
因此,我们求得的非线性模型为
f
(
x
)
=
5.394141979
e
−
0.250246213
x
f(x)=5.394141979e^{-0.250246213x}
f(x)=5.394141979e−0.250246213x。