Batch Least Squares
Consider a linear input-output system whose output $y$ can be expressed as a linear function of the input state ${\bf{x}}$. The mathematical model is

$$y = {\bf{x}}^T\theta + \xi$$

where $\xi$ is the system white noise, and ${\bf{x}}$ and $\theta$ are column vectors of the same dimension:

$${\bf{x}} = \left[ \begin{array}{cccc} x_1 & x_2 & \cdots & x_n \end{array} \right]^T, \qquad \theta = \left[ \begin{array}{cccc} c_1 & c_2 & \cdots & c_n \end{array} \right]^T$$
Once $k$ input-output pairs have been collected, all of this information can be used for parameter identification. Stack the inputs into the matrix $X(k) = \left[ \begin{array}{cccc} {\bf{x}}(1) & {\bf{x}}(2) & \cdots & {\bf{x}}(k) \end{array} \right]^T$ and the outputs into $Y(k) = \left[ \begin{array}{cccc} y(1) & y(2) & \cdots & y(k) \end{array} \right]^T$. Parameter identification is in essence the optimal estimation $\hat\theta$ of $\theta$, achieving

$$\min\quad J = \left(X\hat\theta - Y\right)^T\left(X\hat\theta - Y\right)$$
Taking the partial derivative of $J$ with respect to $\hat\theta$ and setting it to the zero vector,

$$\frac{\partial J}{\partial \hat\theta} = 2X^T X\hat\theta - 2X^T Y = {\bf{0}}$$

and solving gives the global extremum shown below. Since the second-order derivative $\frac{\partial^2 J}{\partial \hat\theta^2} = 2X^T X$ is a positive definite matrix, this solution is necessarily the global minimum:

$$\hat\theta(k) = \left(X^T X\right)^{-1} X^T Y$$
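As a quick numerical check of the closed-form solution above, here is a minimal NumPy sketch; the input matrix, noise level, and true parameters are made-up illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters theta = [c1, c2] and k = 200 samples
theta_true = np.array([2.0, -0.5])
X = rng.normal(size=(200, 2))                      # input matrix X(k), one row per x^T(i)
Y = X @ theta_true + 0.01 * rng.normal(size=200)   # outputs with white noise xi

# Batch least squares: theta_hat = (X^T X)^{-1} X^T Y
# (np.linalg.solve is preferred over an explicit inverse for numerical stability)
theta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(theta_hat)  # close to [2.0, -0.5]
```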
Recursive Least Squares
As $k$ grows, the batch least squares method above occupies a large amount of storage and suffers from low solution efficiency and poor real-time performance. A recursive form of least squares identification effectively resolves these problems. The recursion for $\hat\theta(k)$ can be expressed as

$$\left\{\begin{array}{l} \hat\theta(k) = \hat\theta(k-1) + K(k)\left[y(k) - {\bf{x}}^T(k)\hat\theta(k-1)\right] \\[4pt] K(k) = \dfrac{P(k-1){\bf{x}}(k)}{1 + {\bf{x}}^T(k)P(k-1){\bf{x}}(k)} \\[4pt] P(k) = \left[I - K(k){\bf{x}}^T(k)\right]P(k-1) \end{array}\right.$$

where $K(k)$ is the gain vector, and $P(k) = \left(X^T X\right)^{-1}$ is the error covariance matrix, whose initial value can either be given by $\left(X^T X\right)^{-1}$ or set to $\alpha I$, with $I$ the identity matrix and $\alpha = 10^6 \sim 10^{10}$.
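The three recursion equations translate almost line by line into code. Below is a minimal NumPy sketch of recursive least squares on hypothetical data, with $P(0) = \alpha I$ initialized as suggested above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear system y = x^T theta + noise
theta_true = np.array([1.5, -2.0, 0.3])
n = len(theta_true)

theta = np.zeros(n)    # theta_hat(0)
P = 1e6 * np.eye(n)    # P(0) = alpha * I

for _ in range(500):
    x = rng.normal(size=n)
    y = x @ theta_true + 0.01 * rng.normal()
    # Gain vector K(k)
    K = P @ x / (1.0 + x @ P @ x)
    # Parameter update driven by the prediction error
    theta = theta + K * (y - x @ theta)
    # Covariance update
    P = (np.eye(n) - np.outer(K, x)) @ P

print(theta)  # converges toward theta_true
```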
Recursive Least Squares with a Forgetting Factor
The recursive least squares method effectively relieves the storage and computation burden, but it still suffers from data saturation. This is inherent to the least squares algorithm itself: all data are treated equally, which means the corrective effect of newly arriving data is gradually diluted. Because of this intrinsic property, ordinary least squares has difficulty converging to the true parameter values and cannot track parameter variation during operation. To address this, a least squares method with a forgetting factor can be introduced, which accounts for the time ordering of the data and modifies the original optimization problem to:
$$\min\quad J = \left(X\hat\theta - Y\right)^T W \left(X\hat\theta - Y\right)$$

where $W = \mathrm{diag}\left(\lambda^{k-1}, \lambda^{k-2}, \ldots, \lambda, 1\right)$ is a diagonal weighting matrix and $\lambda$ is the forgetting factor, with $0.98 < \lambda < 1$.
Intuitively, the further a historical observation lies from the current time, the lower its weight, so the identification process relies more heavily on recently added data. Solving this modified problem yields the recursive least squares identification method with a forgetting factor:
$$\left\{\begin{array}{l} \hat\theta(k) = \hat\theta(k-1) + K(k)\left[y(k) - {\bf{x}}^T(k)\hat\theta(k-1)\right] \\[4pt] K(k) = \dfrac{P(k-1){\bf{x}}(k)}{\lambda + {\bf{x}}^T(k)P(k-1){\bf{x}}(k)} \\[4pt] P(k) = \dfrac{1}{\lambda}\left[I - K(k){\bf{x}}^T(k)\right]P(k-1) \end{array}\right.$$
Here the covariance matrix is $P(k) = \left(X^T W X\right)^{-1}$, and its initial value is chosen by the same rule as in the ordinary recursive least squares method.
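To see what the forgetting factor buys, this hypothetical NumPy sketch tracks a parameter vector that jumps halfway through the run; with $\lambda = 0.98$ the estimate follows the jump, whereas ordinary recursive least squares ($\lambda = 1$) would respond far more sluggishly:

```python
import numpy as np

rng = np.random.default_rng(2)

lam = 0.98             # forgetting factor
n = 2
theta = np.zeros(n)
P = 1e6 * np.eye(n)

for k in range(1000):
    # Hypothetical true parameters jump at k = 500
    theta_true = np.array([1.0, -1.0]) if k < 500 else np.array([3.0, 0.5])
    x = rng.normal(size=n)
    y = x @ theta_true + 0.01 * rng.normal()
    K = P @ x / (lam + x @ P @ x)
    theta = theta + K * (y - x @ theta)
    P = (np.eye(n) - np.outer(K, x)) @ P / lam

print(theta)  # tracks the post-jump values [3.0, 0.5]
```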
MATLAB Case Study
Hand-written code
Suppose the system output $y$ and time $t$ satisfy the relation below, and the values of $y$ over $t = 0:100$ are known; find the optimal coefficients $c_1, c_2, c_3$:

$$y = c_1 t + c_2\left(t - 60\right)^2 + c_3\cos\left(\pi t/10\right) + \xi$$
By defining $x_1 = t$, $x_2 = \left(t - 60\right)^2$, $x_3 = \cos\left(\pi t/10\right)$, we can convert the expression above into a standard linear form:

$$y = c_1 x_1 + c_2 x_2 + c_3 x_3 + \xi$$
We can then solve for the coefficients with the recursive least squares method. The MATLAB code is as follows:
clc; clear
% Build the three input terms, 101 samples in total
t = 0 : 1 : 100;
x1 = t;
x2 = (t - 60).^2;
x3 = cos(pi.*t / 10);
% Assume the true system coefficients
c1 = 2.22;
c2 = 0.05;
c3 = 23.1;
% Generate the 101 corresponding system outputs (with some noise added)
Y = c1 * x1 + c2 * x2 + c3 * x3 + randn(1,101);
% Plot the sampled data
plot(t, Y,'o')
hold on
% Set the initial values and run the recursion
theta = [0;0;0];
Pk_ = 1e6 * eye(3);
lambda = 0.995;
for i = 1 : 1 : 101
    x = [x1(i); x2(i); x3(i)];
    y = Y(i);
    Kk = Pk_ * x / (lambda + x' * Pk_ * x);
    theta = theta + Kk * (y - x'*theta);
    Pk_ = (1/lambda)*(eye(3) - Kk * x') * Pk_;
end
% Reconstruct the output from the least squares estimate
Y_Est = theta(1) * x1 + theta(2) * x2 + theta(3) * x3;
% Plot the fitted curve for comparison against the samples
plot(t, Y_Est,'r')
The final estimated parameter vector is theta = [2.2198; 0.0499; 23.0333], approximately equal to the true system parameters $c_1 = 2.22$, $c_2 = 0.05$, $c_3 = 23.1$. Note: because of the random noise, the result will vary slightly from run to run. Fitting the curve with the estimated coefficients and comparing it against the sampled data gives the following result:
MATLAB's lsqcurvefit function
For simple applications, you can call MATLAB's library functions directly. The lsqcurvefit function performs curve fitting based on least squares, with many internal optimizations to handle nonlinear fitting.
clc; clear;
% Generate a set of simulated data
x = linspace(-10, 10, 100);
y = 2*x.^2 - 3*sin(x) + 0.5*randn(size(x));
% Define the function form to fit: a quadratic term plus a trigonometric term
f = @(p,x) p(1)*x.^2 + p(2)*sin(p(3)*x) + p(4);
% Initial parameter vector: quadratic coefficient, amplitude, frequency, constant
p0 = [1, 1, 1, 1]; % true values are [2, -3, 1, 0]
% Fit by least squares, returning the optimal parameters and residual norm
[p, resnorm] = lsqcurvefit(f, p0, x, y);
% Generate and plot the fitted curve
y_fit = f(p, x);
plot(x, y, 'o', x, y_fit, '-')
legend('Original data', 'Fitted curve')
lsqcurvefit takes four inputs: the function to fit, the initial values of its parameter vector, the input vector, and the sampled outputs. Note that when defining the function to fit, $x$ must also be included as one of its inputs. The simulation result is shown below. Honestly, this function feels a bit cumbersome to use; by comparison, I would recommend the fminsearch function below, which is more intuitive.
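For readers working outside MATLAB, SciPy's scipy.optimize.curve_fit plays an analogous role, with the same division of inputs (model function, initial guess, input vector, sampled outputs). A small sketch on a made-up quadratic model:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)

# Hypothetical model: y = a*x^2 + b*x + c
def f(x, a, b, c):
    return a * x**2 + b * x + c

x = np.linspace(-5, 5, 100)
y = f(x, 2.0, -3.0, 1.0) + 0.05 * rng.normal(size=x.size)

# curve_fit(model, xdata, ydata, p0) returns the optimal parameters
# and an estimate of their covariance
p, pcov = curve_fit(f, x, y, p0=[1.0, 1.0, 1.0])
print(p)  # close to [2.0, -3.0, 1.0]
```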
MATLAB's fminsearch function
By comparison, fminsearch is more intuitive, because its input needs only two items: the objective function and the initial parameter vector. The objective function wraps up all of the errors internally, leaving only the parameter vector, i.e. the quantity to be optimized, as its input. The code is shown below:
clc; clear;
% Generate simulated data
x = linspace(0, 2*pi, 50)';
y = 2*sin(2*x) + 0.5*x.^2 + randn(size(x));
% Define error function
fun = @(c) sum((y - (c(1)*sin(c(2)*x) + c(3)*x.^2 + c(4))).^2);
% Find least-squares solution
c0 = [1 2 1 1]; % true values are [2 2 0.5 0]
c = fminsearch(fun, c0);
% Plot results
xfit = linspace(0, 2*pi, 100)';
yfit = c(1)*sin(c(2)*xfit) + c(3)*xfit.^2 + c(4);
plot(x, y, 'ko', xfit, yfit, 'b-');
legend('Data', 'Fit');
The simulation result is shown below:
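As a rough Python counterpart to the fminsearch pattern, scipy.optimize.minimize with the Nelder-Mead method accepts the same two ingredients: an objective that wraps all of the errors, and an initial parameter vector. The data here mirror the MATLAB example but are otherwise made up:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)

# Simulated data, mirroring the MATLAB example's structure
x = np.linspace(0, 2 * np.pi, 50)
y = 2 * np.sin(2 * x) + 0.5 * x**2 + 0.1 * rng.normal(size=x.size)

# The objective wraps all of the errors; only the parameter vector remains
def fun(c):
    return np.sum((y - (c[0] * np.sin(c[1] * x) + c[2] * x**2 + c[3]))**2)

# Nelder-Mead is the same derivative-free simplex method fminsearch uses
res = minimize(fun, x0=[1.0, 2.0, 1.0, 1.0], method='Nelder-Mead')
print(res.x)  # close to [2, 2, 0.5, 0]
```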
Appendix 1: Mathematical Proof of Recursive Least Squares
$$\left\{\begin{aligned} \hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1} \\ &= \left[X_k^T X_k + {\bf{x}}(k+1){\bf{x}}^T(k+1)\right]^{-1}\left[{\bf{x}}(k+1)y(k+1) + X_k^T Y_k\right] \quad \text{(see Appendix 1.1)} \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]P(k)\left[{\bf{x}}(k+1)y(k+1) + X_k^T Y_k\right] \quad \text{(see Appendix 1.2)} \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]\left[P(k){\bf{x}}(k+1)y(k+1) + \hat\theta(k)\right] \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]\hat\theta(k) + \left[I - K(k+1){\bf{x}}^T(k+1)\right]P(k){\bf{x}}(k+1)y(k+1) \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]\hat\theta(k) + \left[I - \frac{P(k){\bf{x}}(k+1)}{1 + {\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)}{\bf{x}}^T(k+1)\right]P(k){\bf{x}}(k+1)y(k+1) \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]\hat\theta(k) + \frac{P(k){\bf{x}}(k+1)}{1 + {\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)}y(k+1)\left[1 + {\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)\right] - \frac{P(k){\bf{x}}(k+1)}{1 + {\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)}{\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)y(k+1) \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]\hat\theta(k) + K(k+1)y(k+1)\left[1 + {\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)\right] - K(k+1){\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)y(k+1) \\ &= \hat\theta(k) - K(k+1){\bf{x}}^T(k+1)\hat\theta(k) + K(k+1)y(k+1) + K(k+1)\left[{\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)\right]_{1\times 1}y(k+1) - K(k+1){\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)y(k+1) \\ &= \hat\theta(k) - K(k+1){\bf{x}}^T(k+1)\hat\theta(k) + K(k+1)y(k+1) \\ &= \hat\theta(k) + K(k+1)\left[y(k+1) - {\bf{x}}^T(k+1)\hat\theta(k)\right] \end{aligned}\right.$$
Appendix 1.1: Proof Step One
According to the batch least squares method, $\hat\theta(k) = \left(X_k^T X_k\right)^{-1} X_k^T Y_k$. Going one step further, the estimate at the next instant can be expressed as

$$\hat\theta(k+1) = \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1}$$

where
$$X_{k+1} = \left[ \begin{array}{c} X_k \\ {\bf{x}}^T(k+1) \end{array} \right], \qquad Y_{k+1} = \left[ \begin{array}{c} Y_k \\ y(k+1) \end{array} \right]$$

It follows that
$$\left\{\begin{aligned} \hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1} \\ &= \left[X_k^T X_k + {\bf{x}}(k+1){\bf{x}}^T(k+1)\right]^{-1}\left[{\bf{x}}(k+1)y(k+1) + X_k^T Y_k\right] \end{aligned}\right.$$
Appendix 1.2: Proof Step Two
To prove this step, we first introduce a mathematical lemma that will be used shortly.
[Lemma] Let $A$, $C$, and $A + BCD$ all be nonsingular square matrices. Then

$$(A + BCD)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}$$

and, as the special case $C = I$,

$$(A + BD)^{-1} = A^{-1} - A^{-1}B\left(I + DA^{-1}B\right)^{-1}DA^{-1}$$

The lemma is proved as follows:
$$\begin{aligned} &(A + BCD)\left[A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}\right] \\ &= I + BCDA^{-1} - B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} - BCDA^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} \\ &= I + BCDA^{-1} - B\left(I + CDA^{-1}B\right)\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} \\ &= I + BCDA^{-1} - BC\left(C^{-1} + DA^{-1}B\right)\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} \\ &= I \end{aligned}$$
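This is the matrix inversion lemma (the Sherman-Morrison-Woodbury identity), and it is easy to spot-check numerically. A NumPy sketch with random, deliberately well-conditioned test matrices:

```python
import numpy as np

rng = np.random.default_rng(5)

# Random test matrices; the dominant diagonals keep A, C and A + B*C*D
# comfortably nonsingular and well-conditioned
A = 5 * np.eye(4) + 0.1 * rng.normal(size=(4, 4))
B = 0.3 * rng.normal(size=(4, 2))
C = 2 * np.eye(2) + 0.1 * rng.normal(size=(2, 2))
D = 0.3 * rng.normal(size=(2, 4))

inv = np.linalg.inv
lhs = inv(A + B @ C @ D)
rhs = inv(A) - inv(A) @ B @ inv(inv(C) + D @ inv(A) @ B) @ D @ inv(A)
print(np.allclose(lhs, rhs))  # True
```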
It has already been shown above that

$$\left\{\begin{aligned} \hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1} \\ &= \left[X_k^T X_k + {\bf{x}}(k+1){\bf{x}}^T(k+1)\right]^{-1}\left[{\bf{x}}(k+1)y(k+1) + X_k^T Y_k\right] \end{aligned}\right.$$

Applying the lemma to the inverted term:
$$(A + BD)^{-1} = A^{-1} - A^{-1}B\left(I + DA^{-1}B\right)^{-1}DA^{-1}$$
$$\Downarrow$$
$$\left[X^T X + {\bf{x}}(k+1){\bf{x}}^T(k+1)\right]^{-1} = \left(X^T X\right)^{-1} - \left(X^T X\right)^{-1}{\bf{x}}(k+1)\left[I + {\bf{x}}^T(k+1)\left(X^T X\right)^{-1}{\bf{x}}(k+1)\right]^{-1}{\bf{x}}^T(k+1)\left(X^T X\right)^{-1}$$
Define $P(k) = \left(X_k^T X_k\right)^{-1}$. Since ${\bf{x}}^T(k+1)\left(X^T X\right)^{-1}{\bf{x}}(k+1)$ is a $1 \times 1$ scalar, the identity above gives:
$$\left\{\begin{aligned} P(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} \\ &= \left[X^T X + {\bf{x}}(k+1){\bf{x}}^T(k+1)\right]^{-1} \\ &= \left(X^T X\right)^{-1} - \left(X^T X\right)^{-1}{\bf{x}}(k+1)\left[1 + {\bf{x}}^T(k+1)\left(X^T X\right)^{-1}{\bf{x}}(k+1)\right]^{-1}{\bf{x}}^T(k+1)\left(X^T X\right)^{-1} \\ &= P(k) - \frac{P(k){\bf{x}}(k+1){\bf{x}}^T(k+1)P(k)}{1 + {\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)} \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]P(k) \quad \text{(the third recursion equation)} \end{aligned}\right.$$

where $K(k+1)$ is defined by
$$K(k+1) = \frac{P(k){\bf{x}}(k+1)}{1 + {\bf{x}}^T(k+1)P(k){\bf{x}}(k+1)} \quad \text{(the second recursion equation)}$$

Hence:
$$\left\{\begin{aligned} \hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1} \\ &= \left[X_k^T X_k + {\bf{x}}(k+1){\bf{x}}^T(k+1)\right]^{-1}\left[{\bf{x}}(k+1)y(k+1) + X_k^T Y_k\right] \\ &= \left[I - K(k+1){\bf{x}}^T(k+1)\right]P(k)\left[{\bf{x}}(k+1)y(k+1) + X_k^T Y_k\right] \end{aligned}\right.$$