Least Squares: Concise Formulas, Mathematical Proof, and MATLAB Implementation (Self-Written Code, lsqcurvefit, fminsearch)

This article covers batch least squares, recursive least squares, and recursive least squares with a forgetting factor, showing through mathematical derivations and MATLAB examples how these methods are used for parameter identification and curve fitting. The recursive formulation lowers computational cost through incremental updates, while the forgetting factor addresses data saturation so the algorithm can track parameter changes. MATLAB's lsqcurvefit and fminsearch functions are also introduced as tools for nonlinear fitting.

Batch Least Squares

  Consider a linear input–output system whose output $y$ is a linear function of the input state $\mathbf{x}$, with the mathematical model:
$$y = \mathbf{x}^T \theta + \xi$$
where $\xi$ is system white noise, and $\mathbf{x}$ and $\theta$ are column vectors of the same dimension:
$$\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T,\qquad \theta = \begin{bmatrix} c_1 & c_2 & \cdots & c_n \end{bmatrix}^T$$

  Once $k$ groups of input–output data have been collected, all of them can be used for parameter identification. Write the input matrix as $X(k) = \begin{bmatrix} \mathbf{x}(1) & \mathbf{x}(2) & \cdots & \mathbf{x}(k) \end{bmatrix}^T$ and the output vector as $Y(k) = \begin{bmatrix} y(1) & y(2) & \cdots & y(k) \end{bmatrix}^T$. Parameter identification then amounts to finding the optimal estimate $\hat\theta$ of $\theta$ that achieves:
$$\min\quad J = \left(X\hat\theta - Y\right)^T \left(X\hat\theta - Y\right)$$

  Taking the partial derivative of $J$ with respect to $\hat\theta$ and setting it to the zero vector gives $\frac{\partial J}{\partial \hat\theta} = 2X^T X \hat\theta - 2X^T Y = \mathbf{0}$. Solving yields the extremum below. Since the second derivative $\frac{\partial^2 J}{\partial \hat\theta^2} = 2X^T X$ is positive definite, this solution is the global minimum:
$$\hat\theta(k) = \left(X^T X\right)^{-1} X^T Y$$
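  As a quick numerical sanity check of this closed form, the sketch below (plain Python rather than MATLAB, so it is self-contained; the data and "noise" values are hypothetical) fits the one-parameter model $y = c\,x + \xi$, for which $X^TX$ and $X^TY$ reduce to scalar sums:

```python
# Minimal check of theta_hat = (X^T X)^{-1} X^T Y for the scalar model y = c*x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
c_true = 2.22
noise = [0.05, -0.03, 0.02, -0.04, 0.01]      # fixed "noise" for reproducibility
ys = [c_true * x + e for x, e in zip(xs, noise)]

# With a single parameter, X^T X and X^T Y are just sums.
xtx = sum(x * x for x in xs)                   # X^T X
xty = sum(x * y for x, y in zip(xs, ys))       # X^T Y
theta_hat = xty / xtx                          # (X^T X)^{-1} X^T Y

print(round(theta_hat, 3))                     # 2.219, close to c_true = 2.22
```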

Recursive Least Squares

  As $k$ grows, batch least squares must store ever more data and resolve the full problem each time, which makes it slow and unsuitable for real-time use. A recursive formulation avoids these drawbacks; the recursion for $\hat\theta(k)$ is:
$$\begin{cases} \hat\theta(k) = \hat\theta(k-1) + K(k)\left[y(k) - \mathbf{x}^T(k)\,\hat\theta(k-1)\right] \\[4pt] K(k) = \dfrac{P(k-1)\,\mathbf{x}(k)}{1 + \mathbf{x}^T(k)\,P(k-1)\,\mathbf{x}(k)} \\[4pt] P(k) = \left[I - K(k)\,\mathbf{x}^T(k)\right]P(k-1) \end{cases}$$
where $K(k)$ is the gain vector and $P(k) = \left(X^T X\right)^{-1}$ is the error covariance matrix. $P$ can be initialized from $\left(X^T X\right)^{-1}$ on an initial batch of data, or simply as $\alpha I$, with $I$ the identity matrix and $\alpha = 10^6 \sim 10^{10}$.
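  In the scalar case ($n = 1$), $P$ and $K$ are plain numbers, which makes the recursion easy to check against the batch formula. A minimal plain-Python sketch (hypothetical data; with a large $P(0) = \alpha$, the recursive and batch answers agree to within the tiny regularization introduced by $1/\alpha$):

```python
# Scalar recursive least squares: n = 1, so P and K are scalars.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.27, 4.41, 6.68, 8.84, 11.11]     # roughly y = 2.22*x with small noise

theta, P = 0.0, 1e6                      # theta_hat(0) = 0, P(0) = alpha (large)
for x, y in zip(xs, ys):
    K = P * x / (1.0 + x * P * x)        # gain
    theta = theta + K * (y - x * theta)  # estimate update
    P = (1.0 - K * x) * P                # covariance update

# Batch solution on the same data: (X^T X)^{-1} X^T Y
theta_batch = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(theta, theta_batch)                # the two agree almost exactly
```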

Recursive Least Squares with a Forgetting Factor

  Recursive least squares relieves the storage and computation burden, but it still suffers from data saturation. This is inherent to the least-squares algorithm itself: all data are weighted equally, so the corrective effect of each new sample is progressively diluted. This built-in property makes it hard for ordinary least squares to keep converging toward the true parameter values, and impossible to track parameters that change during operation. To address this, a forgetting factor can be introduced that accounts for the time order of the data, modifying the original optimization problem to:
$$\min\quad J = \left(X\hat\theta - Y\right)^T W \left(X\hat\theta - Y\right)$$
where $W = \mathrm{diag}\left(\lambda^{k-1}, \lambda^{k-2}, \ldots, \lambda, 1\right)$ is a diagonal weighting matrix and $\lambda$ is the forgetting factor, typically $0.98 < \lambda < 1$.

  Intuitively, the further a historical observation lies from the current time, the lower its weight, so the identification relies more heavily on recently added data. Solving this modified problem yields the recursive least-squares method with a forgetting factor:
$$\begin{cases} \hat\theta(k) = \hat\theta(k-1) + K(k)\left[y(k) - \mathbf{x}^T(k)\,\hat\theta(k-1)\right] \\[4pt] K(k) = \dfrac{P(k-1)\,\mathbf{x}(k)}{\lambda + \mathbf{x}^T(k)\,P(k-1)\,\mathbf{x}(k)} \\[4pt] P(k) = \dfrac{1}{\lambda}\left[I - K(k)\,\mathbf{x}^T(k)\right]P(k-1) \end{cases}$$

where the covariance matrix is $P(k) = \left(X^T W X\right)^{-1}$, and its initial value is chosen by the same rule as in standard recursive least squares.
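  The tracking benefit is easiest to see in a scalar toy problem where the true parameter jumps mid-run: with $\lambda < 1$ the estimate follows the jump, while $\lambda = 1$ (ordinary recursive least squares) is dragged toward the average of old and new values. A plain-Python sketch with hypothetical, noise-free data:

```python
def rls(lam):
    # Scalar RLS with forgetting factor lam; true parameter jumps 2.0 -> 3.0.
    theta, P = 0.0, 1e6
    for k in range(200):
        x = 1.0 + (k % 5)                 # some nonzero, varying regressor
        c = 2.0 if k < 100 else 3.0       # parameter change at k = 100
        y = c * x
        K = P * x / (lam + x * P * x)     # gain with forgetting factor
        theta += K * (y - x * theta)      # estimate update
        P = (1.0 - K * x) * P / lam       # covariance update
    return theta

print(rls(0.98), rls(1.0))  # forgetting tracks the jump; lambda = 1 lags near 2.5
```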

MATLAB Case Studies

Self-Written Code

  Suppose the system output $y$ satisfies the following relation in time $t$, with the values of $y$ known for $t = 0, 1, \ldots, 100$; find the optimal coefficients $c_1, c_2, c_3$:
$$y = c_1 t + c_2 (t - 60)^2 + c_3 \cos(\pi t / 10) + \xi$$
  By defining $x_1 = t$, $x_2 = (t-60)^2$, and $x_3 = \cos(\pi t/10)$, the expression becomes a standard linear model:
$$y = c_1 x_1 + c_2 x_2 + c_3 x_3 + \xi$$
  We can then solve for the coefficients by recursive least squares; the MATLAB code is as follows:

clc; clear

% Build the three input terms, 101 samples in total
t = 0 : 1 : 100;
x1 = t;
x2 = (t - 60).^2;
x3 = cos(pi.*t / 10);

% Assumed true system coefficients
c1 = 2.22;
c2 = 0.05;
c3 = 23.1;

% Generate the 101 corresponding outputs (with noise added deliberately)
Y = c1 * x1 + c2 * x2 + c3 * x3  + randn(1,101);

% Plot the sampled data
plot(t, Y, 'o')
hold on

% Set initial values and run the recursion
theta = [0; 0; 0];
Pk_ = 1e6 * eye(3);
lambda = 0.995;
for i = 1 : 1 : 101          % iterate over all 101 samples
    x = [x1(i); x2(i); x3(i)];
    y = Y(i);
    Kk = Pk_ * x / (lambda + x' * Pk_ * x);
    theta = theta + Kk * (y - x'*theta);
    Pk_ = (1/lambda)*(eye(3) - Kk * x') * Pk_;
end

% Reconstruct the output from the least-squares estimate
Y_Est = theta(1) * x1 + theta(2) * x2 + theta(3) * x3;

% Plot the fitted curve for comparison against the sampled points
plot(t, Y_Est, 'r')

  The final estimated parameter vector is theta = [2.2198; 0.0499; 23.0333], which closely matches the true parameters $c_1 = 2.22$, $c_2 = 0.05$, $c_3 = 23.1$. Note: because of the random noise, results will differ slightly from run to run. Fitting the curve with the estimated coefficients and comparing it against the sampled data gives the result below. (figure: MATLAB fitting result)

MATLAB's lsqcurvefit Function

  For simple applications you can call MATLAB's library functions directly. lsqcurvefit performs least-squares curve fitting and includes many internal optimizations to handle nonlinear fitting.

clc; clear;

% Generate a set of simulated data
x = linspace(-10, 10, 100);
y = 2*x.^2 - 3*sin(x) + 0.5*randn(size(x));

% Define the model to fit: a quadratic term plus a sinusoidal term
f = @(p,x) p(1)*x.^2 + p(2)*sin(p(3)*x) + p(4);

% Initial parameter vector: quadratic coefficient, sine amplitude,
% sine frequency, and constant term
p0 = [1, 1, 1, 1]; % true values are [2, -3, 1, 0]

% Fit by least squares; returns the optimal parameters and residual norm
[p, resnorm] = lsqcurvefit(f, p0, x, y);

% Evaluate and plot the fitted curve
y_fit = f(p, x);
plot(x, y, 'o', x, y_fit, '-')
legend('Original data', 'Fitted curve')

  lsqcurvefit takes four inputs: the model function, the initial parameter vector, the input vector, and the sampled outputs; when defining the model function, $x$ must be included as one of its arguments. The simulation result is shown below. It feels somewhat cumbersome to use; by comparison, I recommend the fminsearch function below, which is more intuitive. (figure: lsqcurvefit fitting result)

MATLAB's fminsearch Function

  By comparison, fminsearch is more intuitive because it takes only two inputs: the objective function and the initial parameter vector. The objective function wraps all of the residuals itself, leaving only the parameter vector as the quantity to be optimized. The code is as follows:

clc; clear;

% Generate simulated data
x = linspace(0, 2*pi, 50)';
y = 2*sin(2*x) + 0.5*x.^2 + randn(size(x));

% Define error function
fun = @(c) sum((y - (c(1)*sin(c(2)*x) + c(3)*x.^2 + c(4))).^2);

% Find least-squares solution
c0 = [1 2 1 1]; % true values are [2 2 0.5 0]
c = fminsearch(fun, c0);

% Plot results
xfit = linspace(0, 2*pi, 100)';
yfit = c(1)*sin(c(2)*xfit) + c(3)*xfit.^2 + c(4);
plot(x, y, 'ko', xfit, yfit, 'b-');
legend('Data', 'Fit');

  The simulation result is shown below. (figure: fminsearch fitting result)

Appendix 1: Mathematical Proof of Recursive Least Squares

$$
\begin{aligned}
\hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1}\\
&= \left[X_k^T X_k + \mathbf{x}(k+1)\,\mathbf{x}^T(k+1)\right]^{-1}\left[\mathbf{x}(k+1)\,y(k+1) + X_k^T Y_k\right] \quad \text{(Appendix 1.1)}\\
&= \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]P(k)\left[\mathbf{x}(k+1)\,y(k+1) + X_k^T Y_k\right] \quad \text{(Appendix 1.2)}\\
&= \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]\left[P(k)\,\mathbf{x}(k+1)\,y(k+1) + \hat\theta(k)\right]\\
&= \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]\hat\theta(k) + \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]P(k)\,\mathbf{x}(k+1)\,y(k+1)\\
&= \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]\hat\theta(k) + \left[I - \frac{P(k)\,\mathbf{x}(k+1)}{1 + \mathbf{x}^T(k+1)\,P(k)\,\mathbf{x}(k+1)}\mathbf{x}^T(k+1)\right]P(k)\,\mathbf{x}(k+1)\,y(k+1)\\
&= \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]\hat\theta(k) + K(k+1)\,y(k+1)\left[1 + \mathbf{x}^T(k+1)\,P(k)\,\mathbf{x}(k+1)\right] - K(k+1)\,\mathbf{x}^T(k+1)\,P(k)\,\mathbf{x}(k+1)\,y(k+1)\\
&= \hat\theta(k) - K(k+1)\,\mathbf{x}^T(k+1)\,\hat\theta(k) + K(k+1)\,y(k+1) + K(k+1)\underbrace{\left[\mathbf{x}^T(k+1)\,P(k)\,\mathbf{x}(k+1)\right]}_{1\times 1}y(k+1) - K(k+1)\,\mathbf{x}^T(k+1)\,P(k)\,\mathbf{x}(k+1)\,y(k+1)\\
&= \hat\theta(k) - K(k+1)\,\mathbf{x}^T(k+1)\,\hat\theta(k) + K(k+1)\,y(k+1)\\
&= \hat\theta(k) + K(k+1)\left[y(k+1) - \mathbf{x}^T(k+1)\,\hat\theta(k)\right]
\end{aligned}
$$

Appendix 1.1: Proof Step One

  From the batch least-squares solution:
$$\hat\theta(k) = \left(X_k^T X_k\right)^{-1} X_k^T Y_k$$
  The estimate at the next step can be written as:
$$\hat\theta(k+1) = \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1}$$
where
$$X_{k+1} = \begin{bmatrix} X_k \\ \mathbf{x}^T(k+1) \end{bmatrix},\qquad Y_{k+1} = \begin{bmatrix} Y_k \\ y(k+1) \end{bmatrix}$$
  Therefore:
$$\begin{aligned} \hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1}\\ &= \left[X_k^T X_k + \mathbf{x}(k+1)\,\mathbf{x}^T(k+1)\right]^{-1}\left[\mathbf{x}(k+1)\,y(k+1) + X_k^T Y_k\right] \end{aligned}$$

Appendix 1.2: Proof Step Two

  This step relies on a matrix inversion lemma, which we state first.
Lemma. Let $A$, $C$, and $A + BCD$ all be nonsingular square matrices. Then
$$(A + BCD)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}$$
and, in the special case $C = I$,
$$(A + BD)^{-1} = A^{-1} - A^{-1}B\left(I + DA^{-1}B\right)^{-1}DA^{-1}$$
  The lemma is proved by direct multiplication:
$$\begin{aligned} &(A + BCD)\left[A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}\right]\\ &= I + BCDA^{-1} - B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} - BCDA^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}\\ &= I + BCDA^{-1} - B\left(I + CDA^{-1}B\right)\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}\\ &= I + BCDA^{-1} - BC\left(C^{-1} + DA^{-1}B\right)\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1}\\ &= I \end{aligned}$$
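  As a spot check, the $C = I$ special case with a rank-one update $B = \mathbf{b}$, $D = \mathbf{d}^T$ can be verified numerically. The sketch below uses plain Python with hand-rolled $2\times 2$ helpers and hypothetical values for $A$, $\mathbf{b}$, $\mathbf{d}$:

```python
# Check (A + b d^T)^{-1} = A^{-1} - A^{-1} b d^T A^{-1} / (1 + d^T A^{-1} b)
# for a 2x2 matrix A and vectors b, d (the C = I case of the lemma).

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

A = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 2.0]
d = [3.0, 1.0]

# Left-hand side: invert A + b d^T directly.
lhs = inv2([[A[i][j] + b[i] * d[j] for j in range(2)] for i in range(2)])

# Right-hand side: the lemma, using only A^{-1}.
Ainv = inv2(A)
Ab = matvec(Ainv, b)                                   # A^{-1} b
AinvT = [[Ainv[j][i] for j in range(2)] for i in range(2)]
dA = matvec(AinvT, d)                                  # entries of d^T A^{-1}
denom = 1.0 + d[0] * Ab[0] + d[1] * Ab[1]              # 1 + d^T A^{-1} b
rhs = [[Ainv[i][j] - Ab[i] * dA[j] / denom for j in range(2)] for i in range(2)]

err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(err)  # ~0, up to floating-point round-off
```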
  We have already shown:
$$\begin{aligned} \hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1}\\ &= \left[X_k^T X_k + \mathbf{x}(k+1)\,\mathbf{x}^T(k+1)\right]^{-1}\left[\mathbf{x}(k+1)\,y(k+1) + X_k^T Y_k\right] \end{aligned}$$
Applying the lemma to the inverse, with $A = X_k^T X_k$, $B = \mathbf{x}(k+1)$, $D = \mathbf{x}^T(k+1)$:
$$\left[X_k^T X_k + \mathbf{x}(k+1)\,\mathbf{x}^T(k+1)\right]^{-1} = \left(X_k^T X_k\right)^{-1} - \left(X_k^T X_k\right)^{-1}\mathbf{x}(k+1)\left[I + \mathbf{x}^T(k+1)\left(X_k^T X_k\right)^{-1}\mathbf{x}(k+1)\right]^{-1}\mathbf{x}^T(k+1)\left(X_k^T X_k\right)^{-1}$$

  Define $P(k) = \left(X_k^T X_k\right)^{-1}$. Since $\mathbf{x}^T(k+1)\left(X_k^T X_k\right)^{-1}\mathbf{x}(k+1)$ is a $1\times 1$ scalar, the expression above gives:
$$\begin{aligned} P(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} = \left[X_k^T X_k + \mathbf{x}(k+1)\,\mathbf{x}^T(k+1)\right]^{-1}\\ &= P(k) - \frac{P(k)\,\mathbf{x}(k+1)\,\mathbf{x}^T(k+1)\,P(k)}{1 + \mathbf{x}^T(k+1)\,P(k)\,\mathbf{x}(k+1)}\\ &= \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]P(k) \qquad \text{(the third equation of the recursion)} \end{aligned}$$
where $K(k+1)$ is defined as:
$$K(k+1) = \frac{P(k)\,\mathbf{x}(k+1)}{1 + \mathbf{x}^T(k+1)\,P(k)\,\mathbf{x}(k+1)} \qquad \text{(the second equation of the recursion)}$$
  It follows that:
$$\begin{aligned} \hat\theta(k+1) &= \left[X_{k+1}^T X_{k+1}\right]^{-1} X_{k+1}^T Y_{k+1}\\ &= \left[X_k^T X_k + \mathbf{x}(k+1)\,\mathbf{x}^T(k+1)\right]^{-1}\left[\mathbf{x}(k+1)\,y(k+1) + X_k^T Y_k\right]\\ &= \left[I - K(k+1)\,\mathbf{x}^T(k+1)\right]P(k)\left[\mathbf{x}(k+1)\,y(k+1) + X_k^T Y_k\right] \end{aligned}$$
