Kalman Filtering (Part 2: Scalar Form)

This post takes a close look at the fundamentals of Kalman filtering, starting with the notions of orthogonality, uncorrelatedness, and independence. Through a derivation of the Kalman filter in scalar form, it shows how state estimation proceeds via prediction and correction steps, covering the prediction error, the minimum prediction MSE, the Kalman gain, and the final minimum MSE, and thereby exposing the core ideas and computational flow of Kalman filtering for Gaussian random processes.

Before we begin, let us review the connections and differences among orthogonality, uncorrelatedness, and independence.

  • Orthogonality
    Random variables: $\mathcal{R}(x, y) = \mathbb{E}[xy]$ is the correlation; if $\mathcal{R}(x, y) = 0$, then $x$ and $y$ are said to be orthogonal. (Think of the correlation as an inner product. Note: zero correlation means orthogonal, not uncorrelated.)
    Random processes: $\mathcal{R}(X(t), Y(t)) = \mathbb{E}[X(t)Y(t)]$; if $\mathcal{R}(X(t), Y(t)) = 0$, then $X(t)$ and $Y(t)$ are orthogonal.

  • Uncorrelatedness
    Random variables: if $\mathbb{E}[xy] = \mathbb{E}[x]\,\mathbb{E}[y]$, then $x$ and $y$ are uncorrelated.
    Random processes: if $\mathbb{E}[X(t)Y(t)] = \mathbb{E}[X(t)]\,\mathbb{E}[Y(t)]$, then $X(t)$ and $Y(t)$ are uncorrelated.
    Note: for Gaussian random variables (or Gaussian random processes), uncorrelatedness and independence are equivalent.

  • Independence
    If the joint distribution factors as $p(x, y) = p(x) \cdot p(y)$, then $x$ and $y$ are independent.

  • Covariance, uncorrelatedness, and independence
    The covariance is $\text{Cov}(x, y) = \mathbb{E}\left[(x - \mathbb{E}[x])(y - \mathbb{E}[y])\right]$; if $\text{Cov}(x, y) = 0$, then $x$ and $y$ are called uncorrelated. (Uncorrelatedness only says that the two have no linear relationship; it does not rule out other kinds of dependence.)

The relationships among orthogonality, uncorrelatedness, and independence:

  • Independence $\Rightarrow$ uncorrelatedness
  • For Gaussian random variables, independence $\Leftrightarrow$ uncorrelatedness
  • When one of the variables has zero mean, uncorrelatedness $\Leftrightarrow$ orthogonality; otherwise the two notions are unrelated (see the numerical check below)
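As a quick sanity check of these relations, here is a minimal NumPy sketch; the particular distributions are arbitrary choices for illustration. It shows that $y = x^2$ with $x$ standard normal is uncorrelated with $x$ yet fully dependent on it, and that shifting to nonzero means separates uncorrelatedness from orthogonality.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Uncorrelated but NOT independent: y is a deterministic function of x
x = rng.standard_normal(n)                # zero-mean Gaussian
y = x**2
print(np.cov(x, y)[0, 1])                 # ~0: Cov(x, y) = E[x^3] = 0
print(np.mean(x * y))                     # ~0: with E[x] = 0, also orthogonal

# Uncorrelated but NOT orthogonal once the means are nonzero
x2 = x + 1.0                              # E[x2] = 1
y2 = rng.standard_normal(n) + 1.0         # independent of x2, E[y2] = 1
print(np.cov(x2, y2)[0, 1])               # ~0: independent => uncorrelated
print(np.mean(x2 * y2))                   # ~1: E[x2 y2] = E[x2]E[y2] != 0
```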

Kalman Filtering: Scalar Form

Consider the scalar state equation and the scalar observation equation:
$$s[n] = a s[n-1] + u[n] \tag{1}$$

$$x[n] = s[n] + w[n] \tag{2}$$

Here we assume $s[-1] \sim \mathcal{N}(\mu_s, \sigma_s^2)$; $u[n]$ is zero-mean Gaussian noise with $\mathbb{E}[u^2[n]] = \sigma_u^2$ and the $\{u[n]\}$ mutually independent; and $w[n]$ is zero-mean Gaussian noise with $\mathbb{E}[w^2[n]] = \sigma_n^2$ and the $\{w[n]\}$ mutually independent. To simplify the derivation we assume $\mu_s = 0$. We wish to estimate $s[n]$ from the observations $\{x[0], x[1], \cdots, x[n]\}$, and we write $\hat{s}[n|m]$ for the estimator of $s[n]$ based on $\{x[0], x[1], \cdots, x[m]\}$. Our criterion of optimality is the minimum Bayesian MSE,
$$\mathbb{E}\left[\left(s[n] - \hat{s}[n|n]\right)^2\right]$$

where the expectation is taken with respect to the joint PDF $p(x[0], x[1], \cdots, x[n], s[n])$. (This is the point that distinguishes it from the classical MSE: the classical MSE treats $s[n]$ as an unknown deterministic parameter, so its expectation is taken with respect to $p(x[0], x[1], \cdots, x[n]; s[n])$, whereas the Bayesian MSE treats $s[n]$ as a random variable.)
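To make the setup concrete, here is a minimal NumPy sketch that simulates the model (1)-(2); the parameter values are illustrative assumptions, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters: a, sigma_u, sigma_n, and the prior on s[-1]
a, sigma_u, sigma_n = 0.9, 1.0, 0.5
mu_s, sigma_s = 0.0, 1.0
N = 200

s = np.empty(N)
x = np.empty(N)
s_prev = rng.normal(mu_s, sigma_s)                 # s[-1] ~ N(mu_s, sigma_s^2)
for n in range(N):
    s[n] = a * s_prev + rng.normal(0.0, sigma_u)   # state:       s[n] = a s[n-1] + u[n]
    x[n] = s[n] + rng.normal(0.0, sigma_n)         # observation: x[n] = s[n] + w[n]
    s_prev = s[n]
```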

The MMSE estimator is the posterior mean:
$$\hat{s}[n|n] = \mathbb{E}\left[s[n] \mid x[0], x[1], \cdots, x[n]\right] \tag{3}$$

Since $\theta = s[n]$ and $\boldsymbol{x} = [x[0], x[1], \cdots, x[n]]^T$ are jointly Gaussian, we have
$$\hat{s}[n|n] = \boldsymbol{C}_{\theta x} \boldsymbol{C}_{xx}^{-1} \boldsymbol{x} \tag{4}$$

Because all of our statistical assumptions are Gaussian, the MMSE estimator is linear and therefore coincides with the LMMSE estimator.

For the MMSE estimator of $\theta$, we state two properties:

  • Property 1: given two uncorrelated data vectors $\boldsymbol{x}_1, \boldsymbol{x}_2$ that are jointly Gaussian with $\theta$,
    $$\hat{\theta} = \mathbb{E}\left[\theta \mid \boldsymbol{x}_1, \boldsymbol{x}_2\right] = \mathbb{E}\left[\theta \mid \boldsymbol{x}_1\right] + \mathbb{E}\left[\theta \mid \boldsymbol{x}_2\right]$$
    We give two justifications of this property (a numerical check follows this list):
    Explanation 1: since $\boldsymbol{x} = [\boldsymbol{x}_1^T, \boldsymbol{x}_2^T]^T$ is Gaussian,
    $$\begin{aligned} \hat{\theta} = \mathbb{E}[\theta|\boldsymbol{x}] &= \mathbb{E}[\theta] + \boldsymbol{C}_{\theta x} \boldsymbol{C}_{xx}^{-1} \left(\boldsymbol{x} - \mathbb{E}[\boldsymbol{x}]\right) \\ &= \boldsymbol{C}_{\theta x} \boldsymbol{C}_{xx}^{-1} \boldsymbol{x} \end{aligned}$$
    where we assume $\mathbb{E}[\theta] = 0$ and $\mathbb{E}[\boldsymbol{x}] = \boldsymbol{0}$; this assumption is harmless, since we can always subtract the means before processing.
    Because $\boldsymbol{x}_1, \boldsymbol{x}_2$ are uncorrelated and $\mathbb{E}[\boldsymbol{x}_1] = \mathbb{E}[\boldsymbol{x}_2] = \boldsymbol{0}$, we have $\mathbb{E}[\boldsymbol{x}_1 \boldsymbol{x}_2^T] = \mathbb{E}[\boldsymbol{x}_1]\,\mathbb{E}[\boldsymbol{x}_2^T] = \boldsymbol{0}$, and therefore
    $$\boldsymbol{C}_{xx}^{-1} = \begin{bmatrix} \boldsymbol{C}_{x_1 x_1} & \boldsymbol{C}_{x_1 x_2} \\ \boldsymbol{C}_{x_2 x_1} & \boldsymbol{C}_{x_2 x_2} \end{bmatrix}^{-1} = \begin{bmatrix} \boldsymbol{C}_{x_1 x_1} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{C}_{x_2 x_2} \end{bmatrix}^{-1} = \begin{bmatrix} \boldsymbol{C}_{x_1 x_1}^{-1} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{C}_{x_2 x_2}^{-1} \end{bmatrix}$$
    Moreover,
    $$\boldsymbol{C}_{\theta x} = \mathbb{E}\left[\theta \begin{bmatrix} \boldsymbol{x}_1 \\ \boldsymbol{x}_2 \end{bmatrix}^T\right] = \begin{bmatrix} \boldsymbol{C}_{\theta x_1} & \boldsymbol{C}_{\theta x_2} \end{bmatrix}$$
    Hence,
    $$\begin{aligned} \hat{\theta} &= \begin{bmatrix} \boldsymbol{C}_{\theta x_1} & \boldsymbol{C}_{\theta x_2} \end{bmatrix} \begin{bmatrix} \boldsymbol{C}_{x_1 x_1}^{-1} & \boldsymbol{0} \\ \boldsymbol{0} & \boldsymbol{C}_{x_2 x_2}^{-1} \end{bmatrix} \begin{bmatrix} \boldsymbol{x}_1 \\ \boldsymbol{x}_2 \end{bmatrix} \\ &= \boldsymbol{C}_{\theta x_1} \boldsymbol{C}_{x_1 x_1}^{-1} \boldsymbol{x}_1 + \boldsymbol{C}_{\theta x_2} \boldsymbol{C}_{x_2 x_2}^{-1} \boldsymbol{x}_2 \\ &= \mathbb{E}\left[\theta \mid \boldsymbol{x}_1\right] + \mathbb{E}\left[\theta \mid \boldsymbol{x}_2\right] \end{aligned}$$
    Explanation 2: the linear-space view is perhaps more intuitive. Since $\mathbb{E}[\boldsymbol{x}_1 \boldsymbol{x}_2^T] = \mathbb{E}[\boldsymbol{x}_1]\,\mathbb{E}[\boldsymbol{x}_2^T] = \boldsymbol{0}$, the vectors $\boldsymbol{x}_1$ and $\boldsymbol{x}_2$ are mutually orthogonal, so the estimate decomposes into the sum of the projections onto each of them.

  • Property 2: the MMSE estimator is additive: if $\theta = \theta_1 + \theta_2$, then
    $$\hat{\theta} = \mathbb{E}[\theta|\boldsymbol{x}] = \mathbb{E}[\theta_1 + \theta_2|\boldsymbol{x}] = \mathbb{E}[\theta_1|\boldsymbol{x}] + \mathbb{E}[\theta_2|\boldsymbol{x}]$$
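Here is a minimal Monte Carlo check of Property 1 (a sketch; the zero-mean jointly Gaussian model below is an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Zero-mean, jointly Gaussian; x1 and x2 are uncorrelated (here: independent)
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)
theta = 2.0 * x1 - x2 + 0.5 * rng.standard_normal(n)

# Joint estimate: theta_hat = C_{theta x} C_{xx}^{-1} x
X = np.stack([x1, x2])                  # 2 x n data matrix
C_tx = (theta @ X.T) / n                # sample C_{theta x}
C_xx = (X @ X.T) / n                    # sample C_{xx} (nearly diagonal)
theta_hat_joint = np.linalg.solve(C_xx, C_tx) @ X

# Sum of the marginal estimates E[theta|x1] + E[theta|x2]
theta_hat_sum = (C_tx[0] / C_xx[0, 0]) * x1 + (C_tx[1] / C_xx[1, 1]) * x2

# ~0 up to sampling error in the covariance estimates
print(np.max(np.abs(theta_hat_joint - theta_hat_sum)))
```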

With these two properties in hand, let $\boldsymbol{X}[n] = [x[0], x[1], \cdots, x[n]]^T$, and let $\tilde{x}[n]$ denote the innovation (the part of $x[n]$ that is uncorrelated with the previous samples $\{x[0], \cdots, x[n-1]\}$):
$$\tilde{x}[n] = x[n] - \hat{x}[n|n-1] \tag{5}$$

Let me stress why $\tilde{x}[n]$ is uncorrelated with $\{x[0], \cdots, x[n-1]\}$: $\hat{x}[n|n-1]$ is the MMSE estimate of $x[n]$ based on the data $\{x[0], \cdots, x[n-1]\}$, and by the orthogonality principle the estimation error $\tilde{x}[n]$ is orthogonal to every linear combination of the observed data (in particular, to the data themselves), so $\tilde{x}[n]$ is uncorrelated with $\{x[0], \cdots, x[n-1]\}$. In fact, the pair $\left(\boldsymbol{X}[n-1], \tilde{x}[n]\right)$ is equivalent to the set $\{x[0], \cdots, x[n-1], x[n]\}$, because $x[n]$ can be recovered as:
$$\begin{aligned} x[n] &= \tilde{x}[n] + \hat{x}[n|n-1] \\ &= \tilde{x}[n] + \sum_{k=0}^{n-1} a_k x[k] \end{aligned}$$

where the $a_k$ are the weights of the (linear) MMSE estimator. We can therefore rewrite (3) as:
$$\hat{s}[n|n] = \mathbb{E}\left[s[n] \mid \boldsymbol{X}[n-1], \tilde{x}[n]\right]$$

Since $\boldsymbol{X}[n-1]$ and $\tilde{x}[n]$ are uncorrelated, Property 1 gives:
$$\hat{s}[n|n] = \mathbb{E}\left[s[n] \mid \boldsymbol{X}[n-1]\right] + \mathbb{E}\left[s[n] \mid \tilde{x}[n]\right]$$

Here $\mathbb{E}[s[n]|\boldsymbol{X}[n-1]]$ is the prediction of $s[n]$ from the previous observations; denote it $\hat{s}[n|n-1]$. Using (1) and Property 2, we obtain:
$$\begin{aligned} \hat{s}[n|n-1] &= \mathbb{E}\left[s[n] \mid \boldsymbol{X}[n-1]\right] \\ &= \mathbb{E}\left[a s[n-1] + u[n] \mid \boldsymbol{X}[n-1]\right] \\ &= a\,\mathbb{E}\left[s[n-1] \mid \boldsymbol{X}[n-1]\right] \\ &= a\,\hat{s}[n-1|n-1] \end{aligned}$$

Here we used
$$\mathbb{E}\left[u[n] \mid \boldsymbol{X}[n-1]\right] = \mathbb{E}[u[n]] = 0$$

which holds because $u[n]$ is independent of $\{x[0], \cdots, x[n-1]\}$. (This independence comes from two facts: first, $u[n]$ is independent of all the $w[n]$; second, $s[0], s[1], \cdots, s[n-1]$ are linear combinations of the random variables $\{u[0], u[1], \cdots, u[n-1], s[-1]\}$, all of which are independent of $u[n]$.) We now have
$$\hat{s}[n|n] = \hat{s}[n|n-1] + \mathbb{E}\left[s[n] \mid \tilde{x}[n]\right] \tag{6}$$

where
$$\hat{s}[n|n-1] = a\,\hat{s}[n-1|n-1]$$

Note that $\mathbb{E}\left[s[n] \mid \tilde{x}[n]\right]$ is the MMSE estimate of $s[n]$ based on $\tilde{x}[n]$; this estimator is linear, so it can be written as:
$$\begin{aligned} \mathbb{E}\left[s[n] \mid \tilde{x}[n]\right] &= K[n]\,\tilde{x}[n] \\ &= K[n]\left(x[n] - \hat{x}[n|n-1]\right) \end{aligned}$$

(Because $s[n]$ has zero mean, there is no intercept term here.) The gain is
$$K[n] = \frac{\mathbb{E}\left[s[n]\,\tilde{x}[n]\right]}{\mathbb{E}\left[\tilde{x}^2[n]\right]} \tag{7}$$

This is just the MMSE estimator for jointly Gaussian $\theta, x$ specialized to scalars,
$$\hat{\theta} = C_{\theta x} C_{xx}^{-1} x = \frac{\mathbb{E}[\theta x]}{\mathbb{E}[x^2]}\,x$$

with $\theta = s[n]$ and $x = \tilde{x}[n]$.

From the scalar observation equation $x[n] = s[n] + w[n]$ and Property 2, we get
$$\begin{aligned} \hat{x}[n|n-1] &= \hat{s}[n|n-1] + \hat{w}[n|n-1] \\ &= \hat{s}[n|n-1] \end{aligned}$$

since $w[n]$ is zero-mean and independent of the past data, so $\hat{w}[n|n-1] = 0$.

Substituting into (6),
$$\hat{s}[n|n] = \hat{s}[n|n-1] + K[n]\left(x[n] - \hat{s}[n|n-1]\right) \tag{8}$$

where
$$\hat{s}[n|n-1] = a\,\hat{s}[n-1|n-1] \tag{9}$$

Only the gain factor $K[n]$ remains to be determined. From (7),
$$K[n] = \frac{\mathbb{E}\left[s[n]\left(x[n] - \hat{s}[n|n-1]\right)\right]}{\mathbb{E}\left[\left(x[n] - \hat{s}[n|n-1]\right)^2\right]} \tag{10}$$

To simplify $K[n]$ further, we first establish two facts:

    1. $\mathbb{E}\left[s[n]\left(x[n] - \hat{s}[n|n-1]\right)\right] = \mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)\left(x[n] - \hat{s}[n|n-1]\right)\right]$
    2. $\mathbb{E}\left[w[n]\left(s[n] - \hat{s}[n|n-1]\right)\right] = 0$

The first fact holds because
$$\begin{aligned} \tilde{x}[n] &= x[n] - \hat{x}[n|n-1] \\ &= x[n] - \hat{s}[n|n-1] \end{aligned} \tag{11}$$

is uncorrelated with the previous observations $\{x[0], \cdots, x[n-1]\}$, and hence also with $\hat{s}[n|n-1]$ (a linear combination of $\{x[0], \cdots, x[n-1]\}$); therefore $\mathbb{E}\left[\hat{s}[n|n-1]\left(x[n] - \hat{s}[n|n-1]\right)\right] = 0$, which gives fact 1. The second fact is straightforward, since $w[n]$ is zero-mean and independent of both $s[n]$ and the past data. Substituting these two facts into (10), the gain becomes:
$$\begin{aligned} K[n] &= \frac{\mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)\left(x[n] - \hat{s}[n|n-1]\right)\right]}{\mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1] + w[n]\right)^2\right]} \\ &= \frac{\mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)^2\right]}{\sigma_n^2 + \mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)^2\right]} \end{aligned} \tag{12}$$

The numerator reduces to a squared term because $x[n] = s[n] + w[n]$ and $w[n]$ is independent of both $s[n]$ and $\hat{s}[n|n-1]$. Note also that the numerator $\mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)^2\right]$ is precisely the minimum MSE of the prediction based on the previous observations; denote it $M[n|n-1]$. Then
$$K[n] = \frac{M[n|n-1]}{\sigma_n^2 + M[n|n-1]} \tag{13}$$

Since $s[n] = a s[n-1] + u[n]$ and $\hat{s}[n|n-1] = a\,\hat{s}[n-1|n-1]$, we have
$$\begin{aligned} M[n|n-1] &= \mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)^2\right] \\ &= \mathbb{E}\left[\left(a s[n-1] + u[n] - \hat{s}[n|n-1]\right)^2\right] \\ &= \mathbb{E}\left[\left(a\left(s[n-1] - \hat{s}[n-1|n-1]\right) + u[n]\right)^2\right] \end{aligned}$$

It is easy to see that
$$\mathbb{E}\left[\left(s[n-1] - \hat{s}[n-1|n-1]\right)u[n]\right] = 0$$

and therefore
$$M[n|n-1] = a^2 M[n-1|n-1] + \sigma_u^2$$

Finally, we need a recursion for $M[n|n]$. Using (8), $\hat{s}[n|n] = \hat{s}[n|n-1] + K[n]\left(x[n] - \hat{s}[n|n-1]\right)$, we have
$$\begin{aligned} M[n|n] &= \mathbb{E}\left[\left(s[n] - \hat{s}[n|n]\right)^2\right] \\ &= \mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1] - K[n]\left(x[n] - \hat{s}[n|n-1]\right)\right)^2\right] \\ &= \mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)^2\right] - 2K[n]\,\mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)\left(x[n] - \hat{s}[n|n-1]\right)\right] \\ &\quad + K^2[n]\,\mathbb{E}\left[\left(x[n] - \hat{s}[n|n-1]\right)^2\right] \end{aligned}$$

Note that the expectation in the second term is the numerator of $K[n]$ in (12), and the expectation in the last term is its denominator, which gives
$$\mathbb{E}\left[\left(s[n] - \hat{s}[n|n-1]\right)\left(x[n] - \hat{s}[n|n-1]\right)\right] = K[n]\left(M[n|n-1] + \sigma_n^2\right)$$

$$\mathbb{E}\left[\left(x[n] - \hat{s}[n|n-1]\right)^2\right] = \frac{M[n|n-1]}{K[n]}$$

Therefore,
$$\begin{aligned} M[n|n] &= M[n|n-1] - 2K^2[n]\left(M[n|n-1] + \sigma_n^2\right) + K[n]\,M[n|n-1] \\ &= M[n|n-1] - 2K[n]\,M[n|n-1] + K[n]\,M[n|n-1] \\ &= \left(1 - K[n]\right)M[n|n-1] \end{aligned}$$

This completes the derivation of the scalar Kalman filter. In summary, for all $n \geq 0$:
Prediction:
$$\hat{s}[n|n-1] = a\,\hat{s}[n-1|n-1] \tag{14}$$

Minimum Prediction MSE:
$$M[n|n-1] = a^2 M[n-1|n-1] + \sigma_u^2 \tag{15}$$

Kalman Gain:
$$K[n] = \frac{M[n|n-1]}{\sigma_n^2 + M[n|n-1]} \tag{16}$$

Correction:
$$\hat{s}[n|n] = \hat{s}[n|n-1] + K[n]\left(x[n] - \hat{s}[n|n-1]\right) \tag{17}$$

Minimum MSE:
$$M[n|n] = \left(1 - K[n]\right)M[n|n-1] \tag{18}$$

Looking back over the derivation, the zero-mean assumptions (including $\mu_s = 0$ and hence $\mathbb{E}[s[n]] = 0$) were made so that we could invoke the orthogonality principle; in fact, even when $\mu_s \neq 0$, the resulting equations are exactly (14)-(18). For initialization we use $\hat{s}[-1|-1] = \mathbb{E}[s[-1]] = \mu_s$ and $M[-1|-1] = \sigma_s^2$, since this is all the information we have before any data are observed. In addition, the correction term can be viewed as an estimate $\hat{u}[n]$ of $u[n]$:
$$\hat{s}[n|n] = a\,\hat{s}[n-1|n-1] + \hat{u}[n]$$

where $\hat{u}[n] = K[n]\left(x[n] - \hat{s}[n|n-1]\right)$. In this sense the correction acts as an estimate of the driving noise $u[n]$, which makes it reasonable to regard $\hat{s}[n|n] \approx s[n]$.
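To tie equations (14)-(18) together, here is a minimal NumPy sketch of the scalar Kalman filter; the simulated data and parameter values are illustrative assumptions, not from the text:

```python
import numpy as np

def scalar_kalman(x, a, sigma_u2, sigma_n2, mu_s, sigma_s2):
    """Scalar Kalman filter implementing equations (14)-(18)."""
    s_hat, M = mu_s, sigma_s2    # init: s_hat[-1|-1] = mu_s, M[-1|-1] = sigma_s^2
    s_est, M_est = [], []
    for xn in x:
        s_pred = a * s_hat                       # (14) prediction
        M_pred = a**2 * M + sigma_u2             # (15) minimum prediction MSE
        K = M_pred / (sigma_n2 + M_pred)         # (16) Kalman gain
        s_hat = s_pred + K * (xn - s_pred)       # (17) correction
        M = (1.0 - K) * M_pred                   # (18) minimum MSE
        s_est.append(s_hat)
        M_est.append(M)
    return np.array(s_est), np.array(M_est)

# Demo on data simulated from the model (1)-(2)
rng = np.random.default_rng(3)
a, sigma_u, sigma_n, N = 0.9, 1.0, 0.5, 5000
s = np.empty(N); x = np.empty(N)
s_prev = rng.normal(0.0, 1.0)                    # s[-1] ~ N(0, 1)
for n in range(N):
    s[n] = a * s_prev + rng.normal(0.0, sigma_u)
    x[n] = s[n] + rng.normal(0.0, sigma_n)
    s_prev = s[n]

s_est, M_est = scalar_kalman(x, a, sigma_u**2, sigma_n**2, mu_s=0.0, sigma_s2=1.0)
print(np.mean((s - s_est)**2), M_est[-1])        # empirical MSE vs. M[n|n]
```

The empirical squared error should match the steady-state $M[n|n]$ up to sampling fluctuation, confirming that (18) tracks the actual estimation accuracy.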
