The discrete form of the Wiener filtering problem is to design a filter to recover a signal $d(n)$ from noisy observations
$$
x(n)=d(n)+v(n)
$$
Assuming that both $d(n)$ and $v(n)$ are wide-sense stationary random processes, Wiener considered the problem of designing the filter that would produce the minimum mean-square error estimate of $d(n)$. Thus, with
$$
\xi=E\left\{|e(n)|^2\right\}=E\left\{|d(n)-\hat d (n)|^2\right\}
$$
the problem is to find the filter that minimizes $\xi$.
Depending upon how the signals $x(n)$ and $d(n)$ are related to each other, a number of different and important problems may be cast into a Wiener filtering framework. Some of the problems that will be considered in this chapter include:
- Filtering: estimate $d(n)$ from $x(n)=d(n)+v(n)$.
- Prediction: estimate $x(n+\alpha)$ from $x(n), x(n-1), x(n-2), \ldots$
- Deconvolution: estimate $d(n)$ from $x(n)=g(n)*d(n)+v(n)$.
- Noise cancellation: estimate $v_{1}(n)$ from $v_{2}(n)$ and subtract it from $x(n)=d(n)+v_{1}(n)$.
The FIR Wiener Filter
In this section we consider the design of an FIR Wiener filter that produces the minimum mean-square estimate of a given process $d(n)$ by filtering a set of observations of a statistically related process $x(n)$.
Note that we have not introduced any assumption about $x(n)$; it can be any process statistically related to $d(n)$.
It is assumed that $x(n)$ and $d(n)$ are jointly wide-sense stationary with known autocorrelations, $r_x(k)$ and $r_d(k)$, and known cross-correlation $r_{dx}(k)$. Denoting the unit sample response of the Wiener filter by $w(n)$, and assuming a $(p-1)$st-order filter, the system function is
$$
W(z)=\sum_{n=0}^{p-1} w(n)z^{-n}
$$
With $x(n)$ the input to the filter, the output, which we denote by $\hat d(n)$, is the convolution of $w(n)$ with $x(n)$,
$$
\hat d(n)=\sum_{l=0}^{p-1}w(l)x(n-l) \tag{FWF.1}
$$
The Wiener filter design problem requires that we find the filter coefficients, $w(k)$, that minimize the mean-square error
$$
\xi=E\left\{|e(n)|^2\right\}=E\left\{|d(n)-\hat d (n)|^2\right\} \tag{FWF.2}
$$
In order for a set of filter coefficients to minimize $\xi$ it is necessary and sufficient that the derivative of $\xi$ with respect to $w^*(k)$ be equal to zero for $k=0,1,\cdots,p-1$,
$$
\frac{\partial \xi}{\partial w^*(k)}=\frac{\partial}{\partial w^*(k)}E\{e(n)e^*(n)\}=E\left\{e(n)\frac{\partial e^*(n)}{\partial w^*(k)}\right\}=0 \tag{FWF.3}
$$
With
$$
e(n)=d(n)-\sum_{l=0}^{p-1} w(l)x(n-l) \tag{FWF.4}
$$
it follows that
$$
\frac{\partial e^*(n)}{\partial w^*(k)}=-x^*(n-k)
$$
and (FWF.3) becomes
$$
E\{e(n)x^*(n-k)\}=0;\quad k=0,1,\cdots,p-1 \tag{FWF.5}
$$
which is known as the orthogonality principle or the projection theorem.
Substituting (FWF.4) into (FWF.5) we have
$$
E\{d(n)x^*(n-k)\}-\sum_{l=0}^{p-1}w(l)E\{x(n-l)x^*(n-k)\}=0 \tag{FWF.6}
$$
Finally, since $x(n)$ and $d(n)$ are jointly WSS, then
$$
\sum_{l=0}^{p-1} w(l)r_x(k-l)=r_{dx}(k);\quad k=0,1,\cdots,p-1 \tag{FWF.7}
$$
In matrix form, using the fact that the autocorrelation sequence is conjugate symmetric, $r_x(k)=r_x^*(-k)$, (FWF.7) becomes
$$
\left[\begin{array}{cccc} r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ r_{x}(2) & r_{x}(1) & \cdots & r_{x}^{*}(p-3) \\ \vdots & \vdots & & \vdots \\ r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} w(0) \\ w(1) \\ w(2) \\ \vdots \\ w(p-1) \end{array}\right]=\left[\begin{array}{c} r_{d x}(0) \\ r_{d x}(1) \\ r_{d x}(2) \\ \vdots \\ r_{d x}(p-1) \end{array}\right] \tag{FWF.8}
$$
which is the matrix form of the Wiener-Hopf equations. It may be written more concisely as
$$
\mathbf R_x \mathbf w=\mathbf r_{dx} \tag{FWF.9}
$$
where $\mathbf R_x=E\{\mathbf x^* \mathbf x^T\}$ and $\mathbf x=[x(n),\cdots,x(n-p+1)]^T$.
The minimum mean-square error in the estimate of $d(n)$ may be evaluated from (FWF.2) as follows:
$$
\begin{aligned} \xi_{\mathrm{min}}&=E\left\{|e(n)|^2\right\}=E\left\{e(n)\left[d(n)-\sum_{l=0}^{p-1} w(l)x(n-l) \right]^*\right\}\\ &\stackrel{a}{=}E\{e(n)d^*(n)\}=E\left\{\left[d(n)-\sum_{l=0}^{p-1} w(l)x(n-l) \right]d^*(n)\right\}\\ &=r_d(0)-\sum_{l=0}^{p-1}w(l)r^*_{dx}(l) \end{aligned} \tag{FWF.10}
$$
where $\stackrel{a}{=}$ follows from (FWF.5), i.e., the orthogonality theorem. Or, using vector notation,
$$
\xi_{\mathrm{min}}=r_d(0)-\mathbf r_{dx}^H \mathbf w=r_d(0)-\mathbf r_{dx}^H \mathbf R_x^{-1} \mathbf r_{dx} \tag{FWF.11}
$$
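As a concrete illustration, the following sketch (not from the original notes; the function and variable names are my own) solves the Wiener-Hopf equations (FWF.9) for real-valued data and evaluates the minimum error (FWF.11), exploiting the Toeplitz structure of $\mathbf R_x$ via `scipy.linalg.solve_toeplitz`.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def fir_wiener(r_x, r_dx, r_d0):
    """Design a (p-1)st-order FIR Wiener filter from known statistics.

    r_x  : r_x(0), ..., r_x(p-1)   -- first column of the Toeplitz matrix R_x
    r_dx : r_dx(0), ..., r_dx(p-1) -- cross-correlation vector
    r_d0 : r_d(0)                  -- power of the desired signal
    Returns (w, xi_min): the filter solving (FWF.9) and the minimum MSE (FWF.11),
    assuming real-valued, jointly WSS processes.
    """
    r_x, r_dx = np.asarray(r_x, float), np.asarray(r_dx, float)
    w = solve_toeplitz(r_x, r_dx)      # solves R_x w = r_dx
    xi_min = r_d0 - r_dx @ w           # xi_min = r_d(0) - r_dx^T w
    return w, xi_min
```

The special cases below (filtering, prediction, deconvolution, noise cancellation) differ only in how $\mathbf R_x$ and $\mathbf r_{dx}$ are formed.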
Filtering
In the filtering problem, a signal $d(n)$ is to be estimated from a noise-corrupted observation
$$
x(n)=d(n)+v(n) \quad \text{or} \quad \mathbf x=\mathbf d+\mathbf v
$$
where $\mathbf x=[x(n),\cdots,x(n-p+1)]^T$, $\mathbf d=[d(n),\cdots,d(n-p+1)]^T$, and $\mathbf v=[v(n),\cdots,v(n-p+1)]^T$.
It will be assumed that the noise $v(n)$ has zero mean and that it is uncorrelated with $d(n)$. Therefore, $E\{d(n)v^*(n-k)\} = 0$, and $\mathbf R_x$ and $\mathbf r_{dx}$ become
$$
\begin{aligned} \mathbf R_x&=E\{\mathbf x^* \mathbf x^T\}=E\{(\mathbf d+\mathbf v)^* (\mathbf d+\mathbf v)^T\}=E\{\mathbf d^* \mathbf d^T\}+E\{\mathbf v^* \mathbf v^T\}=\mathbf R_d+\mathbf R_v\\ \mathbf r_{dx}&=E\{d(n)\mathbf x^*\}=E\{d(n)\mathbf d^*\}=\mathbf r_d \end{aligned}
$$
The Wiener-Hopf equations then become
$$
(\mathbf R_d+\mathbf R_v)\mathbf w=\mathbf r_d \tag{FWF.12}
$$
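For example, a minimal sketch (the scenario is assumed for illustration: a desired signal with $r_d(k)=0.8^{|k|}$ observed in unit-variance white noise) forms $\mathbf R_d+\mathbf R_v$ and $\mathbf r_d$ and solves (FWF.12):

```python
import numpy as np
from scipy.linalg import toeplitz

p = 4
r_d = 0.8 ** np.arange(p)           # assumed autocorrelation of d(n)
R_d = toeplitz(r_d)                 # p x p autocorrelation matrix of d(n)
R_v = 1.0 * np.eye(p)               # white noise: R_v = sigma_v^2 I
w = np.linalg.solve(R_d + R_v, r_d) # FWF.12: (R_d + R_v) w = r_d
xi_min = r_d[0] - r_d @ w           # FWF.11 with r_dx = r_d
print(w, xi_min)
```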
Prediction
Noise-free observations
We consider the following data model ($\alpha$-step prediction):
$$
d(n)=x(n+\alpha), \qquad \hat x(n+\alpha)=\sum_{k=0}^{p-1}w(k)x(n-k)
$$
This results in the following expression for $\mathbf r_{dx}$:
$$
\mathbf r_{dx}=E\{d(n)\mathbf x^*\}=E\{x(n+\alpha)\mathbf x^*\}\triangleq \mathbf r_\alpha
$$
The Wiener-Hopf equations then become
$$
\underbrace{\left[\begin{array}{cccc} r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]}_{\mathbf{R}_{x}} \underbrace{\left[\begin{array}{c} w(0) \\ w(1) \\ \vdots \\ w(p-1) \end{array}\right]}_{\mathbf{w}}= \underbrace{\left[\begin{array}{c} r_{x}(\alpha) \\ r_{x}(\alpha+1) \\ \vdots \\ r_{x}(\alpha+p-1) \end{array}\right]}_{\mathbf{r}_{\alpha}} \tag{FWF.13}
$$
For $\alpha=1$, this is similar to all-pole modeling using Prony's method, the autocorrelation method, or the Yule-Walker method (the minus sign simply changes the sign of the coefficients).
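A minimal sketch of one-step prediction ($\alpha=1$), under an assumed autocorrelation $r_x(k)=0.9^{|k|}$; the right-hand side of (FWF.13) is simply the autocorrelation sequence shifted by $\alpha$:

```python
import numpy as np
from scipy.linalg import toeplitz

p, alpha = 3, 1
r_x = 0.9 ** np.arange(p + alpha)        # assumed r_x(0), ..., r_x(p+alpha-1)
R_x = toeplitz(r_x[:p])                  # p x p Toeplitz autocorrelation matrix
r_alpha = r_x[alpha:alpha + p]           # r_x(alpha), ..., r_x(alpha+p-1)
w = np.linalg.solve(R_x, r_alpha)        # FWF.13: R_x w = r_alpha
# x_hat(n+alpha) = w[0]*x(n) + w[1]*x(n-1) + ... + w[p-1]*x(n-p+1)
print(w)
```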
Observations with noise
We consider the following data model ($\alpha$-step prediction):
$$
y(n)=x(n)+v(n) \quad \text{or} \quad \mathbf{y}=\mathbf{x}+\mathbf{v}, \qquad d(n)=x(n+\alpha)
$$
where $\mathbf x=[x(n),\cdots,x(n-p+1)]^T$, $\mathbf{y}=[y(n), \ldots, y(n-p+1)]^{T}$, and $\mathbf{v}=[v(n), \ldots, v(n-p+1)]^{T}$.
This results in the following expressions for $\mathbf{R}_{y}$ and $\mathbf{r}_{dy}$:
$$
\begin{array}{l} \mathbf{R}_{y}=E\left\{\mathbf{y}^* \mathbf{y}^{T}\right\}=E\left\{(\mathbf{x}+\mathbf{v})^* (\mathbf{x}+\mathbf{v})^{T}\right\}=E\left\{\mathbf{x}^* \mathbf{x}^{T}\right\}+E\left\{\mathbf{v}^*\mathbf{v}^{T}\right\}=\mathbf{R}_{x}+\mathbf{R}_{v} \\ \mathbf{r}_{dy}=E\left\{d(n) \mathbf{y}^{*}\right\}=E\left\{x(n+\alpha)\left(\mathbf{x}^{*}+\mathbf{v}^{*}\right)\right\}=E\left\{x(n+\alpha) \mathbf{x}^{*}\right\}=\mathbf{r}_{\alpha} \end{array}
$$
The Wiener-Hopf equations then become
$$
\left(\mathbf{R}_{x}+\mathbf{R}_{v}\right) \mathbf{w}=\mathbf{r}_{\alpha} \tag{FWF.14}
$$
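Compared with the noise-free case, only the matrix on the left-hand side changes; a corresponding sketch (same assumed $r_x$ as above, unit-variance white observation noise):

```python
import numpy as np
from scipy.linalg import toeplitz

p, alpha = 3, 1
r_x = 0.9 ** np.arange(p + alpha)            # assumed signal autocorrelation
R_x, R_v = toeplitz(r_x[:p]), np.eye(p)      # R_v = sigma_v^2 I for white noise
r_alpha = r_x[alpha:alpha + p]
w = np.linalg.solve(R_x + R_v, r_alpha)      # FWF.14: (R_x + R_v) w = r_alpha
print(w)
```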
Deconvolution
We consider a noisy convolutive model, with an FIR filter $g(n)$ of order $L$:
$$
x(n)=g(n) * d(n)+v(n) \quad \text{or} \quad \mathbf{x}=\mathbf{G} \mathbf{d}+\mathbf{v}
$$
where $\mathbf{d}_{(p+L)\times 1}=[d(n), \ldots, d(n-p+1), \ldots, d(n-L-p+1)]^{T}$, $\mathbf{v}_{p\times 1}=[v(n), \ldots, v(n-p+1)]^{T}$, and
$$
\mathbf{G}_{p\times (p+L)}=\left[\begin{array}{ccccc} g(0) & \cdots & g(L) & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & g(0) & \cdots & g(L) \end{array}\right].
$$
This results in the following expressions for $\mathbf{R}_{x}$ and $\mathbf{r}_{dx}$:
$$
\begin{array}{l} \mathbf{R}_{x}=E\left\{\mathbf{x}^{*} \mathbf{x}^{T}\right\}=\mathbf{G}^{*} E\left\{\mathbf{d}^{*} \mathbf{d}^{T}\right\} \mathbf{G}^{T}+E\left\{\mathbf{v}^{*} \mathbf{v}^{T}\right\}=\mathbf{G}^{*} \mathbf{R}_{d} \mathbf{G}^{T}+\mathbf{R}_{v} \\ \mathbf{r}_{dx}=E\left\{d(n) \mathbf{x}^{*}\right\}=\mathbf{G}^{*} E\left\{d(n) \mathbf{d}^{*}\right\}=\mathbf{G}^{*} \mathbf{r}_{d} \end{array}
$$
The Wiener-Hopf equations then become
$$
\left(\mathbf{G}^{*} \mathbf{R}_{d} \mathbf{G}^{T}+\mathbf{R}_{v}\right) \mathbf{w}=\mathbf{G}^{*} \mathbf{r}_{d} \tag{FWF.15}
$$
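A sketch of how the convolution matrix $\mathbf G$ and the system (FWF.15) can be assembled for real-valued data (the channel $g$, the autocorrelation of $d(n)$, and the noise variance are assumed for illustration):

```python
import numpy as np
from scipy.linalg import toeplitz

p, L = 4, 2
g = np.array([1.0, 0.5, 0.25])                # assumed FIR channel g(0), ..., g(L)
r_d = 0.8 ** np.arange(p + L)                 # assumed autocorrelation of d(n)
sigma_v2 = 0.1                                # assumed white-noise variance

# Convolution matrix G (p x (p+L)): row i holds g(0..L) starting at column i,
# so that x = G d with d = [d(n), ..., d(n-L-p+1)]^T.
G = np.zeros((p, p + L))
for i in range(p):
    G[i, i:i + L + 1] = g

R_d = toeplitz(r_d)                           # (p+L) x (p+L) autocorrelation matrix
R_v = sigma_v2 * np.eye(p)
r_d_vec = r_d                                 # E{d(n) d} = [r_d(0), ..., r_d(p+L-1)]^T

w = np.linalg.solve(G @ R_d @ G.T + R_v, G @ r_d_vec)   # FWF.15 (real data)
print(w)
```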
Noise cancellation
We consider the same data model as for filtering:
$$
x(n)=d(n)+v_{1}(n) \quad \text{or} \quad \mathbf{x}=\mathbf{d}+\mathbf{v}_{1}
$$
where $\mathbf{d}=[d(n), \ldots, d(n-p+1)]^{T}$ and $\mathbf{v}_{1}=\left[v_{1}(n), \ldots, v_{1}(n-p+1)\right]^{T}$.
This time we estimate $v_{1}(n)$ from a correlated noise source $v_{2}(n)$, and estimate $d(n)$ as
$$
\hat{d}(n)=x(n)-\hat{v}_{1}(n) \quad \text{with} \quad \hat{v}_{1}(n)=\mathbf{w}^{T} \mathbf{v}_{2}
$$
where $\mathbf{v}_{2}=\left[v_{2}(n), \ldots, v_{2}(n-p+1)\right]^{T}$.
To estimate $v_{1}(n)$ from $v_{2}(n)$, we start from the Wiener-Hopf equations
$$
\mathbf{R}_{v_{2}} \mathbf{w}=\mathbf{r}_{v_{1} v_{2}}
$$
Since $\mathbf{r}_{v_{1} v_{2}}$ is generally not known, we can rewrite it, using the fact that $d(n)$ is uncorrelated with $v_{2}(n)$, as
$$
\mathbf{r}_{v_{1} v_{2}}=E\left\{v_{1}(n) \mathbf{v}_{2}^{*}\right\}=E\left\{\left(d(n)+v_{1}(n)\right) \mathbf{v}_{2}^{*}\right\}=E\left\{x(n) \mathbf{v}_{2}^{*}\right\}=\mathbf{r}_{x v_{2}}
$$
and thus the Wiener-Hopf equations can be written as
$$
\mathbf{R}_{v_{2}} \mathbf{w}=\mathbf{r}_{x v_{2}} \tag{FWF.16}
$$
As already mentioned, $d(n)$ is then estimated as
$$
\hat{d}(n)=x(n)-\hat{v}_{1}(n) \quad \text{with} \quad \hat{v}_{1}(n)=\mathbf{w}^{T} \mathbf{v}_{2}
$$
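In practice $\mathbf R_{v_2}$ and $\mathbf r_{x v_2}$ can be estimated directly from the data, since both $x(n)$ and the reference $v_2(n)$ are observed. A sketch with synthetic signals (all signal choices here are my own assumptions for illustration):

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)
N, p = 10000, 4
d = np.sin(0.05 * np.pi * np.arange(N))          # assumed desired signal
v2 = rng.standard_normal(N)                      # observed reference noise
v1 = np.convolve(v2, [0.8, -0.4, 0.2])[:N]       # correlated noise corrupting d(n)
x = d + v1                                       # primary observation

# Sample estimates of r_v2(k) and r_xv2(k) = E{x(n) v2(n-k)}, k = 0, ..., p-1
r_v2 = np.array([v2[k:] @ v2[:N - k] for k in range(p)]) / N
r_xv2 = np.array([x[k:] @ v2[:N - k] for k in range(p)]) / N

w = np.linalg.solve(toeplitz(r_v2), r_xv2)       # FWF.16: R_v2 w = r_xv2
v1_hat = np.convolve(v2, w)[:N]                  # v1_hat(n) = sum_l w(l) v2(n-l)
d_hat = x - v1_hat                               # noise-cancelled estimate
print(np.mean((d_hat - d) ** 2), np.mean((x - d) ** 2))
```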
Discrete Kalman Filter
In section [The FIR Wiener Filter](# The FIR Wiener Filter) we considered the problem of designing a causal Wiener filter to estimate a process $d(n)$ from a set of noisy observations $x(n)=d(n)+v(n)$. The primary limitation of the solution that was derived is that it requires $d(n)$ and $x(n)$ to be jointly wide-sense stationary processes. Since most processes encountered in practice are nonstationary, this constraint limits the usefulness of the Wiener filter. Therefore, in this section we re-examine this estimation problem within the context of nonstationary processes and derive what is known as the discrete Kalman filter.
Consider the following nonstationary state space model:
$$
\begin{array}{l} \mathbf{x}(n)=\mathbf{A}(n-1) \mathbf{x}(n-1)+\mathbf{w}(n) \\ \mathbf{y}(n)=\mathbf{C}(n) \mathbf{x}(n)+\mathbf{v}(n) \end{array} \tag{DKF.1}
$$
- $\mathbf{x}(n)$: the $p \times 1$ state vector
- $\mathbf{A}(n-1)$: the $p \times p$ state transition matrix
- $\mathbf{w}(n)$: the state noise, with $E\left\{\mathbf{w}(n) \mathbf{w}^{H}(k)\right\}=\mathbf{Q}_{w}(n) \delta(n-k)$
- $\mathbf{y}(n)$: the $q \times 1$ observation vector
- $\mathbf{C}(n)$: the $q \times p$ observation matrix
- $\mathbf{v}(n)$: the observation noise, with $E\left\{\mathbf{v}(n) \mathbf{v}^{H}(k)\right\}=\mathbf{Q}_{v}(n) \delta(n-k)$, uncorrelated with the state noise
It is assumed that $\mathbf A(n)$, $\mathbf C(n)$, $\mathbf Q_w(n)$, and $\mathbf Q_v(n)$ are known.
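To fix ideas, a minimal sketch (the model matrices are arbitrary assumptions, not from the original notes) that simulates the state-space model (DKF.1) with time-invariant $\mathbf A$, $\mathbf C$, $\mathbf Q_w$, and $\mathbf Q_v$:

```python
import numpy as np

def simulate_ssm(A, C, Qw, Qv, x0, N, rng):
    """Simulate x(n) = A x(n-1) + w(n), y(n) = C x(n) + v(n)   (DKF.1)."""
    p, q = A.shape[0], C.shape[0]
    X, Y = np.zeros((N, p)), np.zeros((N, q))
    x = x0
    for n in range(N):
        x = A @ x + rng.multivariate_normal(np.zeros(p), Qw)   # state equation
        y = C @ x + rng.multivariate_normal(np.zeros(q), Qv)   # observation equation
        X[n], Y[n] = x, y
    return X, Y

# Example: noisy position observations of a constant-velocity object (assumed model)
A = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition matrix
C = np.array([[1.0, 0.0]])               # only the position is observed
Qw, Qv = 0.01 * np.eye(2), np.array([[1.0]])
X, Y = simulate_ssm(A, C, Qw, Qv, np.zeros(2), 200, np.random.default_rng(0))
```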
We are going to show that the optimum linear estimate of $\mathbf x(n)$ can be expressed in the form
$$
\hat {\mathbf x}(n)=\mathbf A(n-1)\hat{\mathbf x}(n-1)+\mathbf K(n)\big[\mathbf y(n)-\mathbf C(n)\mathbf A(n-1)\hat {\mathbf x}(n-1)\big] \tag{DKF.2}
$$
With the appropriate Kalman gain matrix $\mathbf K(n)$, this recursion corresponds to the discrete Kalman filter.
Let us define $\hat{\mathbf{x}}(n | n-1)$ and $\hat{\mathbf{x}}(n | n)$ as the best linear estimates of $\mathbf{x}(n)$ given the observations $\mathbf{y}(n)$ up to time $n-1$ and $n$, respectively.
Let us denote the corresponding errors as
$$
\begin{aligned} \mathbf{e}(n | n-1) &=\mathbf{x}(n)-\hat{\mathbf{x}}(n | n-1) \\ \mathbf{e}(n | n) &=\mathbf{x}(n)-\hat{\mathbf{x}}(n | n) \end{aligned} \tag{DKF.3}
$$
with covariance matrices
$$
\begin{aligned} \mathbf{P}(n| n-1)&=E\left\{\mathbf{e}(n| n-1) \mathbf{e}^{H}(n | n-1)\right\}\\ \mathbf{P}(n | n)&=E\left\{\mathbf{e}(n | n) \mathbf{e}^{H}(n | n)\right\} \end{aligned} \tag{DKF.4}
$$
For each $n>0$, given $\hat {\mathbf x}(n-1|n-1)$ and $\mathbf P(n-1|n-1)$, when a new observation, $\mathbf y(n)$, becomes available, the problem is to find the minimum mean-square estimate $\hat {\mathbf x}(n|n)$ of the state vector $\mathbf x(n)$, i.e., to solve
$$
\min_{\hat {\mathbf x}(n|n)} \mathrm{tr}(\mathbf P(n|n)) \tag{DKF.5}
$$
The solution to this problem will be derived in two steps.
- Given $\hat{\mathbf x}(n-1|n-1)$ we will find $\hat {\mathbf x}(n|n-1)$, which is the best estimate of $\mathbf x(n)$ without the observation $\mathbf y(n)$.
- Given $\mathbf y(n)$ and $\hat{\mathbf x}(n|n-1)$ we will estimate $\hat {\mathbf x}(n|n)$.
In the first step, since no new measurements are used to estimate $\mathbf x(n)$, all that is known is that $\mathbf x(n)$ evolves according to the state equation
$$
\mathbf{x}(n)=\mathbf{A}(n-1) \mathbf{x}(n-1)+\mathbf{w}(n)
$$
Since $\mathbf{w}(n)$ is a zero-mean white noise process (and the values of $\mathbf{w}(n)$ are unknown), we may predict $\mathbf{x}(n)$ as follows,
$$
\hat{\mathbf{x}}(n|n-1)=\mathbf{A}(n-1) \hat{\mathbf{x}}(n-1| n-1) \tag{DKF.6}
$$
which has an estimation error given by
$$
\begin{aligned} \mathbf{e}(n |n-1) &=\mathbf{x}(n)-\hat{\mathbf{x}}(n|n-1) \\ &=\mathbf{A}(n-1) \mathbf{x}(n-1)+\mathbf{w}(n)-\mathbf{A}(n-1) \hat{\mathbf{x}}(n-1 | n-1) \\ &=\mathbf{A}(n-1) \mathbf{e}(n-1 | n-1)+\mathbf{w}(n) \end{aligned} \tag{DKF.7}
$$
Since the estimation error $\mathbf{e}(n-1 |n-1)$ is uncorrelated with $\mathbf{w}(n)$ (a consequence of the fact that $\mathbf w(n)$ is a white noise sequence), then
$$
\mathbf{P}(n |n-1)=\mathbf{A}(n-1) \mathbf{P}(n-1 | n-1) \mathbf{A}^{H}(n-1)+\mathbf{Q}_{w}(n) \tag{DKF.8}
$$
In the second step, we incorporate the new measurement $\mathbf y(n)$ into the estimate $\hat {\mathbf x}(n|n-1)$.
A linear estimate of $\mathbf x(n)$ that is based on $\hat {\mathbf x}(n|n-1)$ and $\mathbf y(n)$ is of the form
$$
\hat {\mathbf x}(n|n)=\mathbf K'(n)\hat {\mathbf x}(n|n-1)+\mathbf K(n)\mathbf y(n) \tag{DKF.9}
$$
where $\mathbf K(n)$ and $\mathbf K'(n)$ are matrices to be specified.
- Requirement 1: $\hat {\mathbf x}(n|n)$ and $\hat {\mathbf x}(n|n-1)$ are unbiased.
- Requirement 2: $\hat {\mathbf x}(n|n)$ minimizes the mean-square error, $E\|\mathbf e(n|n)\|^2=\mathrm{tr}(\mathbf P(n|n))$.
From Requirement 1,
$$
E(\mathbf x (n))=\mathbf K'(n)E(\mathbf x (n))+\mathbf K(n)\big(\mathbf C(n)E(\mathbf x (n))+E(\mathbf v (n))\big)
$$
Since $E(\mathbf v (n))=\mathbf 0$, we have
$$
\mathbf K'(n)=\mathbf I -\mathbf K(n) \mathbf C(n) \tag{DKF.10}
$$
Substituting (DKF.10) into (DKF.9), we have
$$
\hat {\mathbf x}(n|n)=\hat {\mathbf x}(n|n-1)+\mathbf K(n)\big[\mathbf y(n)-\mathbf C(n)\hat {\mathbf x}(n|n-1) \big] \tag{DKF.11}
$$
and the error
$$
\begin{aligned} \mathbf{e}(n | n) &=\mathbf{K}^{\prime}(n) \mathbf{e}(n| n-1)-\mathbf{K}(n) \mathbf{v}(n) \\ &=[\mathbf{I}-\mathbf{K}(n) \mathbf{C}(n)] \mathbf{e}(n | n-1)-\mathbf{K}(n) \mathbf{v}(n) \end{aligned} \tag{DKF.12}
$$
Thus, the error covariance matrix for $\mathbf{e}(n | n)$ is
$$
\begin{aligned} \mathbf{P}(n | n) &=E\left\{\mathbf{e}(n | n) \mathbf{e}^{H}(n | n)\right\} \\ &=[\mathbf{I}-\mathbf{K}(n) \mathbf{C}(n)] \mathbf{P}(n | n-1)[\mathbf{I}-\mathbf{K}(n) \mathbf{C}(n)]^{H}+\mathbf{K}(n) \mathbf{Q}_{v}(n) \mathbf{K}^{H}(n) \end{aligned} \tag{DKF.13}
$$
Next, we must find the value of the Kalman gain $\mathbf{K}(n)$ that minimizes the mean-square error
$$
\xi(n)=\operatorname{tr}\{\mathbf{P}(n | n)\}
$$
Differentiating $\xi(n)$ with respect to $\mathbf{K}(n)$ and setting the derivative to zero,
$$
\frac{d}{d \mathbf{K}} \operatorname{tr}\{\mathbf{P}(n |n)\}=-2[\mathbf{I}-\mathbf{K}(n) \mathbf{C}(n)] \mathbf{P}(n | n-1) \mathbf{C}^{H}(n)+2 \mathbf{K}(n) \mathbf{Q}_{v}(n)=\mathbf{0} \tag{DKF.14}
$$
Solving for $\mathbf{K}(n)$ gives the desired expression for the Kalman gain,
$$
\mathbf{K}(n)=\mathbf{P}(n | n-1) \mathbf{C}^{H}(n)\left[\mathbf{C}(n) \mathbf{P}(n | n-1) \mathbf{C}^{H}(n)+\mathbf{Q}_{v}(n)\right]^{-1} \tag{DKF.15}
$$
Having found the Kalman gain matrix, we may simplify the expression given in (DKF.13) for the error covariance. First, we rewrite the expression for $\mathbf{P}(n | n)$ as follows,
$$
\begin{aligned} \mathbf{P}(n | n)=&[\mathbf{I}-\mathbf{K}(n) \mathbf{C}(n)] \mathbf{P}(n | n-1) \\ &-\left\{[\mathbf{I}-\mathbf{K}(n) \mathbf{C}(n)] \mathbf{P}(n | n-1) \mathbf{C}^{H}(n)-\mathbf{K}(n) \mathbf{Q}_{v}(n)\right\} \mathbf{K}^{H}(n) \end{aligned}
$$
From (DKF.14), however, it follows that the second term is equal to zero, which leads to the desired expression for the error covariance matrix
$$
\mathbf{P}(n |n)=[\mathbf{I}-\mathbf{K}(n) \mathbf{C}(n)] \mathbf{P}(n | n-1) \tag{DKF.16}
$$
Note: another derivation method can be found in the Slides 7c_Kalman
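Putting the two steps together, the following sketch implements the full recursion (DKF.6), (DKF.8), (DKF.15), (DKF.11), and (DKF.16) for real-valued data (so $H$ reduces to a transpose). The time-invariant model, the initialization, and the simulated data are my own assumptions, mirroring the sketch after (DKF.1).

```python
import numpy as np

def kalman_filter(Y, A, C, Qw, Qv, x0, P0):
    """Discrete Kalman filter for the model (DKF.1), real-valued case."""
    x_hat, P = x0, P0
    X_hat = []
    for y in Y:
        # Prediction step
        x_pred = A @ x_hat                                  # DKF.6
        P_pred = A @ P @ A.T + Qw                           # DKF.8
        # Update step
        S = C @ P_pred @ C.T + Qv                           # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)                 # DKF.15
        x_hat = x_pred + K @ (y - C @ x_pred)               # DKF.11
        P = (np.eye(len(x_hat)) - K @ C) @ P_pred           # DKF.16
        X_hat.append(x_hat)
    return np.array(X_hat)

# Example with an assumed constant-velocity model, simulated as in (DKF.1)
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
Qw, Qv = 0.01 * np.eye(2), np.array([[1.0]])
x, X, Y = np.zeros(2), [], []
for n in range(200):
    x = A @ x + rng.multivariate_normal(np.zeros(2), Qw)
    Y.append(C @ x + rng.multivariate_normal(np.zeros(1), Qv))
    X.append(x)
X, Y = np.array(X), np.array(Y)
X_hat = kalman_filter(Y, A, C, Qw, Qv, np.zeros(2), np.eye(2))
print(np.mean((X_hat[:, 0] - X[:, 0]) ** 2))                # filtered position MSE
```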