Reference:
Kay, S. M., *Fundamentals of Statistical Signal Processing, Volume II: Detection Theory*. Prentice Hall PTR, 1993. (Ch. 5, Secs. 5.1–5.6)
Slides of ET4386, TUD
Content
In the previous chapter we were able to detect deterministic signals in the presence of noise. However, in some cases it is unrealistic to assume that the signal is known (e.g., speech). A better approach is to model the signal as a random process with a known covariance structure.
Example: Energy Detector
We start with the simplest random signal detection problem:
$$\begin{aligned} &\mathcal H_0:x[n]=w[n]\quad n=0,1,\cdots,N-1\\ &\mathcal H_1:x[n]=s[n]+w[n]\quad n=0,1,\cdots,N-1 \end{aligned}$$
where
- $s[n]$ is a zero-mean, WSS Gaussian random process with variance $\sigma^2_s$, i.e., $\mathbf s\sim \mathcal N(\mathbf 0, \sigma^2_s \mathbf I)$
- $w[n]$ is WGN with variance $\sigma^2$ and is independent of the signal, i.e., $\mathbf w\sim \mathcal N(\mathbf 0,\sigma^2 \mathbf I)$
Therefore, we can formulate the likelihood ratio as
$$L(\mathbf x)=\frac{\frac{1}{\left[2 \pi\left(\sigma_{s}^{2}+\sigma^{2}\right)\right]^{N/2}} \exp \left[-\frac{1}{2\left(\sigma_{s}^{2}+\sigma^{2}\right)} \sum_{n=0}^{N-1} x^{2}[n]\right]}{\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1} x^{2}[n]\right]}$$
so that the log-likelihood ratio (LLR) becomes
$$\begin{aligned} l(\mathbf{x}) &=\frac{N}{2} \ln \left(\frac{\sigma^{2}}{\sigma_{s}^{2}+\sigma^{2}}\right)-\frac{1}{2}\left(\frac{1}{\sigma_{s}^{2}+\sigma^{2}}-\frac{1}{\sigma^{2}}\right) \sum_{n=0}^{N-1} x^{2}[n] \\ &=\frac{N}{2} \ln \left(\frac{\sigma^{2}}{\sigma_{s}^{2}+\sigma^{2}}\right)+\frac{1}{2} \frac{\sigma_{s}^{2}}{\sigma^{2}\left(\sigma_{s}^{2}+\sigma^{2}\right)} \sum_{n=0}^{N-1} x^{2}[n] \end{aligned}$$
Hence, we decide $\mathcal H_1$ if
$$T(\mathbf x)=\sum_{n=0}^{N-1} x^2[n]>\gamma ^\prime \tag{RS.1}$$
The NP detector computes the energy in the received data and compares it to a threshold. Hence, it is known as an energy detector.
$T(\mathbf{x})$ is distributed under $\mathcal{H}_{0}$ and $\mathcal{H}_{1}$ as follows:
$$\begin{aligned} \mathcal{H}_{0}: & \frac{T(\mathbf{x})}{\sigma^{2}} \sim \chi_{N}^{2} \\ \mathcal{H}_{1}: & \frac{T(\mathbf{x})}{\sigma_{s}^{2}+\sigma^{2}} \sim \chi_{N}^{2} \end{aligned}$$
Therefore, we have
$$\begin{aligned} P_{FA} &=\operatorname{Pr}\left\{T(\mathbf{x})>\gamma^{\prime} ; \mathcal{H}_{0}\right\} \\ &=\operatorname{Pr}\left\{\frac{T(\mathbf{x})}{\sigma^{2}}>\frac{\gamma^{\prime}}{\sigma^{2}} ; \mathcal{H}_{0}\right\} \\ &=Q_{\chi_{N}^{2}}\left(\frac{\gamma^{\prime}}{\sigma^{2}}\right) \end{aligned}\tag{RS.2}$$
and
$$\begin{aligned} P_{D} &=\operatorname{Pr}\left\{T(\mathbf{x})>\gamma^{\prime} ; \mathcal{H}_{1}\right\} \\ &=\operatorname{Pr}\left\{\frac{T(\mathbf{x})}{\sigma_{s}^{2}+\sigma^{2}}>\frac{\gamma^{\prime}}{\sigma_{s}^{2}+\sigma^{2}} ; \mathcal{H}_{1}\right\} \\ &=Q_{\chi_{N}^{2}}\left(\frac{\gamma^{\prime}}{\sigma_{s}^{2}+\sigma^{2}}\right)=Q_{\chi_{N}^{2}}\left(\frac{\gamma^{\prime}/\sigma^{2}}{\sigma_{s}^{2}/\sigma^{2}+1}\right) \end{aligned}\tag{RS.3}$$
As $\sigma^{2}_s/\sigma^{2}$ increases, the argument of the $Q_{\chi_{N}^{2}}$ function decreases and thus $P_D$ increases.
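As a sanity check, the closed-form expressions (RS.2) and (RS.3) can be compared against a Monte Carlo simulation of the energy detector. The sample size, variances, and target $P_{FA}$ below are illustrative choices, not values from the text:

```python
import numpy as np
from scipy.stats import chi2

# Illustrative parameters (not from the text).
N, sigma2, sigma2_s = 25, 1.0, 2.0
target_PFA = 0.01
rng = np.random.default_rng(0)

# Threshold gamma' from (RS.2): P_FA = Q_{chi2_N}(gamma'/sigma^2).
gamma = sigma2 * chi2.ppf(1.0 - target_PFA, df=N)
# Predicted detection probability from (RS.3).
PD = chi2.sf(gamma / (sigma2_s + sigma2), df=N)

# Monte Carlo: energy statistic T(x) = sum x^2[n] under both hypotheses.
trials = 200_000
w = rng.normal(0.0, np.sqrt(sigma2), (trials, N))     # noise
s = rng.normal(0.0, np.sqrt(sigma2_s), (trials, N))   # signal
T0 = (w**2).sum(axis=1)                               # under H0
T1 = ((s + w)**2).sum(axis=1)                         # under H1
print(np.mean(T0 > gamma))   # close to target_PFA
print(np.mean(T1 > gamma))   # close to PD
```

The empirical false-alarm and detection rates should agree with (RS.2) and (RS.3) up to Monte Carlo error.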
Generalization 1: Signals With Arbitrary Covariance Matrices
Assume:
- $s[n]$ is a zero-mean Gaussian random process with covariance matrix $\mathbf C_s$, i.e., $\mathbf s\sim \mathcal N(\mathbf 0,\mathbf C_s)$
- $w[n]$ is WGN with variance $\sigma^2$ and is independent of the signal, i.e., $\mathbf w\sim \mathcal N(\mathbf 0,\sigma^2 \mathbf I)$
Hence:
$$\mathbf x\sim \begin{cases} \mathcal N(\mathbf 0,\sigma^2\mathbf I) & \text{under }\mathcal H_0\\ \mathcal N(\mathbf 0,\mathbf C_s+\sigma^2\mathbf I) & \text{under }\mathcal H_1 \end{cases}$$
The NP detector decides $\mathcal H_1$ if
$$L(\mathbf{x})=\frac{\frac{1}{(2 \pi)^{N/2} \operatorname{det}^{\frac{1}{2}}\left(\mathbf{C}_{s}+\sigma^{2} \mathbf{I}\right)} \exp \left[-\frac{1}{2} \mathbf{x}^{T}\left(\mathbf{C}_{s}+\sigma^{2} \mathbf{I}\right)^{-1} \mathbf{x}\right]}{\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left(-\frac{1}{2 \sigma^{2}} \mathbf{x}^{T} \mathbf{x}\right)}>\gamma$$
Taking logarithms and retaining only the data-dependent terms yields
$$T(\mathbf{x})=\sigma^{2} \mathbf{x}^{T}\left[\frac{1}{\sigma^{2}} \mathbf{I}-\left(\mathbf{C}_{s}+\sigma^{2} \mathbf{I}\right)^{-1}\right] \mathbf{x}>2 \gamma^{\prime} \sigma^{2}$$
Using the matrix inversion lemma
$$(\mathbf{A}+\mathbf{B C D})^{-1}=\mathbf{A}^{-1}-\mathbf{A}^{-1} \mathbf{B}\left(\mathbf{D} \mathbf{A}^{-1} \mathbf{B}+\mathbf{C}^{-1}\right)^{-1} \mathbf{D} \mathbf{A}^{-1}$$
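The lemma is easy to verify numerically. The matrices below are arbitrary random choices, with $\mathbf C$ made positive definite so that $\mathbf C^{-1}$ exists:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = np.diag(rng.uniform(1.0, 2.0, n))   # invertible A
B = rng.normal(size=(n, n))
D = rng.normal(size=(n, n))
G = rng.normal(size=(n, n))
C = G @ G.T + n * np.eye(n)             # positive definite, hence invertible

Ai = np.linalg.inv(A)
lhs = np.linalg.inv(A + B @ C @ D)
rhs = Ai - Ai @ B @ np.linalg.inv(D @ Ai @ B + np.linalg.inv(C)) @ D @ Ai
print(np.allclose(lhs, rhs))   # True
```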
we have, upon letting $\mathbf{A}=\sigma^{2} \mathbf{I}$, $\mathbf{B}=\mathbf{D}=\mathbf{I}$, $\mathbf{C}=\mathbf{C}_{s}$,
$$\left(\sigma^{2} \mathbf{I}+\mathbf{C}_{s}\right)^{-1}=\frac{1}{\sigma^{2}} \mathbf{I}-\frac{1}{\sigma^{4}}\left(\frac{1}{\sigma^{2}} \mathbf{I}+\mathbf{C}_{s}^{-1}\right)^{-1}$$
so that
$$T(\mathbf{x})=\mathbf{x}^{T}\left[\frac{1}{\sigma^{2}}\left(\frac{1}{\sigma^{2}} \mathbf{I}+\mathbf{C}_{s}^{-1}\right)^{-1}\right] \mathbf{x}\tag{RS.4}$$
Now let
$$\hat{\mathbf{s}}=\frac{1}{\sigma^{2}}\left(\frac{1}{\sigma^{2}} \mathbf{I}+\mathbf{C}_{s}^{-1}\right)^{-1} \mathbf{x}.$$
This may also be written as
$$\hat{\mathbf{s}}=\mathbf C_s(\mathbf C_s+\sigma^2\mathbf I)^{-1}\mathbf x\tag{RS.5}$$
Hence, we decide $\mathcal H_1$ if
$$T(\mathbf x)=\mathbf x^T \hat {\mathbf s}=\sum_{n=0}^{N-1}x[n]\hat s[n]\tag{RS.6}$$
Estimator-Correlator
The NP detector correlates the received data with an estimate of the signal, i.e., s ^ [ n ] \hat s[n] s^[n]. It is therefore termed an estimator-correlator.
We claim that $\hat{\mathbf s}$ is actually a Wiener filter estimate of the signal. To see this, recall that if $\boldsymbol \theta$ is an unknown random vector whose realization is to be estimated from the data vector $\mathbf x$, and $\boldsymbol \theta$ and $\mathbf x$ are jointly Gaussian with zero mean, then the minimum mean square error (MMSE) estimator is
$$\hat{\boldsymbol \theta}=\mathbf C_{\theta x}\mathbf C_{xx}^{-1}\mathbf x =\mathbf w^T\mathbf x$$
where $\mathbf w=\mathbf C_{xx}^{-1}\mathbf C_{\theta x}$ is known as the Wiener filter. The MMSE estimator is linear due to the jointly Gaussian assumption.
Here we have $\boldsymbol \theta=\mathbf s$ and $\mathbf x=\mathbf s+\mathbf w$, with $\mathbf s$ and $\mathbf w$ uncorrelated. The MMSE estimate of the signal realization follows from
$$\hat {\mathbf s}=E[\mathbf s(\mathbf s+\mathbf w)^T](E[(\mathbf s+\mathbf w)(\mathbf s+\mathbf w)^T])^{-1}\mathbf x=\mathbf C_s(\mathbf C_s+\sigma^2\mathbf I)^{-1}\mathbf x$$
which is identical to (RS.5).
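A quick numerical check (with an arbitrary positive definite $\mathbf C_s$, not one from the text) confirms that the quadratic form (RS.4) and the estimator-correlator form (RS.6), with $\hat{\mathbf s}$ from (RS.5), are the same statistic:

```python
import numpy as np

rng = np.random.default_rng(2)
N, sigma2 = 5, 0.5                      # illustrative size and noise variance
G = rng.normal(size=(N, N))
Cs = G @ G.T + 0.1 * np.eye(N)          # arbitrary positive definite signal covariance
x = rng.normal(size=N)
I = np.eye(N)
inv = np.linalg.inv

T_quad = x @ ((1.0 / sigma2) * inv(I / sigma2 + inv(Cs))) @ x   # (RS.4)
s_hat = Cs @ inv(Cs + sigma2 * I) @ x                           # Wiener estimate (RS.5)
T_ec = x @ s_hat                                                # (RS.6)
print(np.isclose(T_quad, T_ec))   # True
```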
Canonical Form
For any $\mathbf C_s\in \mathbf S_+$, let the eigendecomposition of the $N\times N$ covariance matrix $\mathbf C_s$ be
$$\mathbf C_s=\mathbf V\boldsymbol \Lambda_s \mathbf V^T$$
where $\mathbf V=[\mathbf v_0~~\mathbf v_1~~\cdots ~~\mathbf v_{N-1}]$, with $\mathbf v_i$ the $i$th eigenvector of $\mathbf C_s$, and $\boldsymbol \Lambda_s=\mathrm{diag}(\lambda_{s_0},\lambda_{s_1},\cdots, \lambda_{s_{N-1}})$, with $\lambda_{s_i}$ the corresponding eigenvalue.
Instead of expressing the test statistic in terms of $\mathbf x$, it is advantageous to let $\mathbf y=\mathbf V^T\mathbf x$, so that
$$\begin{aligned} T(\mathbf x)&=\mathbf x^T \mathbf V \boldsymbol \Lambda_s\mathbf V^T( \mathbf V \boldsymbol \Lambda_s\mathbf V^T+\sigma^2 \mathbf V \mathbf V^T)^{-1}\mathbf x\\ &=\mathbf x^T \mathbf V \boldsymbol \Lambda_s(\boldsymbol \Lambda_s+\sigma^2 \mathbf I)^{-1}\mathbf V^T \mathbf x\\ &=\mathbf y^T \boldsymbol \Lambda_s(\boldsymbol \Lambda_s+\sigma^2 \mathbf I)^{-1}\mathbf y\\ &=\sum_{n=0}^{N-1}\frac{\lambda_{s_n}}{\lambda_{s_n}+\sigma^2}y^2[n] \end{aligned}\tag{RS.7}$$
This is the canonical form of the detector which is shown in Figure 5.3.
The effect of the filter on an input vector $\mathbf x$ is equivalent to first transforming $\mathbf x$ to $\mathbf y = \mathbf V^T\mathbf x$, filtering each component of $\mathbf y$ by $\lambda_{s_n}/(\lambda_{s_n}+\sigma^2)$ to form $\mathbf y^\prime$, and then transforming back to the original space using $\mathbf x^\prime = \mathbf V \mathbf y^\prime$.
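The algebra in (RS.7) can be illustrated numerically: the estimator-correlator statistic computed in the original coordinates matches the per-mode weighted sum in the eigenvector coordinates. The covariance below is an arbitrary example:

```python
import numpy as np

rng = np.random.default_rng(3)
N, sigma2 = 6, 1.0                       # illustrative dimension and noise variance
G = rng.normal(size=(N, N))
Cs = G @ G.T                             # arbitrary positive semidefinite covariance
x = rng.normal(size=N)

# Estimator-correlator statistic in the original coordinates
T_x = x @ Cs @ np.linalg.inv(Cs + sigma2 * np.eye(N)) @ x

# Canonical form (RS.7): transform y = V^T x, then weight each mode
lam, V = np.linalg.eigh(Cs)              # Cs = V diag(lam) V^T
y = V.T @ x
T_y = np.sum(lam / (lam + sigma2) * y**2)
print(np.isclose(T_x, T_y))   # True
```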
Remark:
Now assume that $N=2$ and
$$\mathbf C_s=\sigma^2_s\begin{bmatrix}1 & \rho\\ \rho & 1 \end{bmatrix}=\mathbf V\boldsymbol \Lambda_s \mathbf V^T$$
where
$$\mathbf V=\begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}=[\mathbf v_1~~\mathbf v_2],\qquad \boldsymbol \Lambda_s=\begin{bmatrix}\sigma^2_s(1+\rho) & 0\\ 0 & \sigma^2_s(1-\rho) \end{bmatrix}$$
and $\rho$ is the correlation coefficient between $s[0]$ and $s[1]$. The detector can easily be derived as
$$T(\mathbf x)=\frac{\sigma^2_s(1+\rho)}{\sigma^2_s(1+\rho)+\sigma^2}y^2[0]+\frac{\sigma^2_s(1-\rho)}{\sigma^2_s(1-\rho)+\sigma^2}y^2[1]$$
Why is the contribution of $y[0]$ to $T(\mathbf x)$ weighted more heavily (for $\rho>0$)? To see this, note that the signal and noise PDFs are
$$p(\mathbf s)=\frac{1}{(2 \pi)^{N/2} \operatorname{det}^{\frac{1}{2}}\mathbf{C}_{s}} \exp \left[-\frac{1}{2} \mathbf{s}^{T}\mathbf{C}_{s}^{-1} \mathbf{s}\right],\qquad p(\mathbf w)=\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left(-\frac{1}{2 \sigma^{2}} \mathbf{w}^{T} \mathbf{w}\right)$$
Then we can draw the contours of constant PDF for signal only and noise only:
Note that the line $\xi_1=\xi_0$ is along the direction of $\mathbf v_1$ and the line $\xi_1=-\xi_0$ is along the direction of $\mathbf v_2$.
Let $\mathbf z=\mathbf V^T\mathbf s$; the quadratic form defining the contours of constant signal PDF is
$$\mathbf z^T \boldsymbol \Lambda_s^{-1}\mathbf z=\frac{z^2[0]}{\sigma_s^2(1+\rho)}+\frac{z^2[1]}{\sigma_s^2(1-\rho)}$$
Therefore, each contour of constant PDF is a rotated ellipse.
The component of $\mathbf x$ along $\mathbf v_1$ is likely to be much larger when the signal is present than the component along the orthogonal direction $\mathbf v_2$. When no signal is present, there is no preferred direction. Therefore, the contribution of $y[0]$ to $T(\mathbf x)$ is weighted more heavily. If $\rho \approx 1$, the PDF of the signal is concentrated along $\mathbf v_1$, so $y[0]$ is retained and $y[1]$ is almost discarded.
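For a concrete feel, the two mode weights $\lambda_{s_n}/(\lambda_{s_n}+\sigma^2)$ can be computed for a strongly correlated signal; the values $\sigma_s^2=\sigma^2=1$, $\rho=0.9$ are illustrative, not from the text:

```python
import numpy as np

sigma2_s, sigma2, rho = 1.0, 1.0, 0.9    # illustrative values
Cs = sigma2_s * np.array([[1.0, rho], [rho, 1.0]])

lam, V = np.linalg.eigh(Cs)              # eigenvalues sigma_s^2 (1 -/+ rho), ascending
weights = lam / (lam + sigma2)           # gains applied to y^2[n] in T(x)
print(weights)                           # the (1+rho) mode, along v1, dominates
```

With these numbers the $(1+\rho)$ mode gets weight $1.9/2.9 \approx 0.66$ while the $(1-\rho)$ mode gets only $0.1/1.1 \approx 0.09$, matching the discussion above.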
Generalization 2: General Gaussian Detection
Assume:
- $s[n]$ is a Gaussian random process with mean $\boldsymbol \mu_s$ and covariance matrix $\mathbf C_s$, i.e., $\mathbf s\sim \mathcal N(\boldsymbol \mu_s,\mathbf C_s)$
- $w[n]$ is a zero-mean Gaussian random process with covariance matrix $\mathbf C_w$ and is independent of the signal, i.e., $\mathbf w\sim \mathcal N(\mathbf 0,\mathbf C_w)$
Hence
$$\mathbf x\sim \begin{cases} \mathcal N(\mathbf 0,\mathbf C_w) & \text{under }\mathcal H_0\\ \mathcal N(\boldsymbol \mu_s,\mathbf C_s+\mathbf C_w) & \text{under }\mathcal H_1 \end{cases}$$
The NP detector decides $\mathcal H_1$ if
$$\frac{\frac{1}{(2 \pi)^{N/2} \operatorname{det}^{\frac{1}{2}}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)} \exp \left[-\frac{1}{2}\left(\mathbf{x}-\boldsymbol\mu_{s}\right)^{T}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1}\left(\mathbf{x}-\boldsymbol\mu_{s}\right)\right]}{\frac{1}{(2 \pi)^{N/2} \operatorname{det}^{\frac{1}{2}}\left(\mathbf{C}_{w}\right)} \exp \left[-\frac{1}{2} \mathbf{x}^{T} \mathbf{C}_{w}^{-1} \mathbf{x}\right]}>\gamma$$
Taking the logarithm, retaining only the data-dependent terms, and scaling produces the test statistic
$$\begin{aligned} T(\mathbf{x})=& \mathbf{x}^{T} \mathbf{C}_{w}^{-1} \mathbf{x}-\left(\mathbf{x}-\boldsymbol{\mu}_{s}\right)^{T}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1}\left(\mathbf{x}-\boldsymbol{\mu}_{s}\right) \\ =& \mathbf{x}^{T} \mathbf{C}_{w}^{-1} \mathbf{x}-\mathbf{x}^{T}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1} \mathbf{x} \\ &+2 \mathbf{x}^{T}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1} \boldsymbol{\mu}_{s}-\boldsymbol{\mu}_{s}^{T}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1} \boldsymbol{\mu}_{s} \end{aligned}$$
But from the matrix inversion lemma
$$\mathbf{C}_{w}^{-1}-\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1}=\mathbf{C}_{w}^{-1} \mathbf{C}_{s}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1}$$
so that, ignoring the non-data-dependent term and scaling by $1/2$, we have
$$T^\prime (\mathbf x)= \mathbf{x}^{T}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1} \boldsymbol{\mu}_{s}+\frac{1}{2}\mathbf x^T\mathbf{C}_{w}^{-1} \mathbf{C}_{s}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1}\mathbf x\tag{RS.8}$$
The test statistic consists of a quadratic form as well as a linear form in $\mathbf x$. As special cases we have:
- $\mathbf{C}_{s}=\mathbf 0$, i.e., a deterministic signal with $\mathbf{s}=\boldsymbol{\mu}_{s}$. Then
$$T^{\prime}(\mathbf{x})=\mathbf{x}^{T} \mathbf{C}_{w}^{-1} \boldsymbol{\mu}_{s}\tag{RS.9}$$
which is our prewhitener and matched filter.
- $\boldsymbol{\mu}_{s}=\mathbf 0$, i.e., a zero-mean random signal with $\mathbf{s} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{C}_{s}\right)$. Then
$$\begin{aligned} T^{\prime}(\mathbf{x}) &=\frac{1}{2} \mathbf{x}^{T} \mathbf{C}_{w}^{-1} \mathbf{C}_{s}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1} \mathbf{x} \\ &=\frac{1}{2} \mathbf{x}^{T} \mathbf{C}_{w}^{-1} \hat{\mathbf{s}} \end{aligned}\tag{RS.10}$$
where $\hat{\mathbf{s}}=\mathbf{C}_{s}\left(\mathbf{C}_{s}+\mathbf{C}_{w}\right)^{-1} \mathbf{x}$ is the MMSE estimator of $\mathbf{s}$. This is a prewhitener followed by an estimator-correlator.
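Finally, the simplification leading to (RS.8) can be checked numerically: the scaled data-dependent part of the log-likelihood ratio and (RS.8) differ only by the dropped constant. The covariances and mean below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
A, B = rng.normal(size=(N, N)), rng.normal(size=(N, N))
Cw = A @ A.T + np.eye(N)                  # arbitrary positive definite noise covariance
Cs = B @ B.T + np.eye(N)                  # arbitrary positive definite signal covariance
mu = rng.normal(size=N)
x = rng.normal(size=N)
inv = np.linalg.inv

# Data-dependent part of the log-likelihood ratio, scaled by 1/2
T_full = 0.5 * (x @ inv(Cw) @ x - (x - mu) @ inv(Cs + Cw) @ (x - mu))
# (RS.8), which drops the constant (1/2) mu^T (Cs+Cw)^{-1} mu
T_rs8 = x @ inv(Cs + Cw) @ mu + 0.5 * x @ inv(Cw) @ Cs @ inv(Cs + Cw) @ x
const = 0.5 * mu @ inv(Cs + Cw) @ mu
print(np.isclose(T_full, T_rs8 - const))   # True
```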