Motivation:
- Neyman-Pearson detectors require perfect knowledge of the PDFs
- What if this information is unknown?
- Are there detectors for such scenarios (e.g., radar, sonar)?
Approach:
- Design the NP detector, assuming the parameters are known
- Manipulate the Test so that it is not dependent on the parameters
Example: DC Level in WGN with Unknown Amplitude (A>0)
Consider the DC level in WGN detection problem
$$\begin{array}{ll} \mathcal{H}_{0}: x[n]=w[n] & n=0,1, \ldots, N-1 \\ \mathcal{H}_{1}: x[n]=A+w[n] & n=0,1, \ldots, N-1 \end{array}$$
where the value of $A$ is unknown, although a priori we know that $A>0$, and $w[n]$ is WGN with variance $\sigma^{2}$. Then, the NP test is to decide $\mathcal{H}_{1}$ if
$$\frac{p\left(\mathbf{x} ; A, \mathcal{H}_{1}\right)}{p\left(\mathbf{x} ; \mathcal{H}_{0}\right)}=\frac{\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1}(x[n]-A)^{2}\right]}{\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1} x^{2}[n]\right]}>\gamma$$
Taking the logarithm we have
$$A \sum_{n=0}^{N-1} x[n]>\sigma^{2} \ln \gamma+\frac{N A^{2}}{2}$$
Since it is known that $A>0$, we have
$$\sum_{n=0}^{N-1} x[n]>\frac{\sigma^{2}}{A} \ln \gamma+\frac{N A}{2}$$
Finally, scaling by $1/N$ produces the test
$$T(\mathbf{x})=\frac{1}{N} \sum_{n=0}^{N-1} x[n]>\frac{\sigma^{2}}{N A} \ln \gamma+\frac{A}{2}=\gamma^{\prime} \tag{1}$$
Clearly, the test statistic, which is the sample mean of the data, does not depend on $A$.
Recall from Chapter 3 that
$$T(\mathbf x;\mathcal H_0)=\bar x\sim \mathcal N(0,\sigma^2/N),\qquad T(\mathbf x;\mathcal H_1)=\bar x\sim \mathcal N(A,\sigma^2/N)$$
Hence,
$$P_{FA}=\Pr\{T(\mathbf x)>\gamma^\prime;\mathcal H_0 \}=Q\left(\frac{\gamma^\prime}{\sqrt{\sigma^2/N}} \right)$$
$$P_{D}=\Pr\{T(\mathbf x)>\gamma^\prime;\mathcal H_1 \}=Q\left(\frac{\gamma^\prime-A}{\sqrt{\sigma^2/N}} \right)=Q\left(Q^{-1}(P_{FA})-\sqrt{\frac{NA^2}{\sigma^2}}\right)$$
Therefore, $P_{FA}$ (and the threshold) does not depend on $A$, although $P_D$ depends on $A$.
The test (1) leads to the highest $P_D$ (recall that the NP test maximizes $P_D$) for any value of $A$, as long as $A>0$. Such a test is called a uniformly most powerful (UMP) test. Any other test will have poorer performance.
Unfortunately, UMP tests seldom exist.
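The UMP test above is easy to check numerically. Below is a minimal Monte Carlo sketch (not from the text; the values of `N`, `A`, `sigma2`, and `PFA` are illustrative assumptions) that sets the threshold $\gamma'$ from a target $P_{FA}$ and verifies the predicted $P_D$ of the sample-mean detector.

```python
# Monte Carlo sketch of the UMP sample-mean detector; all parameter values
# (N, A, sigma2, PFA) are illustrative assumptions, not from the text.
import math
import random
from statistics import NormalDist

random.seed(0)
N, A, sigma2, PFA = 25, 0.5, 1.0, 0.1

def Q(x):          # right-tail probability of the standard normal
    return 0.5 * math.erfc(x / math.sqrt(2))

def Qinv(p):       # inverse of Q
    return NormalDist().inv_cdf(1 - p)

gamma_p = math.sqrt(sigma2 / N) * Qinv(PFA)              # threshold for the target PFA
PD_theory = Q(Qinv(PFA) - math.sqrt(N * A**2 / sigma2))  # predicted detection probability

def detection_rate(mean, trials=10000):
    """Fraction of trials in which the sample mean exceeds gamma_p."""
    hits = 0
    for _ in range(trials):
        xbar = sum(random.gauss(mean, math.sqrt(sigma2)) for _ in range(N)) / N
        hits += xbar > gamma_p
    return hits / trials

print(detection_rate(0.0))   # empirical PFA, close to the target 0.1
print(detection_rate(A))     # empirical PD, close to PD_theory
```

Note that the threshold depends only on $P_{FA}$, $\sigma^2$, and $N$, never on the unknown $A$, which is exactly what makes the test realizable.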
Example: DC Level in WGN with Unknown Amplitude
Reconsider the example above with $-\infty<A<\infty$. If we assume perfect knowledge of $A$ to design an NP detector, it is termed a clairvoyant detector.
When $A$ can take on positive and negative values, the clairvoyant detector decides $\mathcal{H}_{1}$ if
$$\begin{aligned} \frac{1}{N} \sum_{n=0}^{N-1} x[n]=\bar{x}>\gamma_{+}^{\prime} \quad &\text { for } A>0 \\ \frac{1}{N} \sum_{n=0}^{N-1} x[n]=\bar{x}<\gamma_{-}^{\prime}\quad &\text { for } A<0 \end{aligned}$$
The detector is clearly unrealizable, since it is composed of two different NP tests, the choice of which depends upon the unknown parameter $A$. It provides an upper bound on performance, which can be found as follows.
$$\begin{aligned} &P_{FA}=\operatorname{Pr}\left\{\bar{x}>\gamma_{+}^{\prime} ; \mathcal{H}_{0}\right\}=Q\left(\frac{\gamma_{+}^{\prime}}{\sqrt{\sigma^{2} / N}}\right) &&\text { for } A>0\\ &P_{FA}=\operatorname{Pr}\left\{\bar{x}<\gamma_{-}^{\prime} ; \mathcal{H}_{0}\right\}=1-Q\left(\frac{\gamma_{-}^{\prime}}{\sqrt{\sigma^{2} / N}}\right)=Q\left(\frac{-\gamma_{-}^{\prime}}{\sqrt{\sigma^{2} / N}}\right) &&\text { for } A<0 \end{aligned}$$
$$\begin{aligned} &P_{D}=\operatorname{Pr}\left\{\bar{x}>\gamma_{+}^{\prime} ; \mathcal{H}_{1}\right\}=Q\left(\frac{\gamma_{+}^{\prime}-A}{\sqrt{\sigma^{2} / N}}\right)=Q\left(Q^{-1}\left(P_{FA}\right)-\sqrt{\frac{N A^{2}}{\sigma^{2}}}\right) && \text { for } A>0\\ &P_{D}=\operatorname{Pr}\left\{\bar{x}<\gamma_{-}^{\prime} ; \mathcal{H}_{1}\right\}=1-Q\left(\frac{\gamma_{-}^{\prime}-A}{\sqrt{\sigma^{2} / N}}\right)=Q\left(\frac{-\gamma_{-}^{\prime}+A}{\sqrt{\sigma^{2} / N}}\right)=Q\left(Q^{-1}\left(P_{FA}\right)+\frac{A}{\sqrt{\sigma^{2} / N}}\right)&& \text { for } A<0 \end{aligned}$$
Instead of the clairvoyant detector, let’s look at the realizable detector:
$$T(\mathbf x)=\left|\frac{1}{N}\sum_{n=0}^{N-1}x[n] \right|>\gamma^{\prime \prime}$$
Then the detection performance is
$$\begin{aligned} P_{FA}&=\operatorname{Pr}\left\{|\bar{x}|>\gamma^{\prime \prime} ; \mathcal{H}_{0}\right\}=2 \operatorname{Pr}\left\{\bar{x}>\gamma^{\prime \prime} ; \mathcal{H}_{0}\right\}=2 Q\left(\frac{\gamma^{\prime \prime}}{\sqrt{\sigma^{2} / N}}\right) \\ \gamma^{\prime \prime}&=\sqrt{\sigma^{2} / N}\, Q^{-1}\left(P_{FA} / 2\right) \\ P_{D}&=\operatorname{Pr}\left\{|\bar{x}|>\gamma^{\prime \prime} ; \mathcal{H}_{1}\right\}=Q\left(Q^{-1}\left(P_{FA} / 2\right)-\frac{A}{\sqrt{\sigma^{2} / N}}\right)+Q\left(Q^{-1}\left(P_{FA} / 2\right)+\frac{A}{\sqrt{\sigma^{2} / N}}\right) \end{aligned}$$
The performance of this realizable detector is thus not optimal, but it is close to that of the optimal clairvoyant detector.
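The gap between the two detectors can be quantified directly from the closed-form expressions above. The sketch below (the values of `N`, `sigma2`, and `PFA` are illustrative assumptions) evaluates $P_D$ for both; since the clairvoyant detector is NP-optimal for every $A$, the realizable two-sided detector can never exceed it at the same $P_{FA}$.

```python
# Closed-form comparison of the clairvoyant vs. realizable (two-sided) detectors.
# N, sigma2, and PFA are illustrative assumptions, not values from the text.
import math
from statistics import NormalDist

def Q(x):          # right-tail probability of the standard normal
    return 0.5 * math.erfc(x / math.sqrt(2))

def Qinv(p):       # inverse of Q
    return NormalDist().inv_cdf(1 - p)

N, sigma2, PFA = 10, 1.0, 0.05
s = math.sqrt(sigma2 / N)

def pd_clairvoyant(A):
    # Upper bound: one-sided NP test matched to the (unknown) sign of A
    return Q(Qinv(PFA) - abs(A) / s)

def pd_realizable(A):
    # Two-sided |sample mean| detector with the same overall PFA
    return Q(Qinv(PFA / 2) - A / s) + Q(Qinv(PFA / 2) + A / s)

for A in (-1.0, -0.3, 0.3, 1.0):
    assert pd_realizable(A) <= pd_clairvoyant(A) + 1e-12
    print(A, round(pd_clairvoyant(A), 4), round(pd_realizable(A), 4))
```

The loss comes from splitting the false-alarm budget across both tails; for moderate $|A|$ it is small, which is why the realizable detector is called "close to optimal".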
In fact, the proposed detector is an example of a more general approach to composite hypothesis testing, the generalized likelihood ratio test, which is described in the next section.
Composite Hypothesis Testing Approaches
Bayesian Approach
The Bayesian approach assigns prior PDFs to $\boldsymbol\theta_{0}$ and $\boldsymbol\theta_{1}$. In doing so it models the unknown parameters as realizations of a vector random variable. If the prior PDFs are denoted by $p\left(\boldsymbol\theta_{0}\right)$ and $p\left(\boldsymbol\theta_{1}\right)$, respectively, the PDFs of the data are
$$\begin{aligned} p\left(\mathbf{x} ; \mathcal{H}_{0}\right) &=\int p\left(\mathbf{x} \mid\boldsymbol{\theta}_{0} ; \mathcal{H}_{0}\right) p\left(\boldsymbol{\theta}_{0}\right) d \boldsymbol{\theta}_{0} \\ p\left(\mathbf{x} ; \mathcal{H}_{1}\right) &=\int p\left(\mathbf{x} \mid\boldsymbol{\theta}_{1} ; \mathcal{H}_{1}\right) p\left(\boldsymbol{\theta}_{1}\right) d \boldsymbol{\theta}_{1} \end{aligned}$$
where $p\left(\mathbf{x} \mid\boldsymbol\theta_{i} ; \mathcal{H}_{i}\right)$ is the conditional PDF of $\mathbf{x}$, conditioned on $\boldsymbol{\theta}_{i}$, assuming $\mathcal{H}_{i}$ is true. The unconditional PDFs $p\left(\mathbf{x} ; \mathcal{H}_{0}\right)$ and $p\left(\mathbf{x} ; \mathcal{H}_{1}\right)$ are now completely specified and no longer depend on the unknown parameters. With the Bayesian approach, the optimal NP detector decides $\mathcal{H}_{1}$ if
$$\frac{p\left(\mathbf{x} ; \mathcal{H}_{1}\right)}{p\left(\mathbf{x} ; \mathcal{H}_{0}\right)}=\frac{\int p\left(\mathbf{x}\mid \boldsymbol{\theta}_{1} ; \mathcal{H}_{1}\right) p\left(\boldsymbol{\theta}_{1}\right) d \boldsymbol{\theta}_{1}}{\int p\left(\mathbf{x} \mid\boldsymbol\theta_{0} ; \mathcal{H}_{0}\right) p\left(\boldsymbol\theta_{0}\right) d \boldsymbol{\theta}_{0}}>\gamma$$
- Need to choose the prior PDFs.
- The integration can be difficult.
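As a concrete illustration of the integration step, the sketch below evaluates the Bayesian likelihood ratio for the DC-level problem by Monte Carlo, under an assumed Gaussian prior $A \sim \mathcal N(0, \tau^2)$; the prior and all parameter values are assumptions chosen for illustration, not part of the text.

```python
# Monte Carlo evaluation of the Bayesian likelihood ratio for the DC-level
# problem. The Gaussian prior A ~ N(0, tau2) and all parameter values are
# illustrative assumptions.
import math
import random

random.seed(1)
N, sigma2, tau2 = 10, 1.0, 4.0

def loglike(x, A):
    # Log of p(x | A), up to a constant that cancels in the ratio
    return -sum((xi - A) ** 2 for xi in x) / (2 * sigma2)

def bayes_lr(x, draws=5000):
    # p(x; H1)/p(x; H0) = E_A[ p(x|A) ] / p(x|0); the expectation over the
    # prior is approximated by averaging over draws from the prior
    l0 = loglike(x, 0.0)
    acc = 0.0
    for _ in range(draws):
        A = random.gauss(0.0, math.sqrt(tau2))
        acc += math.exp(loglike(x, A) - l0)   # work with ratios for stability
    return acc / draws

x_noise = [random.gauss(0.0, 1.0) for _ in range(N)]         # H0 data
x_signal = [2.0 + random.gauss(0.0, 1.0) for _ in range(N)]  # H1 data, A = 2
print(bayes_lr(x_noise), bayes_lr(x_signal))
```

For this scalar problem with a Gaussian prior the integral actually has a closed form, but the Monte Carlo version shows the general recipe when it does not.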
Generalized Likelihood Ratio Test (GLRT)
The GLRT replaces the unknown parameters by their maximum likelihood estimates (MLEs). In general, the GLRT decides $\mathcal{H}_{1}$ if
$$L_{G}(\mathbf{x})=\frac{p\left(\mathbf{x} ; \hat{\boldsymbol\theta}_{1}, \mathcal{H}_{1}\right)}{p\left(\mathbf{x} ; \hat{\boldsymbol\theta}_{0}, \mathcal{H}_{0}\right)}>\gamma$$
where $\hat{\boldsymbol\theta}_{1}$ is the MLE of $\boldsymbol\theta_{1}$ assuming $\mathcal{H}_{1}$ is true (it maximizes $p\left(\mathbf{x} ; \boldsymbol\theta_{1}, \mathcal{H}_{1}\right)$), and $\hat{\boldsymbol\theta}_{0}$ is the MLE of $\boldsymbol\theta_{0}$ assuming $\mathcal{H}_{0}$ is true (it maximizes $p\left(\mathbf{x} ; \boldsymbol\theta_{0}, \mathcal{H}_{0}\right)$).
The GLRT can also be expressed in another form, which is sometimes more convenient. Since $\hat{\boldsymbol\theta}_{i}$ is the MLE under $\mathcal{H}_{i}$, it maximizes $p\left(\mathbf{x} ; \boldsymbol{\theta}_{i}, \mathcal{H}_{i}\right)$, or
$$p\left(\mathbf{x} ; \hat{\boldsymbol{\theta}}_{i}, \mathcal{H}_{i}\right)=\max _{\boldsymbol{\theta}_{i}} p\left(\mathbf{x} ; \boldsymbol{\theta}_{i}, \mathcal{H}_{i}\right)$$
Hence, $L_G(\mathbf x)$ can be written as
$$L_G(\mathbf x)=\frac{\max _{\boldsymbol{\theta}_{1}} p\left(\mathbf{x} ; \boldsymbol{\theta}_{1}, \mathcal{H}_{1}\right)}{\max _{\boldsymbol{\theta}_{0}} p\left(\mathbf{x} ; \boldsymbol{\theta}_{0}, \mathcal{H}_{0}\right)}$$
The approach also provides information about the unknown parameters, since the first step in determining $L_{G}(\mathbf{x})$ is to find the MLEs. We now continue the DC level in WGN example.
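The equivalence of the plug-in and maximization forms is easy to check numerically. In the sketch below (the data values and `sigma2` are made up for illustration), the GLRT for the known-variance DC-level problem is computed once by substituting the MLE $\hat A=\bar x$ and once by a grid search over $A$.

```python
# Two forms of the GLRT agree: plug in the MLE, or maximize the likelihood
# over A directly. Data and sigma2 are illustrative assumptions.
import math

x = [0.8, -0.2, 1.1, 0.5, 0.3]
N, sigma2 = len(x), 1.0

def logp(A):
    # Gaussian log-likelihood, up to a constant that cancels in the ratio
    return -sum((xi - A) ** 2 for xi in x) / (2 * sigma2)

xbar = sum(x) / N                                  # MLE of A under H1
lg_plugin = math.exp(logp(xbar) - logp(0.0))       # p(x; A_hat, H1) / p(x; H0)
lg_maximized = max(math.exp(logp(i / 1000) - logp(0.0))
                   for i in range(-3000, 3001))    # max over a fine grid of A
print(lg_plugin, lg_maximized)
```

The grid search is of course unnecessary when the MLE is available in closed form, but it mirrors the `max` form of $L_G(\mathbf x)$ above and is the fallback when it is not.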
Example: DC Level in WGN with Unknown Amplitude - GLRT
In this case we have $\boldsymbol\theta_{1}=A$ and there are no unknown parameters under $\mathcal{H}_{0}$. The hypothesis test becomes
$$\begin{array}{l} \mathcal{H}_{0}: A=0 \\ \mathcal{H}_{1}: A \neq 0 \end{array}$$
Thus, the GLRT decides $\mathcal{H}_{1}$ if
$$L_{G}(\mathbf{x})=\frac{p\left(\mathbf{x} ; \hat{A}, \mathcal{H}_{1}\right)}{p\left(\mathbf{x} ; \mathcal{H}_{0}\right)}>\gamma$$
The MLE of $A$ is found by maximizing
$$p\left(\mathbf{x} ; A, \mathcal{H}_{1}\right)=\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1}(x[n]-A)^{2}\right]$$
By differentiating the likelihood (or log-likelihood) function and setting the derivative to zero, we obtain the MLE $\hat{A}=\bar{x}$. Thus,
$$L_{G}(\mathbf{x})=\frac{\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1}(x[n]-\bar{x})^{2}\right]}{\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1} x^{2}[n]\right]}$$
Taking logarithms we have
$$\begin{aligned} \ln L_{G}(\mathbf{x}) &=-\frac{1}{2 \sigma^{2}}\left(\sum_{n=0}^{N-1} x^{2}[n]-2 \bar{x} \sum_{n=0}^{N-1} x[n]+N \bar{x}^{2}-\sum_{n=0}^{N-1} x^{2}[n]\right) \\ &=-\frac{1}{2 \sigma^{2}}\left(-2 N \bar{x}^{2}+N \bar{x}^{2}\right) \\ &=\frac{N \bar{x}^{2}}{2 \sigma^{2}} \end{aligned}$$
or we decide $\mathcal{H}_{1}$ if
$$|\bar{x}|>\gamma^{\prime}$$
This detector is identical to the realizable detector we looked at before, and its performance has already been given.
Example: DC Level in WGN with Unknown Amplitude and Variance - GLRT
Consider the detection problem
$$\begin{array}{ll} \mathcal{H}_{0}: x[n]=w[n] & n=0,1, \ldots, N-1 \\ \mathcal{H}_{1}: x[n]=A+w[n] & n=0,1, \ldots, N-1 \end{array}$$
where $A$ is unknown with $-\infty<A<\infty$ and $w[n]$ is WGN with unknown variance $\sigma^{2}$. A UMP test does not exist because the equivalent parameter test is
$$\begin{array}{l} \mathcal{H}_{0}: A=0,\ \sigma^{2}>0 \\ \mathcal{H}_{1}: A \neq 0,\ \sigma^{2}>0 \end{array}$$
which is two-sided. The GLRT decides $\mathcal{H}_{1}$ if
$$L_{G}(\mathbf{x})=\frac{p\left(\mathbf{x} ; \hat{A}, \hat{\sigma}_{1}^{2}, \mathcal{H}_{1}\right)}{p\left(\mathbf{x} ; \hat{\sigma}_{0}^{2}, \mathcal{H}_{0}\right)}>\gamma$$
where $[\hat A~~\hat{\sigma}_1^2]^T$ is the MLE of the vector parameter $\boldsymbol \theta_1=[A~~\sigma^2]^T$ under $\mathcal H_1$, and $\hat{\sigma}^2_0$ is the MLE of the parameter $\boldsymbol \theta_0=\sigma^2$ under $\mathcal H_0$. Note that we need to estimate the variance under both hypotheses.
Since
$$p\left(\mathbf{x} ; A, \sigma^{2}, \mathcal{H}_{1}\right)=\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1}(x[n]-A)^{2}\right]$$
$$p\left(\mathbf{x} ; \sigma^{2}, \mathcal{H}_{0}\right)=\frac{1}{\left(2 \pi \sigma^{2}\right)^{N/2}} \exp \left[-\frac{1}{2 \sigma^{2}} \sum_{n=0}^{N-1}x^{2}[n]\right]$$
similarly to before, we have
$$\hat A=\bar x,\qquad \hat {\sigma}^2_1=\frac{1}{N}\sum_{n=0}^{N-1}(x[n]-\bar x)^2,\qquad \hat {\sigma}^2_0=\frac{1}{N}\sum_{n=0}^{N-1}x^2[n]$$
Thus the GLRT becomes
$$L_{G}(\mathbf{x})=\left(\frac{\hat{\sigma}_{0}^{2}}{\hat{\sigma}_{1}^{2}}\right)^{N / 2}$$
In essence, the GLRT decides $\mathcal H_1$ if the fit of the signal $\hat A= \bar x$ to the data produces a much smaller error, as measured by $\hat {\sigma}^2_1=(1/N)\sum_{n=0}^{N-1}(x[n]-\hat A)^2$, than the fit of no signal, as measured by $\hat {\sigma}^2_0=(1/N)\sum_{n=0}^{N-1}x^2[n]$. A slightly more intuitive form can be found as follows. Since
$$\hat{\sigma}_{1}^{2}=\frac{1}{N}\sum_{n=0}^{N-1}x^2[n]-\bar x^2=\hat{\sigma}_{0}^{2}-\bar x ^2$$
we have
$$2\ln L_G(\mathbf x)=N\ln \left(\frac{\hat{\sigma}_{1}^{2}+\bar x^2}{\hat{\sigma}_{1}^{2}} \right)=N\ln \left(1+\frac{\bar x^2}{\hat{\sigma}_{1}^{2}} \right)$$
Since $\ln(1 + x)$ is monotonically increasing in $x$, an equivalent test statistic is
$$T(\mathbf{x})=\frac{\hat{A}^{2}}{\hat{\sigma}_{1}^{2}}=\frac{\bar{x}^{2}}{\hat{\sigma}_{1}^{2}}$$
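The whole derivation can be condensed into a few lines of code. The sketch below (the data values are illustrative assumptions) computes the MLEs, checks the identity $\hat\sigma_1^2=\hat\sigma_0^2-\bar x^2$ used above, and evaluates both $2\ln L_G(\mathbf x)$ and the equivalent statistic.

```python
# Unknown-amplitude, unknown-variance GLRT: compute the MLEs and the
# equivalent statistic T(x) = xbar^2 / sigma1_hat^2. Data are illustrative.
import math

x = [1.2, 0.7, 1.5, 0.9, 1.1, 0.4, 1.3, 0.8]
N = len(x)

xbar = sum(x) / N                                    # MLE of A under H1
sigma1_hat = sum((xi - xbar) ** 2 for xi in x) / N   # MLE of variance under H1
sigma0_hat = sum(xi ** 2 for xi in x) / N            # MLE of variance under H0

two_ln_LG = N * math.log(sigma0_hat / sigma1_hat)    # 2 ln L_G(x)
T = xbar ** 2 / sigma1_hat                           # equivalent monotone statistic
print(two_ln_LG, T)
```

Large `T` means the signal fit explains much more of the data energy than the noise-only fit, which is exactly the intuition given above.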