Minimum Variance Unbiased Estimation (MVU)

References:
Kay, S. M., Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory. Prentice Hall PTR, 1993 (Chapter 2).
Lecture slides of ET4386, TUD.

An Example


Consider a process, e.g., a constant in noise:
$$x[n] = A + w[n], \quad n = 0, \ldots, N-1$$
where we assume that

  • $A$ is deterministic and unknown,
  • $w[n]$ is a zero-mean random process with variance $\sigma^2$,
  • $x[n]$ is the measured data.

Potential estimators for $A$:

  • $\hat{A}_1 = x[0]$
  • $\hat{A}_2 = \frac{1}{N}\sum_{n=0}^{N-1} x[n]$
  • $\hat{A}_3 = \frac{a}{N}\sum_{n=0}^{N-1} x[n]$, for some constant $a$
  • $\cdots$

Which estimator is good (or optimal)?
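
As a quick numerical illustration (a minimal sketch, not from the reference; the Gaussian noise and the particular values of $A$, $\sigma$, $N$ are assumptions made only for the simulation), the first two candidates can be compared by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
A, sigma, N, trials = 5.0, 2.0, 100, 10_000   # assumed values for illustration

# x[n] = A + w[n]; Gaussian w[n] is assumed here purely for simulation
x = A + sigma * rng.standard_normal((trials, N))

A1 = x[:, 0]                 # \hat{A}_1 = x[0]
A2 = x.mean(axis=1)          # \hat{A}_2 = sample mean

for name, est in [("A1 (single sample)", A1), ("A2 (sample mean)", A2)]:
    print(f"{name}: mean = {est.mean():.3f}, variance = {est.var():.4f}")
# Both are unbiased (mean close to A), but the sample mean has variance about sigma^2/N.
```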

Mean Square Error Criterion

In searching for optimal estimators we need to adopt some optimality criterion. A natural one is the mean square error (MSE), defined as
$$\mathrm{mse}(\hat\theta) = E\left[(\hat\theta - \theta)^2\right]$$
To get more insight, we can rewrite the MSE as
$$\begin{aligned} \mathrm{mse}(\hat\theta) &= E\left[\left(\hat\theta - E(\hat\theta) + E(\hat\theta) - \theta\right)^2\right] \\ &= E\left[\left(\hat\theta - E(\hat\theta)\right)^2\right] + \left[E(\hat\theta) - \theta\right]^2 \\ &= \operatorname{var}(\hat\theta) + b^2(\theta) \end{aligned}$$
(the cross term vanishes because $E(\hat\theta) - \theta$ is a constant and $E[\hat\theta - E(\hat\theta)] = 0$), which shows that the MSE is composed of errors due to the variance of the estimator as well as the bias. Unfortunately, adopting this natural criterion generally leads to unrealizable estimators, i.e., estimators that cannot be written solely as a function of the data.
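
The decomposition is easy to verify numerically (a minimal sketch; the Gaussian noise and the choice $a = 0.9$, $A$, $\sigma$, $N$ are assumptions for illustration), here for the scaled sample mean $\hat{A}_3 = \frac{a}{N}\sum_n x[n]$:

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma, N, a, trials = 5.0, 2.0, 100, 0.9, 100_000  # assumed values

x = A + sigma * rng.standard_normal((trials, N))       # Gaussian noise assumed
est = a * x.mean(axis=1)                               # scaled sample mean

mse = np.mean((est - A) ** 2)
var = est.var()
bias2 = (est.mean() - A) ** 2
print(f"mse = {mse:.4f},  var + bias^2 = {var + bias2:.4f}")
# Theory: var = a^2 sigma^2 / N = 0.0324, bias^2 = (a-1)^2 A^2 = 0.25, mse ~ 0.2824
```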

For instance, consider the estimator
$$\check{A} = a\,\frac{1}{N}\sum_{n=0}^{N-1} x[n]$$
for our example, with some constant $a$. We will attempt to find the $a$ which results in the minimum MSE. Since $E(\check{A}) = aA$ and $\operatorname{var}(\check{A}) = a^2\sigma^2/N$, we have
$$\operatorname{mse}(\check{A}) = \frac{a^2\sigma^2}{N} + (a-1)^2 A^2$$
Differentiating the MSE with respect to $a$ yields
$$\frac{d\,\operatorname{mse}(\check{A})}{da} = \frac{2a\sigma^2}{N} + 2(a-1)A^2,$$
which, upon setting to zero and solving, yields the optimum value
$$a_{\mathrm{opt}} = \frac{A^2}{A^2 + \sigma^2/N}$$
The optimal value of $a$ is seen to depend on the unknown parameter $A$; the estimator is therefore not realizable.
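
The dependence of $a_{\mathrm{opt}}$ on the unknown $A$ is easy to see numerically (a sketch; the values of $\sigma^2$, $N$ and the trial values of $A$ below are assumptions for illustration):

```python
sigma2, N = 4.0, 100           # assumed noise variance and record length

def a_opt(A):
    """Optimal scaling a_opt = A^2 / (A^2 + sigma^2/N); it depends on the unknown A."""
    return A**2 / (A**2 + sigma2 / N)

for A in (0.1, 1.0, 10.0):
    print(f"A = {A:5.1f}  ->  a_opt = {a_opt(A):.4f}")
# The minimizing a changes with A, so this "minimum MSE" estimator cannot be
# implemented without already knowing the parameter it is supposed to estimate.
```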

From a practical viewpoint the minimum MSE estimator needs to be abandoned. An alternative approach is to constrain the bias to be zero and find the estimator which minimizes the variance. Such an estimator is termed the minimum variance unbiased (MVU) estimator.

Minimum Variance Unbiased Estimator

Constraining the bias to be zero, i.e., requiring $E(\hat\theta) = \theta$, gives
$$\mathrm{mse}(\hat\theta) = E\left[\left(\hat\theta - E(\hat\theta)\right)^2\right] + \left(E(\hat\theta) - \theta\right)^2 = \operatorname{var}(\hat\theta)$$
If, in addition, the unbiased estimator $\hat\theta$ satisfies
$$\operatorname{var}(\hat\theta) \leq \operatorname{var}(\tilde\theta)$$
for any other unbiased estimator $\tilde\theta$ and for all $\theta$, then $\hat\theta$ is the minimum variance unbiased (MVU) estimator.

For the example, consider the more general linear estimator
$$\hat{A} = \sum_{n=0}^{N-1} a_n x[n]$$
Since $E(\hat{A}) = A\sum_{n=0}^{N-1} a_n$, unbiasedness requires
$$\sum_{n=0}^{N-1} a_n = 1$$
Assuming the noise samples are uncorrelated, the variance of $\hat{A}$ is
$$\operatorname{var}(\hat{A}) = \sum_{n=0}^{N-1} a_n^2 \operatorname{var}(x[n]) = \sigma^2 \sum_{n=0}^{N-1} a_n^2$$
We minimize this variance using a Lagrange multiplier, with unbiasedness as the constraint. Let
$$L(\mathbf{a}, \lambda) = \sigma^2 \mathbf{a}^T \mathbf{a} - \lambda\left(\mathbf{1}^T \mathbf{a} - 1\right)$$
Differentiating $L$ with respect to $\mathbf{a}$ and setting the result to zero gives
$$2\sigma^2 \mathbf{a} - \lambda \mathbf{1} = \mathbf{0}$$
Combining this with the constraint $\sum_{n=0}^{N-1} a_n = 1$, we obtain
$$\mathbf{a} = \frac{1}{N}\mathbf{1},$$
i.e.,
$$\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]$$
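
A numerical sanity check of this constrained minimization (a sketch, not from the reference; it simply compares the variance $\sigma^2\sum_n a_n^2$ of random weight vectors satisfying $\sum_n a_n = 1$ against the uniform weights $a_n = 1/N$):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2, N = 4.0, 20            # assumed values

def weight_variance(a):
    """Variance sigma^2 * sum(a_n^2) of the linear estimator with weights a."""
    return sigma2 * np.sum(a**2)

uniform = np.full(N, 1.0 / N)
best_random = np.inf
for _ in range(10_000):
    a = rng.standard_normal(N)
    a /= a.sum()               # enforce the unbiasedness constraint sum(a) = 1
    best_random = min(best_random, weight_variance(a))

print(f"uniform weights : {weight_variance(uniform):.4f}   (= sigma^2/N)")
print(f"best random set : {best_random:.4f}")
# No admissible weight vector beats a_n = 1/N, consistent with the Lagrangian solution.
```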


Existence of the Minimum Variance Unbiased Estimator

The question arises as to whether a MVU estimator exists, i.e., an unbiased estimator with minimum variance for all θ \theta θ.


In general, an MVU estimator does not always exist.

Another example: given a single observation $x[0]$ from the distribution $\mathcal{U}[0, 1/\theta]$, it is desired to estimate $\theta$, where it is assumed that $\theta > 0$. For an unbiased estimator $\hat\theta = g(x[0])$ we must have
$$\int_0^{1/\theta} \theta\, g(u)\, du = \theta \iff \int_0^{1/\theta} g(u)\, du = 1$$
Assume that we can find a function $g(u)$ such that this condition is satisfied for all $\theta > 0$. Then for any $\theta_1 > \theta_2 > 0$ we would have
$$\int_0^{1/\theta_1} g(u)\, du = 1, \quad \int_0^{1/\theta_2} g(u)\, du = 1 \;\Longrightarrow\; \int_{1/\theta_1}^{1/\theta_2} g(u)\, du = 0$$
Since this must hold for every such pair, $g(u)$ must vanish almost everywhere on $(0, \infty)$, which contradicts the unbiasedness condition. Hence no unbiased estimator of $\theta$ exists in this problem, let alone an MVU estimator.


Finding the Minimum Variance Unbiased Estimator

Even if a MVU estimator exists, we may not be able to find it. In the next few chapters we shall discuss several possible approaches. They are:

  1. Determine the Cramér-Rao lower bound (CRLB) and check to see if some estimator satisfies it (Chapters 3 and 4).
  2. Apply the Rao-Blackwell-Lehmann-Scheffé (RBLS) theorem (Chapter 5).
  3. Further restrict the class of estimators to be not only unbiased but also linear. Then, find the minimum variance estimator within this restricted class (Chapter 6).

Appendix: Some Useful Supplements

That an estimator is unbiased does not necessarily mean it is a good estimator; it only guarantees that on average it attains the true value. Biased estimators, on the other hand, are characterized by a systematic error, which presumably should not be present, and a persistent bias will always result in a poor estimator.


It sometimes occurs that multiple estimates of the same parameter are available, say $\{\hat\theta_1, \hat\theta_2, \cdots, \hat\theta_n\}$. A reasonable procedure is to combine them into a (hopefully) better estimate by averaging:
$$\hat\theta = \frac{1}{n}\sum_{i=1}^{n} \hat\theta_i$$
Assuming the estimators are unbiased, have the same variance, and are uncorrelated with each other,
$$E(\hat\theta) = \theta, \quad \operatorname{var}(\hat\theta) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{var}(\hat\theta_i) = \frac{\operatorname{var}(\hat\theta_1)}{n}$$
so that as more estimates are averaged, the variance decreases; ultimately, as $n \to \infty$, $\hat\theta \to \theta$. However, if the individual estimators are biased, then no matter how many of them are averaged, $\hat\theta$ will not converge to the true value.
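
A short simulation of this averaging argument (a sketch; taking the individual unbiased estimates to be independent Gaussian "measurements" of $\theta$, and the numerical values below, are assumptions made only for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(3)
theta, var_single, trials = 2.0, 1.0, 50_000    # assumed values

for n in (1, 10, 100):
    # n independent unbiased estimates per trial, each with variance var_single
    estimates = theta + np.sqrt(var_single) * rng.standard_normal((trials, n))
    combined = estimates.mean(axis=1)
    print(f"n = {n:3d}:  mean = {combined.mean():.3f},  var = {combined.var():.4f}"
          f"  (theory {var_single / n:.4f})")
```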


The PDF of $\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]$ in the example is $\mathcal{N}(A, \sigma^2/N)$:

If we additionally assume Gaussian noise, $w[n] \sim \mathcal{N}(0, \sigma^2)$, then $x[n] \sim \mathcal{N}(A, \sigma^2)$. Since the $x[n]$ are independent of each other, $\hat{A}$ is a linear combination of independent Gaussians and is therefore Gaussian itself. It is easy to verify that $E(\hat{A}) = A$ and $\operatorname{var}(\hat{A}) = \sigma^2/N$. Thus
$$\hat{A} \sim \mathcal{N}(A, \sigma^2/N)$$
The estimator can be shown to be consistent, i.e., $\hat{A} \to A$ in probability as $N \to \infty$, by showing that
$$\lim_{N\to\infty} \Pr\{|\hat{A} - A| > \epsilon\} = 0$$
for any $\epsilon > 0$. Since
$$\frac{\hat{A} - A}{\sqrt{\sigma^2/N}} \sim \mathcal{N}(0, 1),$$
we have
$$\lim_{N\to\infty} \Pr\{|\hat{A} - A| > \epsilon\} = \lim_{N\to\infty} \Pr\left\{\left|\frac{\hat{A} - A}{\sqrt{\sigma^2/N}}\right| > \frac{\epsilon}{\sqrt{\sigma^2/N}}\right\} = 0,$$
because the threshold $\epsilon/\sqrt{\sigma^2/N} = \epsilon\sqrt{N}/\sigma$ grows without bound as $N \to \infty$.
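
The same limit can be evaluated numerically (a sketch using the standard normal CDF written via `math.erf`; the values of $\sigma^2$ and $\epsilon$ are assumptions):

```python
import math

sigma2, eps = 4.0, 0.1          # assumed noise variance and tolerance

def std_normal_cdf(x):
    """Phi(x) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

for N in (10, 100, 1000, 10_000):
    # Pr{|A_hat - A| > eps} = 2 * Phi(-eps / sqrt(sigma^2 / N))
    p = 2.0 * std_normal_cdf(-eps / math.sqrt(sigma2 / N))
    print(f"N = {N:6d}:  Pr{{|A_hat - A| > eps}} = {p:.3e}")
# The probability decays to zero as N grows, i.e. the sample mean is consistent.
```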


A probabilistic perspective of minimum variance:

Two unbiased estimators are proposed whose variances satisfy $\operatorname{var}(\hat\theta) < \operatorname{var}(\check\theta)$. If both estimators are Gaussian, prove that
$$\Pr\{|\hat\theta - \theta| > \epsilon\} < \Pr\{|\check\theta - \theta| > \epsilon\}$$
for any $\epsilon > 0$. This says that the estimator with the smaller variance is to be preferred, since its PDF is more concentrated about the true value.

Since both estimators are unbiased and Gaussian,
$$\frac{\hat\theta - \theta}{\sqrt{\operatorname{var}(\hat\theta)}} \sim \mathcal{N}(0, 1), \quad \frac{\check\theta - \theta}{\sqrt{\operatorname{var}(\check\theta)}} \sim \mathcal{N}(0, 1)$$
Let $\Phi(x)$ denote the cumulative distribution function of $\mathcal{N}(0, 1)$:
$$\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}t^2}\, dt$$
Then
$$\Pr\{|\hat\theta - \theta| > \epsilon\} = \Pr\left\{\left|\frac{\hat\theta - \theta}{\sqrt{\operatorname{var}(\hat\theta)}}\right| > \frac{\epsilon}{\sqrt{\operatorname{var}(\hat\theta)}}\right\} = 2\,\Phi\!\left(\frac{-\epsilon}{\sqrt{\operatorname{var}(\hat\theta)}}\right)$$
Since $\operatorname{var}(\hat\theta) < \operatorname{var}(\check\theta)$ and $\Phi$ is increasing,
$$\Phi\!\left(\frac{-\epsilon}{\sqrt{\operatorname{var}(\hat\theta)}}\right) < \Phi\!\left(\frac{-\epsilon}{\sqrt{\operatorname{var}(\check\theta)}}\right),$$
i.e., $\Pr\{|\hat\theta - \theta| > \epsilon\} < \Pr\{|\check\theta - \theta| > \epsilon\}$.
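
The two tail probabilities can also be evaluated directly (a sketch; the variances and $\epsilon$ below are assumed for illustration, and $\Phi$ is again written via `math.erf`):

```python
import math

def tail_prob(variance, eps):
    """Pr{|theta_hat - theta| > eps} = 2 * Phi(-eps / sqrt(variance)) for a Gaussian unbiased estimator."""
    phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return 2.0 * phi(-eps / math.sqrt(variance))

eps = 0.5                                  # assumed tolerance
var_hat, var_check = 0.1, 0.4              # var(theta_hat) < var(theta_check), assumed values
print(f"low-variance estimator : {tail_prob(var_hat, eps):.4f}")
print(f"high-variance estimator: {tail_prob(var_check, eps):.4f}")
# The lower-variance estimator concentrates more probability within +/- eps of theta.
```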


What happens if an unbiased estimator undergoes a nonlinear transformation? For instance, if we choose to estimate the unknown parameter $\theta = A^2$ by
$$\hat\theta = \left(\frac{1}{N}\sum_{n=0}^{N-1} x[n]\right)^2,$$
can we say that the estimator is unbiased? What happens as $N \to \infty$?

We know that
$$\hat\theta = \hat{A}^2, \quad \hat{A} \sim \mathcal{N}(A, \sigma^2/N)$$
Therefore,
$$E(\hat\theta) = E(\hat{A}^2) = \operatorname{var}(\hat{A}) + E^2(\hat{A}) = \sigma^2/N + A^2 = \sigma^2/N + \theta \neq \theta,$$
so the estimator is biased, but asymptotically unbiased.
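
A Monte Carlo check of this bias (a minimal sketch; Gaussian noise and the numerical values are again assumptions made only for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(4)
A, sigma, trials = 3.0, 2.0, 100_000         # assumed values

for N in (10, 100):
    x = A + sigma * rng.standard_normal((trials, N))
    theta_hat = x.mean(axis=1) ** 2           # (sample mean)^2 estimates theta = A^2
    print(f"N = {N:4d}:  E(theta_hat) ~ {theta_hat.mean():.4f}"
          f"   theory A^2 + sigma^2/N = {A**2 + sigma**2 / N:.4f}")
# The bias sigma^2/N shrinks as N grows: biased, but asymptotically unbiased.
```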


In our example, if the value of $\sigma^2$ is also unknown, an unbiased estimator of $\boldsymbol\theta = [A,\ \sigma^2]^T$ is
$$\hat{\boldsymbol\theta} = \begin{bmatrix}\hat{A}\\[2pt] \hat\sigma^2\end{bmatrix} = \begin{bmatrix}\frac{1}{N}\sum_{n=0}^{N-1} x[n]\\[4pt] \frac{1}{N-1}\sum_{n=0}^{N-1}\left(x[n] - \hat{A}\right)^2\end{bmatrix}$$
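
A minimal sketch of this joint estimator (Gaussian noise and the numerical values are assumptions; the run also checks unbiasedness by Monte Carlo):

```python
import numpy as np

rng = np.random.default_rng(5)
A, sigma2, N, trials = 5.0, 4.0, 50, 100_000     # assumed values

x = A + np.sqrt(sigma2) * rng.standard_normal((trials, N))
A_hat = x.mean(axis=1)                                             # sample mean
sigma2_hat = np.sum((x - A_hat[:, None]) ** 2, axis=1) / (N - 1)   # note the N-1 divisor

print(f"E(A_hat)      ~ {A_hat.mean():.4f}   (true A       = {A})")
print(f"E(sigma2_hat) ~ {sigma2_hat.mean():.4f}   (true sigma^2 = {sigma2})")
# Dividing by N instead of N-1 would give E(sigma2_hat) = (N-1)/N * sigma^2, i.e. a biased estimate.
```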
