An Introduction to Unbiased Estimation and Minimum Variance Unbiased Estimation

Latest recommended article published on 2023-10-13 23:12:19

Turbo-shengsong

Categories: Mathematical Foundations; Information and Communication. Tags: Probability Theory

Article link: https://blog.csdn.net/weixin_43413559/article/details/126042228

Unbiased Estimation and Minimum Variance Unbiased Estimation

Key terms:
Unbiased estimation
Minimum variance unbiased (MVU) estimation

Preface

Before we begin, we need to review some basic concepts.

(1) What is parameter estimation?
Mathematically, we have a data set $\{x[0],x[1],\cdots,x[N-1]\}$ of $N$ points whose distribution depends on a parameter $\theta$. We want to estimate $\theta$ from these $N$ points. Formally, we define an estimator
$\hat{ \theta} = g \left( x[0] ,x[1],\cdots, x[N-1] \right) \tag{1}$

where $g$ is a function, so an estimator is simply a function of the data. This is the essence of the parameter estimation problem.
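To make the idea of an estimator as a function of the data concrete, here is a minimal Python sketch. It assumes a data model that is not specified in the text, $x[n] = \theta + w[n]$ with Gaussian noise $w[n]$, and it picks the sample mean as one possible choice of $g$. The names `sample_mean_estimator` and `draw_data` are purely illustrative.

```python
import random
from statistics import fmean


def sample_mean_estimator(x):
    """One possible g(.): maps the data x[0..N-1] to an estimate of theta."""
    return fmean(x)


def draw_data(theta, sigma, n, rng):
    """Assumed data model (illustrative): x[n] = theta + w[n], w[n] ~ N(0, sigma^2)."""
    return [theta + rng.gauss(0.0, sigma) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
theta_true, sigma, n = 1.0, 1.0, 50

# The same function g applied to three different realizations of the data
# gives three different estimates.
estimates = [
    sample_mean_estimator(draw_data(theta_true, sigma, n, rng)) for _ in range(3)
]
print(estimates)
```

The function $g$ is fixed, but its output changes with every new realization of the data.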

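Note: the estimator $\hat{\theta}$ is a random variable. The data are inherently random and, as Eq. (1) shows, $\hat{\theta}$ is obtained by applying a fixed mapping to several random variables, so $\hat{\theta}$ is itself a random variable. (An estimate, by contrast, is the value of $\hat{\theta}$ obtained for a given realization of $\boldsymbol{x}$.) It is also important to distinguish $\hat{\theta}$ from $\theta$: $\hat{\theta}$ is always a random variable, whereas whether $\theta$ is treated as a random variable divides estimation problems into two types, classical and Bayesian, described under (2).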

(2) Mathematically, how should we view the parameter estimation problem as a whole?
Viewed as a whole, the parameter estimation problem breaks down into two main steps:
Step-1: First, model the data. Because the data are inherently random, we describe them with a probability density function (PDF), written $p\left( x[0] ,x[1],\cdots, x[N-1]; \theta \right)$. We interpret it as follows: the PDF is parameterized by the unknown parameter $\theta$, i.e., we have a class (family) of PDFs where each one is different due to a different value of $\theta$.

Example: if $\theta$ denotes the mean and we have a single observation $x[0]$, the PDF describing the data might be
$p(x[0]; \theta) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left ( - \frac{1}{2 \sigma^2} (x[0] - \theta)^2 \right) \tag{2}$
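The "family of PDFs" reading of Eq. (2) can be sketched in a few lines of Python: the same observation $x[0]$ is assigned a different density for each candidate value of $\theta$. The function name `gaussian_pdf`, the default $\sigma^2 = 1$, and the candidate values of $\theta$ are assumptions made only for this illustration.

```python
import math


def gaussian_pdf(x0, theta, sigma2=1.0):
    """Density of a single observation x0 under N(theta, sigma2), as in Eq. (2)."""
    return math.exp(-((x0 - theta) ** 2) / (2.0 * sigma2)) / math.sqrt(
        2.0 * math.pi * sigma2
    )


x0 = 0.5  # one fixed observation
for theta in (0.0, 0.5, 2.0):
    # Each theta selects a different member of the family of PDFs.
    print(f"theta = {theta}: p(x[0]; theta) = {gaussian_pdf(x0, theta):.4f}")
```

The density is largest for the candidate $\theta$ closest to the observed value, which is why the observed data carry information about $\theta$.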

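In practice we are usually not given a PDF, so we must choose one that is both consistent with the constraints of the problem and mathematically tractable. The choice matters, because the performance of any estimator depends strongly on the assumed PDF.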

In general, estimation approaches fall into two classes
$\begin{cases} \text{classical estimation: the parameters of interest are assumed to be deterministic but unknown} \\ \text{Bayesian estimation: the parameter we are attempting to estimate is viewed as a realization of the random variable $\theta$} \end{cases}$

To distinguish it from the form used in Eq. (2), in Bayesian estimation the data are described by the joint PDF
$p(\boldsymbol{x}, \theta) = p(\boldsymbol{x}| \theta) p(\theta) \tag{3}$

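where $p(\theta)$ is the prior PDF. The prior makes the difference between the two approaches explicit: it comes down to whether $\theta$ is treated as a random variable, and only a random $\theta$ has a prior.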

We must also distinguish between the two PDFs $p(\boldsymbol{x}; \theta)$ and $p(\boldsymbol{x}|\theta)$:
$\begin{cases} p(\boldsymbol{x}; \theta): \text{a family of PDFs} \\ p(\boldsymbol{x}|\theta): \text{a conditional PDF} \end{cases}$

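Step-2: Once the PDF is specified, the problem becomes one of determining an estimator of the form of Eq. (1) based on that PDF. Note: an estimator may also depend on other parameters, but only if they are known.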

（3） Important Points
An estimator is a random variable. As such, its performance can only be completely described statistically or by its PDF.

Unbiased Estimation

We focus on estimating parameters that are unknown but deterministic.

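Definition of an unbiased estimator
For an unknown but deterministic parameter $\theta$, the estimator $\hat{\theta}$ is called unbiased if it satisfies
$\mathbb{E}_{p(\boldsymbol{x}; \theta)} (\hat{\theta}) = \theta \ \ \ \ \forall \theta \tag{4}$

where $\hat{\theta}=g \left( x[0] ,x[1],\cdots, x[N-1] \right)$. In other words, the estimator yields the true value on average for every possible value of $\theta$.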

More explicitly:
$\begin{aligned} \mathbb{E}_{p(\boldsymbol{x}; \theta)} (\hat{\theta}) &= \int \hat{\theta} p(\boldsymbol{x}; \theta) d \boldsymbol{x} \\ &= \int g \left(\boldsymbol{x} \right) p(\boldsymbol{x}; \theta) d \boldsymbol{x} \\ &= \theta, \ \ \forall \theta \end{aligned} \tag{5}$

A biased estimator is described by
$\mathbb{E}(\hat{\theta}) = \theta + b (\theta) \tag{6}$

where $b(\theta) = \mathbb{E}(\hat{\theta}) - \theta$ is called the bias of the estimator.
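The bias of an estimator can be approximated by Monte Carlo simulation. The sketch below assumes the illustrative model $x[n] = \theta + w[n]$ with Gaussian noise (not specified in the text) and compares two hypothetical estimators: the sample mean, which is unbiased under this model, and a deliberately shrunk version $0.5\,\bar{x}$, whose bias is $b(\theta) = -0.5\,\theta$.

```python
import random
from statistics import fmean

rng = random.Random(1)  # fixed seed so the run is repeatable
theta_true, sigma, n, trials = 2.0, 1.0, 10, 20000

unbiased, shrunk = [], []
for _ in range(trials):
    # One realization of the data under the assumed model.
    x = [theta_true + rng.gauss(0.0, sigma) for _ in range(n)]
    xbar = fmean(x)
    unbiased.append(xbar)      # E[xbar] = theta, so b(theta) = 0
    shrunk.append(0.5 * xbar)  # E[0.5 * xbar] = 0.5 * theta, so b(theta) = -0.5 * theta

# Empirical bias: average of the estimates minus the true parameter.
bias_unbiased = fmean(unbiased) - theta_true
bias_shrunk = fmean(shrunk) - theta_true
print(f"empirical bias of sample mean: {bias_unbiased:+.4f}")
print(f"empirical bias of 0.5 * mean:  {bias_shrunk:+.4f}")
```

Note that the shrunk estimator is unbiased only at $\theta = 0$, which is why the definition in Eq. (4) requires the condition to hold for all $\theta$.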

Minimum Variance Criterion

In searching for an optimal estimator we need an optimality criterion. A natural one is the mean square error (MSE), defined as
$\text{mse} (\hat{\theta}) = \mathbb{E} \left [ (\hat{ \theta} - \theta)^2 \right ] \tag{7}$

Unfortunately, adopting this natural criterion generally leads to unrealizable estimators, that is, estimators that cannot be written solely as a function of the data. To see why, we rewrite the MSE as
$\begin{aligned} \text{mse} (\hat{\theta}) &= \mathbb{ E} \left \{ \left[ (\hat{\theta} - \mathbb{E}[\hat{\theta}]) + ( \mathbb{E}[\hat{\theta}] - \theta) \right]^2 \right \} \\ &= \text{var}[\hat{\theta}] + \left [\mathbb{E}[\hat{\theta}] - \theta \right ]^2 \\ &= \text{var} [\hat{\theta}] + b^2 (\theta) \end{aligned} \tag{8}$

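Eq. (8) shows that the MSE is the sum of the variance of the estimator and its squared bias (the cross term vanishes because $\mathbb{E}\left[\hat{\theta} - \mathbb{E}[\hat{\theta}]\right] = 0$). Designing an estimator under the minimum MSE criterion therefore amounts to minimizing $\text{var} [\hat{\theta}] + b^2 (\theta)$. Because the bias term generally depends on the unknown $\theta$, so does the minimizing estimator, which makes it unrealizable.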
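A short worked example may help show where the dependence on $\theta$ comes from. It assumes a model that is not stated in the text: $x[n] = \theta + w[n]$, where $w[n]$ is zero-mean white noise with variance $\sigma^2$. Consider the scaled sample mean, with a constant $a$ to be chosen:
$\check{\theta} = a \cdot \frac{1}{N} \sum_{n=0}^{N-1} x[n]$

Its mean and variance are
$\mathbb{E}(\check{\theta}) = a \theta, \qquad \text{var}(\check{\theta}) = \frac{a^2 \sigma^2}{N}$

so by Eq. (8)
$\text{mse}(\check{\theta}) = \frac{a^2 \sigma^2}{N} + (a-1)^2 \theta^2$

Setting the derivative with respect to $a$ to zero gives
$\frac{2 a \sigma^2}{N} + 2 (a-1) \theta^2 = 0 \ \ \Rightarrow \ \ a_{\text{opt}} = \frac{\theta^2}{\theta^2 + \sigma^2 / N}$

The optimal scale factor depends on the unknown $\theta$, so under this assumed model the minimum MSE estimator cannot be computed from the data alone.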

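Alternatively, if we constrain the estimator to be unbiased, minimizing the MSE is equivalent to minimizing the variance. Such an estimator is called the minimum variance unbiased (MVU) estimator.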
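The following sketch illustrates why variance is the natural tie-breaker among unbiased estimators. It assumes the illustrative model $x[n] = \theta + w[n]$ with Gaussian noise and compares two hypothetical unbiased estimators: the first sample $x[0]$ alone and the sample mean. Under this model their variances are $\sigma^2$ and $\sigma^2/N$, respectively. The sketch only compares these two candidates and does not by itself establish that either one is the MVU estimator.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(2)  # fixed seed so the run is repeatable
theta_true, sigma, n, trials = 2.0, 1.0, 10, 20000

first_sample, sample_mean = [], []
for _ in range(trials):
    # One realization of the data under the assumed model.
    x = [theta_true + rng.gauss(0.0, sigma) for _ in range(n)]
    first_sample.append(x[0])     # unbiased, variance sigma^2
    sample_mean.append(fmean(x))  # unbiased, variance sigma^2 / N

# Empirical variance of each estimator across the Monte Carlo trials.
var_first = pvariance(first_sample)
var_mean = pvariance(sample_mean)
print(f"x[0]:        mean = {fmean(first_sample):.3f}, variance = {var_first:.3f}")
print(f"sample mean: mean = {fmean(sample_mean):.3f}, variance = {var_mean:.3f}")
```

Both estimators are centered on the true value, but the sample mean is far more tightly concentrated around it.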

Finding the Minimum Variance Unbiased Estimator

In fact, even if the MVU estimator exists, we may not be able to find it. There are three possible approaches:

1. Determine the Cramer-Rao lower bound (CRLB) and check to see if some estimator satisfies it.
2. Apply the Rao-Blackwell-Lehmann-Scheffe (RBLS) theorem.
3. Further restrict the class of estimators to be not only unbiased but also linear. Then, find the minimum variance estimator within this restricted class.

Approaches 1 and 2 may produce the MVU estimator, while approach 3 will yield it only if the MVU estimator is linear in the data.
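As a hedged illustration of approach 1, assume a model that is not stated in the text: $x[n] = \theta + w[n]$, where $w[n]$ is white Gaussian noise with known variance $\sigma^2$. Under the usual regularity conditions, the CRLB for any unbiased estimator of a scalar $\theta$ is
$\text{var}(\hat{\theta}) \geq \frac{1}{-\mathbb{E}\left[ \frac{\partial^2 \ln p(\boldsymbol{x}; \theta)}{\partial \theta^2} \right]}$

For the assumed model,
$\ln p(\boldsymbol{x}; \theta) = -\frac{N}{2} \ln (2 \pi \sigma^2) - \frac{1}{2 \sigma^2} \sum_{n=0}^{N-1} (x[n] - \theta)^2, \qquad \frac{\partial^2 \ln p(\boldsymbol{x}; \theta)}{\partial \theta^2} = -\frac{N}{\sigma^2}$

so the bound is
$\text{var}(\hat{\theta}) \geq \frac{\sigma^2}{N}$

The sample mean $\hat{\theta} = \frac{1}{N} \sum_{n=0}^{N-1} x[n]$ is unbiased and has variance $\sigma^2 / N$, so it attains the bound and is therefore the MVU estimator for this particular model.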