Bias & variance and mean squared error

IT_Vitamin

bias & variance

Estimator: a function of the data that is used to infer the value of an unknown parameter in a statistical model, written $\hat{\theta}(X)$. An estimator is a function that maps the sample space to a set of sample estimates. It is used to estimate a parameter of an unknown population; one estimate is the result of applying this function to a particular observed data set. For a given parameter there can be many different estimators.
Estimand: the parameter being estimated, such as $\theta$.
Estimate: a particular realization of the random variable $\hat{\theta}(X)$, written $\hat{\theta}(x)$ for observed data $x$.
Bias: the bias of $\widehat{\theta}$ is defined as $B(\widehat{\theta}) = \operatorname{E}(\widehat{\theta}) - \theta$. It is the signed difference between the average of the collection of estimates and the single parameter being estimated. It is also the expected value of the error, since $\operatorname{E}(\widehat{\theta}) - \theta = \operatorname{E}(\widehat{\theta} - \theta)$. The estimator $\widehat{\theta}$ is an unbiased estimator of $\theta$ if and only if $B(\widehat{\theta}) = 0$. Example: if the parameter is the bull's-eye of a target and the arrows are estimates, then a relatively high absolute value of the bias means the average position of the arrows is off-target, and a relatively low absolute bias means the average position of the arrows is on target. In either case the arrows may be dispersed or clustered.
Variance: the variance of $\widehat{\theta}$ is the expected value of the squared sampling deviations; that is, $\operatorname{var}(\widehat{\theta}) = \operatorname{E}[(\widehat{\theta} - \operatorname{E}(\widehat{\theta}))^2]$. It indicates how far, on average and in squared distance, the collection of estimates is from the expected value of the estimates. Example: if the parameter is the bull's-eye of a target and the arrows are estimates, then a relatively high variance means the arrows are dispersed, and a relatively low variance means the arrows are clustered. Some things to note: even if the variance is low, the cluster of arrows may still be far off-target, and even if the variance is high, the diffuse collection of arrows may still be unbiased. Finally, even if all arrows grossly miss the target, the variance is zero if they all hit the same point.
The relationship between bias and variance is analogous to the relationship between accuracy and precision.
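From the descriptions above, bias is the difference between the mean of the predictions and the true value, while variance is the variance of the prediction treated as a random variable; the bull's-eye analogy makes this fairly clear. http://blog.csdn.net/ywl22/article/details/8606166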
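The two definitions can be approximated by simulation. The sketch below is only an illustration under assumptions that are not in the original: the data are drawn from a normal distribution with known mean `mu`, and two hypothetical estimators of `mu` are compared (the sample mean and a deliberately shrunk version of it). The helper name `bias_and_variance` is also made up for this example.

```python
import numpy as np

def bias_and_variance(estimates, theta):
    """Approximate bias and variance from many simulated estimates."""
    bias = estimates.mean() - theta   # E(theta_hat) - theta
    variance = estimates.var()        # E[(theta_hat - E(theta_hat))^2]
    return bias, variance

# Assumed setup: n_trials independent data sets, each with n normal observations.
rng = np.random.default_rng(0)
mu, sigma, n, n_trials = 2.0, 1.0, 10, 100_000
samples = rng.normal(mu, sigma, size=(n_trials, n))

# Estimator 1: the sample mean (unbiased for mu).
mean_estimates = samples.mean(axis=1)
# Estimator 2: a shrunk sample mean (biased, but with lower variance).
shrunk_estimates = 0.5 * samples.mean(axis=1)

bias_mean, var_mean = bias_and_variance(mean_estimates, mu)
bias_shrunk, var_shrunk = bias_and_variance(shrunk_estimates, mu)

print(f"sample mean: bias={bias_mean:.4f}, variance={var_mean:.4f}")
print(f"shrunk mean: bias={bias_shrunk:.4f}, variance={var_shrunk:.4f}")
```

In the bull's-eye picture, the shrunk estimator corresponds to a tight cluster of arrows that sits away from the center, while the sample mean corresponds to a wider cluster centered on the target.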
[Figure: bias & variance]
[Figure: relationship between bias, variance, and model complexity]

Note: the sample mean $\overline{X}=\frac{1}{n}\sum_{i=1}^{n}X_i$ is an unbiased estimator of $\mu$, and the sample variance $s^2=\frac{1}{n-1}\sum_{i=1}^n\,(X_i-\overline{X}\,)^2$ is an unbiased estimator of $\sigma^2$ (unlike $S^2=\frac{1}{n}\sum_{i=1}^n\,(X_i-\overline{X}\,)^2$, which is a biased estimator of $\sigma^2$).
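As a sketch of why $S^2$ is biased, assume (this is not stated in the original) that $X_1,\dots,X_n$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$, so that $\operatorname{E}(X_i^2)=\sigma^2+\mu^2$ and $\operatorname{E}(\overline{X}^2)=\frac{\sigma^2}{n}+\mu^2$. Then:

$$
\begin{aligned}
\operatorname{E}\left[\sum_{i=1}^n (X_i-\overline{X})^2\right]
&= \operatorname{E}\left[\sum_{i=1}^n X_i^2 - n\,\overline{X}^2\right] \\
&= n(\sigma^2+\mu^2) - n\left(\frac{\sigma^2}{n}+\mu^2\right) \\
&= (n-1)\,\sigma^2 .
\end{aligned}
$$

Dividing by $n$ gives $\operatorname{E}(S^2)=\frac{n-1}{n}\sigma^2$, while dividing by $n-1$ gives $\operatorname{E}(s^2)=\sigma^2$.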
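The sample mean is an unbiased estimator of the population mean, whereas the uncorrected sample variance (with divisor $n$) is not an unbiased estimator of the population variance: its expected value is $\frac{n-1}{n}\sigma^2$, which is smaller than $\sigma^2$.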
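The difference between the two divisors can be seen in a small simulation. This is a minimal sketch assuming normally distributed data with a known variance `sigma2`; the sample size, number of trials, and variable names are illustrative choices, not part of the original. NumPy's `ddof` argument selects the divisor (`ddof=1` divides by n - 1, `ddof=0` divides by n).

```python
import numpy as np

# Assumed setup: many small samples from a normal population with known variance.
rng = np.random.default_rng(0)
mu, sigma2, n, n_trials = 0.0, 4.0, 5, 200_000
samples = rng.normal(mu, np.sqrt(sigma2), size=(n_trials, n))

# One variance estimate per simulated sample, for each divisor.
s2_unbiased = samples.var(axis=1, ddof=1)  # divisor n - 1
s2_biased = samples.var(axis=1, ddof=0)    # divisor n

# Averaging over trials approximates the expected value of each estimator.
mean_unbiased = s2_unbiased.mean()
mean_biased = s2_biased.mean()

print(f"true variance:              {sigma2}")
print(f"average with divisor n - 1: {mean_unbiased:.3f}")
print(f"average with divisor n:     {mean_biased:.3f}")
print(f"(n - 1) / n * sigma^2:      {(n - 1) / n * sigma2:.3f}")
```

With a small `n` the gap is easy to see; as `n` grows, the factor (n - 1)/n approaches 1 and the two estimators become nearly indistinguishable.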

Mean squared error

In statistics, the mean squared error (MSE) of an estimator measures the average of the squares of the "errors", that is, the differences between the estimator and what is estimated. MSE is a risk function, corresponding to the expected value of the squared error loss or quadratic loss. (Is it a loss function or a cost function?)
$\operatorname{MSE}(\hat{\theta})=\operatorname{Var}(\hat{\theta})+ \left(\operatorname{Bias}(\hat{\theta},\theta)\right)^2$
$=\operatorname{E}[(\widehat{\theta} - \operatorname{E}(\widehat{\theta}) )^2]+ {\left( \operatorname{E}(\widehat{\theta}) - \theta\right)}^2$
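The decomposition can be checked numerically. The sketch below assumes, purely for illustration, that the estimator is the divisor-n sample variance applied to normal data with known variance `sigma2`; any other estimator with a known target would work the same way. The empirical MSE, variance, and bias are all computed over the same simulated estimates, so the identity holds up to floating-point rounding.

```python
import numpy as np

# Assumed setup: estimate sigma2 with the biased (divisor n) sample variance.
rng = np.random.default_rng(0)
sigma2, n, n_trials = 4.0, 5, 100_000
samples = rng.normal(0.0, np.sqrt(sigma2), size=(n_trials, n))
estimates = samples.var(axis=1, ddof=0)

# Empirical counterparts of the three quantities in the formula.
mse = np.mean((estimates - sigma2) ** 2)   # E[(theta_hat - theta)^2]
variance = estimates.var()                 # E[(theta_hat - E(theta_hat))^2]
bias = estimates.mean() - sigma2           # E(theta_hat) - theta

print(f"MSE:               {mse:.4f}")
print(f"variance + bias^2: {variance + bias ** 2:.4f}")
```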
proof
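A short derivation sketch, starting from the definition $\operatorname{MSE}(\hat{\theta})=\operatorname{E}[(\hat{\theta}-\theta)^2]$ and adding and subtracting $\operatorname{E}(\hat{\theta})$ inside the square:

$$
\begin{aligned}
\operatorname{E}[(\hat{\theta}-\theta)^2]
&= \operatorname{E}\left[\left(\hat{\theta}-\operatorname{E}(\hat{\theta}) + \operatorname{E}(\hat{\theta})-\theta\right)^2\right] \\
&= \operatorname{E}\left[(\hat{\theta}-\operatorname{E}(\hat{\theta}))^2\right]
 + 2\left(\operatorname{E}(\hat{\theta})-\theta\right)\operatorname{E}\left[\hat{\theta}-\operatorname{E}(\hat{\theta})\right]
 + \left(\operatorname{E}(\hat{\theta})-\theta\right)^2 \\
&= \operatorname{Var}(\hat{\theta}) + \left(\operatorname{Bias}(\hat{\theta},\theta)\right)^2 .
\end{aligned}
$$

The cross term vanishes because $\operatorname{E}(\hat{\theta})-\theta$ is a constant and $\operatorname{E}[\hat{\theta}-\operatorname{E}(\hat{\theta})]=0$.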
P.S.:
In statistics, the bias (or bias function) of an estimator is the difference between this estimator’s expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called unbiased. Otherwise the estimator is said to be biased.