Why do we say the test set is used for unbiased estimation?

JasonDean

Article link: https://blog.csdn.net/JasonDean/article/details/124089101

#UnbiasedEstimator #DeepLearning #Test-set
For deep learning applications, we split the dataset into training, validation, and test sets. The test set is used for unbiased estimation of the model's skill (its generalization error).

A natural question then is whether or not these estimators are “good” in any sense. One measure of “good” is “unbiasedness.”

Cases in time series forecasting with DL models with MAE as the estimator

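Assume a trained model, and let the MAE (Mean Absolute Error) computed on the test set be the estimator
$\begin{aligned} \widehat{\theta} &= \mathbb{L}_{MAE}(X_1,X_2,\cdots,X_n) \\ &= \frac{1}{n}\sum^{n}_{i=1}{|X_i-\widehat{X_i}|} \end{aligned}$
where $(X_1,X_2,\cdots,X_n)$ is the test set and $\widehat{X_i}$ is the model's prediction for $X_i$. The quantity being estimated is the true expected absolute error on unseen data, $\theta = E(|X-\widehat{X}|)$. If the mathematical expectation of the estimator satisfies
$E(\widehat{\theta}) = \theta$

then we call $\widehat{\theta}$ an unbiased estimator of $\theta$.

To prove that the test-set MAE is an unbiased estimator, note that the test samples come from the same distribution as future data and were not used to train the model, so $E(|X_i-\widehat{X_i}|) = \theta$ for every $i$:

$\begin{aligned} E(\widehat{\theta}) &= E(\frac{1}{n}\sum^{n}_{i=1}{|X_i-\widehat{X_i}|}) \\ & = \frac{1}{n}\sum^{n}_{i=1}{E(|X_i-\widehat{X_i}|)}\\ & = \frac{1}{n}\times n \times \theta \\ & = \theta \end{aligned}$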
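The unbiasedness of the test-set MAE can also be checked numerically. The sketch below is only an illustration under assumptions that are not in the original post: the true values are drawn from a standard normal distribution, the "model" is a fixed predictor that always outputs 0, and the helper name `mae_on_fresh_test_set` is made up for this example. Under these assumptions the true expected absolute error is $\sqrt{2/\pi}$, and the average of many test-set MAEs should land close to it.

```python
import math
import random


def mae_on_fresh_test_set(n, rng):
    """MAE of a fixed predictor on one freshly drawn test set of size n."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # true value X_i (assumed to be N(0, 1))
        x_hat = 0.0              # prediction of a fixed, already trained model (assumed)
        total += abs(x - x_hat)
    return total / n


rng = random.Random(0)
theta = math.sqrt(2.0 / math.pi)  # exact E|X| for X ~ N(0, 1)

# Repeat the "draw a test set, compute its MAE" experiment many times.
maes = [mae_on_fresh_test_set(20, rng) for _ in range(5000)]
mean_of_maes = sum(maes) / len(maes)

print(f"true expected absolute error: {theta:.4f}")
print(f"average of test-set MAEs:     {mean_of_maes:.4f}")
```

A single test-set MAE still fluctuates around the true value; unbiasedness only says that these fluctuations average out to zero.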

Cases with MSE

The test-set MSE is also an unbiased estimator, in this case of the true expected squared error $\theta = E((X-\widehat{X})^2)$.

$\begin{aligned} \widehat{\theta} &= \mathbb{L}_{MSE}(X_1,X_2,\cdots,X_n) \\ &= \frac{1}{n}\sum^{n}_{i=1}{(X_i-\widehat{X_i})^2} \end{aligned}$
$\begin{aligned} E(\widehat{\theta})&= E(\frac{1}{n}\sum^{n}_{i=1}(X_i-\widehat{X_i})^2 )\\ &= \frac{1}{n} E(\sum^{n}_{i=1}(X_i-\widehat{X_i})^2) \\ &= \frac{1}{n} \sum^{n}_{i=1}E((X_i-\widehat{X_i})^2) \\ &= \frac{1}{n}\times n \times \theta \\ & = \theta \end{aligned}$
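Taking the expectation term by term in this way is only valid for samples that were not used to fit the model. The following simulation is a minimal sketch of that point, under assumptions chosen purely for illustration: the data are standard normal, the toy "model" predicts the mean of a small training set, and the sizes `m`, `n`, and `reps` are arbitrary. It compares the average training-set MSE and the average test-set MSE with the exact generalization error of each fitted model.

```python
import random

rng = random.Random(0)
m = 5         # training-set size (assumed, kept small to make the bias visible)
n = 5         # test-set size (assumed)
reps = 20000  # number of simulated train/test experiments

sum_train = sum_test = sum_true = 0.0
for _ in range(reps):
    train = [rng.gauss(0.0, 1.0) for _ in range(m)]
    test = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_hat = sum(train) / m  # toy "model": always predict the training mean

    sum_train += sum((x - x_hat) ** 2 for x in train) / m  # MSE on data the model has seen
    sum_test += sum((x - x_hat) ** 2 for x in test) / n    # MSE on held-out data
    sum_true += 1.0 + x_hat ** 2  # exact E((X - x_hat)^2) for X ~ N(0, 1)

avg_train, avg_test, avg_true = (s / reps for s in (sum_train, sum_test, sum_true))

print(f"average training-set MSE:         {avg_train:.3f}")
print(f"average test-set MSE:             {avg_test:.3f}")
print(f"average true generalization MSE:  {avg_true:.3f}")
```

In this setup the test-set MSE tracks the true generalization error on average, while the training-set MSE is systematically too small, because the model was fitted to those same samples.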

Conclusion

Saying that the test set is used for unbiased estimation is not a statement about the dataset itself, but about the estimator and the estimated quantity, i.e., the mean loss over the test samples and its expectation $\mu$, respectively.
