Several theoretical efforts have analyzed the process of learning by minimizing the error on a training set, a process sometimes called Empirical Risk Minimization.
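As a minimal sketch of what "minimizing the error on a training set" means (the notation here is assumed for illustration, not taken from the book), the empirical risk of a network f with weights w on P training examples (x_p, y_p) under a squared-error loss is

E_{\mathrm{train}}(w) = \frac{1}{P}\sum_{p=1}^{P}\big(f(x_p; w) - y_p\big)^2

and training chooses w to make this average training error small, rather than the true generalization error.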
Some of those theoretical analyses are based on decomposing the generalization error into two terms: bias and variance. The bias measures how much the network output, averaged over all possible training datasets, differs from the desired function. The variance measures how much the network output varies from one dataset to another. Early in training, the bias is large because the network output is still far from the desired function, while the variance is very small because the data has had little influence yet. Late in training, the bias is small because the network has learned the underlying function. However, if training continues too long, the network also learns the noise specific to the particular dataset; this is referred to as overtraining. In that case the variance becomes large, because the noise differs from one dataset to another. It can be shown that the minimum total error occurs when the sum of bias and variance is minimal.
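To make the two terms concrete, here is the standard squared-error decomposition, written with assumed notation rather than the book's: f(x; D) is the network output after training on dataset D, F(x) is the desired function, and E_D is the average over datasets.

\mathbb{E}_D\big[\big(f(x;D) - F(x)\big)^2\big] = \underbrace{\big(\mathbb{E}_D[f(x;D)] - F(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}_D\big[\big(f(x;D) - \mathbb{E}_D[f(x;D)]\big)^2\big]}_{\text{variance}}

With noisy targets an additional irreducible noise term appears, but it does not depend on training; the trade-off between the two terms above is what determines the best point at which to stop.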
Reading notes on Tricks of the Trade: bias and variance