Machine Learning: Bias & Variance

From the Machine Learning course taught by Andrew Ng on Coursera, I learned these two concepts. When you train a model and the results are not good, you want to find out where the problem comes from. He mentioned that if both the training error and the cross-validation error are high, you probably have a bias problem, which is underfitting. If your training error is low but the cross-validation error is far higher than the training error, you probably have a variance problem, which is overfitting.
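Here is a rough sketch of that rule of thumb in Python. The thresholds `target_error` and `gap` are hypothetical numbers chosen only for illustration, not anything from the course.

```python
def diagnose(train_error, cv_error, target_error=0.05, gap=0.05):
    # target_error and gap are hypothetical thresholds, chosen only for illustration
    if train_error > target_error:
        return "high bias (underfitting): even the training error is high"
    if cv_error - train_error > gap:
        return "high variance (overfitting): CV error is far above training error"
    return "looks fine"

print(diagnose(train_error=0.20, cv_error=0.22))  # -> high bias
print(diagnose(train_error=0.01, cv_error=0.15))  # -> high variance
```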

When I first learned this, I just memorized the two concepts without understanding why the first situation corresponds to "bias" and the second situation corresponds to "variance".

Yesterday I finally understood why, from a deep learning class.

When we train a model on data, it is the same as estimating the values of the parameters from the given data. The parameters have ground-truth values, but we may not recover them exactly from the data. If the expectation of the estimated values (with respect to the random choice of data) equals the ground-truth value, then the model is suitable; we call the difference between the two the bias.
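Written out, assuming $\hat{\theta}$ denotes the parameter estimated from a data set $\mathcal{D}$ and $\theta$ the ground-truth value:

$$\mathrm{Bias}(\hat{\theta}) = \mathbb{E}_{\mathcal{D}}[\hat{\theta}] - \theta$$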


If the bias stays large as the data size goes up, then there is a persistent gap between the estimate and the ground truth, so the model cannot explain the data well no matter how much data we use. This means the capacity of the model cannot cover the data, so we have to increase the capacity, or complexity, of the model. That is the problem of under-fitting; the sketch below shows this behavior.
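As a minimal illustration, assuming the data come from a 5th-degree polynomial (a made-up ground truth) and we fit a straight line to it, the training error stays high no matter how many samples we use:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # hypothetical ground truth: a 5th-degree polynomial
    return 0.5 * x**5 - x**3 + 2 * x

for n in (20, 2000):
    x = rng.uniform(-1, 1, n)
    y = true_fn(x) + rng.normal(0, 0.1, n)
    coeffs = np.polyfit(x, y, deg=1)      # a linear model: too little capacity
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"{n} samples, training MSE: {mse:.4f}")
# the training error stays high no matter how much data we add (high bias)
```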

The "variance" is the variance of estimated parameters.


Variance typically decreases as the size of the training data set increases: the model has a fixed degree of complexity, and more data give more control over that complexity. For example, if the data come from a 5th-degree polynomial and you train a 10th-degree polynomial model, a small data set may give you an improper result that really looks like a 10th-degree polynomial, whereas a large data set will more probably give you back the 5th-degree polynomial. High variance corresponds to the overfitting problem, because the estimate varies a lot between different data sets, which means we fit each data set too closely. To fix overfitting, we need to make the data size and the model complexity compatible; using more data and decreasing the number of features are two good ways. The sketch below illustrates the polynomial example.
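This is a minimal simulation of that example, again with a made-up 5th-degree ground truth; the only point is to show how the spread of the estimated coefficients shrinks as the data set grows:

```python
import numpy as np

rng = np.random.default_rng(1)

def true_fn(x):
    # hypothetical ground truth: a 5th-degree polynomial
    return 0.5 * x**5 - x**3 + 2 * x

def fit_once(n):
    x = rng.uniform(-1, 1, n)
    y = true_fn(x) + rng.normal(0, 0.1, n)
    return np.polyfit(x, y, deg=10)       # over-parameterized 10th-degree model

for n in (15, 1500):
    coeffs = np.array([fit_once(n) for _ in range(200)])
    print(f"{n} samples, mean variance of coefficients: {coeffs.var(axis=0).mean():.3f}")
# with 15 points the coefficients swing wildly between data sets (high variance);
# with 1500 points they settle down and the fit approaches the true polynomial
```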

The squared bias plus the variance gives the expected squared estimation error (the mean squared error of the estimator), which describes how far the estimated value is from the ground-truth value on average.
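With the same notation as above, this is the decomposition the rest of the argument refers to:

$$\mathbb{E}_{\mathcal{D}}\big[(\hat{\theta} - \theta)^{2}\big] = \mathrm{Bias}(\hat{\theta})^{2} + \mathrm{Var}(\hat{\theta})$$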


The expectation is taken over all possible data sets of a given size. So if we fix the data size, the left side is roughly fixed, and therefore the right side is constrained: if the bias increases, the variance will decrease, and vice versa. That is why increasing model complexity can fix a high-bias problem but may result in high variance. Data scientists are usually trying to find a balance in this equation.
