Learning: Diagnosing models

1.Model evaluation

The test set is used to evaluate how well a model trained on the training set generalizes to unseen data.

Example: split the data into a training set and a test set (a 70%/30% split is common).

The loss function is then defined for each set. Note: the test-set loss function does not include a regularization term. The test set only evaluates the function fitted from the training set; regularization belongs to the fitting process itself.

Take the classification problem as an example: here the test error can be measured as the misclassification rate.
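As a minimal sketch (the labels below are made up for illustration), the misclassification rate is simply the fraction of test examples the classifier gets wrong:

```python
import numpy as np

def misclassification_error(y_pred, y_true):
    """Fraction of examples where the predicted label differs from the true label."""
    return np.mean(np.asarray(y_pred) != np.asarray(y_true))

# Hypothetical predictions vs. true labels for a small test set.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(misclassification_error(y_pred, y_true))  # 2 of 8 wrong -> 0.25
```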

 2.Model selection

A naive approach is to select among models using the test set.

For example, to choose among ten candidate models (say, polynomials of degree 1 through 10), one could fit the parameters of each model and compute the test-set cost for every fitted model.

But this approach is not only cumbersome, it is also flawed: because the model was chosen to minimize the test-set cost, that cost gives an overly optimistic estimate of the generalization error.

The solution is to divide the data into three sets: the training set, the cross-validation set, and the test set.

*cross-validation set

 Again, the loss function for these datasets is calculated.

The procedure is as follows: first, train each candidate model on the training set; then select the model with the smallest cross-validation loss; finally, estimate the generalization error using the test-set loss.

Because the test set played no role in selecting the model, evaluating the chosen model on it is fair.
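The three-set procedure above can be sketched on synthetic data (everything here — the quadratic ground truth, the noise level, the 60/20/20 split — is an illustrative assumption, not from the original notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: quadratic ground truth plus Gaussian noise.
x = rng.uniform(-3, 3, 200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.5, x.size)

# 60% training / 20% cross-validation / 20% test split.
idx = rng.permutation(x.size)
tr, cv, te = idx[:120], idx[120:160], idx[160:]

def fit_poly(d):
    return np.polyfit(x[tr], y[tr], d)          # train on the training set only

def mse(coeffs, subset):
    return np.mean((np.polyval(coeffs, x[subset]) - y[subset]) ** 2)

# Step 1-2: fit each candidate degree, pick the smallest cross-validation error.
degrees = range(1, 11)
models = {d: fit_poly(d) for d in degrees}
best_d = min(degrees, key=lambda d: mse(models[d], cv))

# Step 3: report generalization error on the untouched test set.
print("chosen degree:", best_d)
print("test MSE:", mse(models[best_d], te))
```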

The same procedure generalizes to selecting among neural network architectures.

3.Bias and variance

Example: characteristics of the training-set and validation-set loss functions.

As the degree d of the fitted polynomial increases, the training-set loss keeps decreasing, while the validation-set loss first decreases and then increases.

Comparing the two losses gives a way to diagnose the model: high bias (underfitting) when both losses are high and close together; high variance (overfitting) when the training loss is low but the validation loss is much higher.
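That diagnostic rule can be written down directly; the threshold values below are illustrative assumptions, not fixed rules:

```python
def diagnose(train_err, cv_err, tol=0.05):
    """Rough diagnostic sketch; the tolerance is an illustrative assumption."""
    if cv_err - train_err > tol:
        return "high variance"   # low training error, much higher validation error
    if train_err > tol:
        return "high bias"       # both errors high and close together
    return "looks fine"

print(diagnose(train_err=0.25, cv_err=0.27))  # both high, close together -> high bias
print(diagnose(train_err=0.01, cv_err=0.20))  # large gap -> high variance
```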

4.Regularization and model evaluation

After adding regularization, model selection proceeds in the same way, now over values of the regularization parameter λ:

When λ is large, regularization dominates and the model underfits: both the training-set and validation-set costs are high. When λ is small, regularization is weak and the model overfits: the training-set cost is small, but the validation-set cost is still large.

We choose the λ value that minimizes the validation-set cost.

The plot of cost versus λ is roughly a mirror image of the plot of cost versus the polynomial degree d: increasing d moves the model toward overfitting, while increasing λ moves it toward underfitting.
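Selecting λ can be sketched with a closed-form ridge fit on polynomial features (the data, the degree-8 feature map, and the λ grid are illustrative assumptions; note the validation cost deliberately has no regularization term):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: cubic signal plus a little noise.
x = rng.uniform(-1, 1, 120)
y = x**3 - x + rng.normal(0, 0.1, x.size)
tr, cv = np.arange(80), np.arange(80, 120)

def features(xs, d=8):
    return np.vander(xs, d + 1)                 # high-degree polynomial features

def ridge_fit(lam):
    X = features(x[tr])
    I = np.eye(X.shape[1])
    I[-1, -1] = 0                               # do not penalize the intercept column
    # Closed-form regularized least squares: (X^T X + lam*I) w = X^T y
    return np.linalg.solve(X.T @ X + lam * I, X.T @ y[tr])

def cv_cost(w):
    # Validation cost contains NO regularization term.
    return np.mean((features(x[cv]) @ w - y[cv]) ** 2) / 2

lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56]
best_lam = min(lambdas, key=lambda lam: cv_cost(ridge_fit(lam)))
print("chosen lambda:", best_lam)
```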

*Speech recognition

Take speech recognition, for example:

Suppose the training-set error rate is 10.8% and the validation-set error rate is 14.8% on the given speech samples. These may look high, but human transcription of the same speech has an error rate of 10.2%. Measured against this baseline, the accuracy of the trained model is actually quite good.

If the training-set error is close to the baseline (human-level) error, the model has low bias; a large gap between them indicates high bias.

If the training-set and validation-set errors are similar, the model has low variance; a large gap between them indicates high variance.
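The two gaps can be read straight off the numbers from the speech example (the helper function is just an illustration):

```python
def gaps(baseline_err, train_err, cv_err):
    """Split the validation error into a bias gap and a variance gap.

    bias gap:     training error minus baseline (e.g. human-level) error
    variance gap: validation error minus training error
    """
    return (round(train_err - baseline_err, 3), round(cv_err - train_err, 3))

# Numbers from the speech example: human 10.2%, train 10.8%, validation 14.8%.
bias_gap, variance_gap = gaps(0.102, 0.108, 0.148)
print(bias_gap, variance_gap)  # small bias gap (0.006), larger variance gap (0.04)
```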

5.Learning curve

As the number of training examples increases, the training-set cost increases, because it becomes harder to fit all the data exactly. At the same time, the validation-set cost decreases, because the model generalizes better.

High bias:

Both curves flatten out at a high value of the cost function and end up close together; adding more training data does not help.

High variance:

There is a large gap between the low training cost and the higher validation cost; increasing the size of the training set narrows the gap and reduces the error.
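A high-bias learning curve can be sketched by fitting a straight line to quadratic data (model too simple by construction; the data and the subset sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: fit a straight line to quadratic data (a high-bias model).
x = rng.uniform(0, 4, 300)
y = x**2 + rng.normal(0, 1.0, x.size)
tr, cv = np.arange(200), np.arange(200, 300)

def errors(m):
    """Train on the first m examples; return (training cost, validation cost)."""
    w = np.polyfit(x[tr[:m]], y[tr[:m]], 1)
    j_train = np.mean((np.polyval(w, x[tr[:m]]) - y[tr[:m]]) ** 2) / 2
    j_cv = np.mean((np.polyval(w, x[cv]) - y[cv]) ** 2) / 2
    return j_train, j_cv

# Training cost rises and validation cost falls as m grows,
# and both plateau at a high value: the high-bias signature.
for m in (5, 20, 80, 200):
    j_train, j_cv = errors(m)
    print(f"m={m:3d}  J_train={j_train:6.2f}  J_cv={j_cv:6.2f}")
```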

In summary:

Each diagnosis points to its remedy: getting more training examples, using fewer features, or increasing λ helps high variance; adding features or polynomial terms, or decreasing λ, helps high bias.

*The superiority of neural networks

For neural networks, high bias can be addressed by making the network larger (more layers or hidden units), without hand-tuning a polynomial degree.

A large neural network with a well-chosen regularization term usually performs as well as or better than a smaller one; the main price is extra computation.
