Learning: Diagnosing models

1.Model evaluation

The test set is used to evaluate how well a model trained on the training set generalizes to unseen data.

Example: split the data into a training set and a test set (a 70%/30% split is common).

The loss function is then defined for each set. Note: the test-set loss function does not include a regularization term. The test set only evaluates the function fitted from the training set; regularization belongs to the fitting process itself.

Take the classification problem as an example: here the test error can be measured as the misclassification rate.
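As a minimal sketch (the labels below are made up for illustration), the misclassification rate is simply the fraction of test examples the classifier gets wrong:

```python
import numpy as np

def misclassification_error(y_pred, y_true):
    """Fraction of examples where the predicted label differs from the true label."""
    return np.mean(np.asarray(y_pred) != np.asarray(y_true))

# Hypothetical predictions vs. true labels for a small test set.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
print(misclassification_error(y_pred, y_true))  # 2 of 8 wrong -> 0.25
```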

 2.Model selection

A naive approach is to select among models using the test set.

For example, to choose among ten candidate models (say, polynomials of degree 1 through 10), one could fit the parameters of each model and compute the test-set cost for every fitted model.

But this approach is not only cumbersome, it is also flawed: because the model was chosen to minimize the test-set cost, that cost gives an overly optimistic estimate of the generalization error.

The solution is to divide the data into three sets: the training set, the cross-validation set, and the test set.

*cross-validation set

 Again, the loss function for these datasets is calculated.

The procedure is as follows: first, train each candidate model on the training set; then select the model with the smallest cross-validation loss; finally, estimate the generalization error using the test-set loss.

Because the test set played no role in selecting the model, evaluating the chosen model on it is fair.
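The three-set procedure above can be sketched on synthetic data (everything here — the quadratic ground truth, the noise level, the 60/20/20 split — is an illustrative assumption, not from the original notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: quadratic ground truth plus Gaussian noise.
x = rng.uniform(-3, 3, 200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.5, x.size)

# 60% training / 20% cross-validation / 20% test split.
idx = rng.permutation(x.size)
tr, cv, te = idx[:120], idx[120:160], idx[160:]

def fit_poly(d):
    return np.polyfit(x[tr], y[tr], d)          # train on the training set only

def mse(coeffs, subset):
    return np.mean((np.polyval(coeffs, x[subset]) - y[subset]) ** 2)

# Step 1-2: fit each candidate degree, pick the smallest cross-validation error.
degrees = range(1, 11)
models = {d: fit_poly(d) for d in degrees}
best_d = min(degrees, key=lambda d: mse(models[d], cv))

# Step 3: report generalization error on the untouched test set.
print("chosen degree:", best_d)
print("test MSE:", mse(models[best_d], te))
```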

The same procedure generalizes to selecting among neural network architectures.

3.Bias and variance

Example: characteristics of the training-set and validation-set loss functions.

As the degree d of the fitted polynomial increases, the training-set loss keeps decreasing, while the validation-set loss first decreases and then increases.

Comparing the two losses gives a way to diagnose the model: high bias (underfitting) when both losses are high and close together; high variance (overfitting) when the training loss is low but the validation loss is much higher.
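That diagnostic rule can be written down directly; the threshold values below are illustrative assumptions, not fixed rules:

```python
def diagnose(train_err, cv_err, tol=0.05):
    """Rough diagnostic sketch; the tolerance is an illustrative assumption."""
    if cv_err - train_err > tol:
        return "high variance"   # low training error, much higher validation error
    if train_err > tol:
        return "high bias"       # both errors high and close together
    return "looks fine"

print(diagnose(train_err=0.25, cv_err=0.27))  # both high, close together -> high bias
print(diagnose(train_err=0.01, cv_err=0.20))  # large gap -> high variance
```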

4.Regularization and model evaluation

After adding regularization, model selection proceeds in the same way, now over values of the regularization parameter λ:

When λ is large, regularization dominates and the model underfits: both the training-set and validation-set costs are high. When λ is small, regularization is weak and the model overfits: the training-set cost is small, but the validation-set cost is still large.

We choose the λ value that minimizes the validation-set cost.

The plot of cost versus λ is roughly a mirror image of the plot of cost versus the polynomial degree d: increasing d moves the model toward overfitting, while increasing λ moves it toward underfitting.
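Selecting λ can be sketched with a closed-form ridge fit on polynomial features (the data, the degree-8 feature map, and the λ grid are illustrative assumptions; note the validation cost deliberately has no regularization term):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: cubic signal plus a little noise.
x = rng.uniform(-1, 1, 120)
y = x**3 - x + rng.normal(0, 0.1, x.size)
tr, cv = np.arange(80), np.arange(80, 120)

def features(xs, d=8):
    return np.vander(xs, d + 1)                 # high-degree polynomial features

def ridge_fit(lam):
    X = features(x[tr])
    I = np.eye(X.shape[1])
    I[-1, -1] = 0                               # do not penalize the intercept column
    # Closed-form regularized least squares: (X^T X + lam*I) w = X^T y
    return np.linalg.solve(X.T @ X + lam * I, X.T @ y[tr])

def cv_cost(w):
    # Validation cost contains NO regularization term.
    return np.mean((features(x[cv]) @ w - y[cv]) ** 2) / 2

lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56]
best_lam = min(lambdas, key=lambda lam: cv_cost(ridge_fit(lam)))
print("chosen lambda:", best_lam)
```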

*Speech recognition

Take speech recognition, for example:

Suppose the training-set error rate is 10.8% and the validation-set error rate is 14.8% on the given speech samples. These may look high, but human transcription of the same speech has an error rate of 10.2%. Measured against this baseline, the accuracy of the trained model is actually quite good.

If the training-set error is close to the baseline (human-level) error, the model has low bias; a large gap between them indicates high bias.

If the training-set and validation-set errors are similar, the model has low variance; a large gap between them indicates high variance.
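The two gaps can be read straight off the numbers from the speech example (the helper function is just an illustration):

```python
def gaps(baseline_err, train_err, cv_err):
    """Split the validation error into a bias gap and a variance gap.

    bias gap:     training error minus baseline (e.g. human-level) error
    variance gap: validation error minus training error
    """
    return (round(train_err - baseline_err, 3), round(cv_err - train_err, 3))

# Numbers from the speech example: human 10.2%, train 10.8%, validation 14.8%.
bias_gap, variance_gap = gaps(0.102, 0.108, 0.148)
print(bias_gap, variance_gap)  # small bias gap (0.006), larger variance gap (0.04)
```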

5.Learning curve

As the number of training examples increases, the training-set cost increases, because it becomes harder to fit all the data exactly. At the same time, the validation-set cost decreases, because the model generalizes better.

High bias:

Both curves flatten out at a high value of the cost function and end up close together; adding more training data does not help.

High variance:

There is a large gap between the low training cost and the higher validation cost; increasing the size of the training set narrows the gap and reduces the error.
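A high-bias learning curve can be sketched by fitting a straight line to quadratic data (model too simple by construction; the data and the subset sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: fit a straight line to quadratic data (a high-bias model).
x = rng.uniform(0, 4, 300)
y = x**2 + rng.normal(0, 1.0, x.size)
tr, cv = np.arange(200), np.arange(200, 300)

def errors(m):
    """Train on the first m examples; return (training cost, validation cost)."""
    w = np.polyfit(x[tr[:m]], y[tr[:m]], 1)
    j_train = np.mean((np.polyval(w, x[tr[:m]]) - y[tr[:m]]) ** 2) / 2
    j_cv = np.mean((np.polyval(w, x[cv]) - y[cv]) ** 2) / 2
    return j_train, j_cv

# Training cost rises and validation cost falls as m grows,
# and both plateau at a high value: the high-bias signature.
for m in (5, 20, 80, 200):
    j_train, j_cv = errors(m)
    print(f"m={m:3d}  J_train={j_train:6.2f}  J_cv={j_cv:6.2f}")
```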

In summary:

Each diagnosis points to its remedy: getting more training examples, using fewer features, or increasing λ helps high variance; adding features or polynomial terms, or decreasing λ, helps high bias.

*The superiority of neural networks

For neural networks, high bias can be addressed by making the network larger (more layers or hidden units), without hand-tuning a polynomial degree.

A large neural network with a well-chosen regularization term usually performs as well as or better than a smaller one; the main price is extra computation.
