机器学习中的训练集、验证集和测试集

最新推荐文章于 2024-04-15 15:59:18 发布

Dongdong Bai

最新推荐文章于 2024-04-15 15:59:18 发布

阅读量6k

点赞数 3

分类专栏：机器学习文章标签：机器学习

本文链接：https://blog.csdn.net/u011092188/article/details/68953723

版权

8 篇文章 1 订阅

订阅专栏

在机器学习中我们把数据分为测试数据和训练数据。

测试数据就是测试集，是用来测试已经训练好的模型的泛化能力。

训练数据常被划分为训练集（training set）和验证集（validation set），比如在K-折交叉验证中，整个训练数据集D，就被分为K个部分，每次挑选其中的（K-1）部分做训练集，剩下的部分为验证集。

训练集是用来训练模型或确定模型参数的，如ANN中权值，CNN中的权值等；验证集是用来做模型结构选择，确定模型中的一些超参数，比如正则项系数，CNN各个隐层神经元的个数等；

以下是维基百科中的解释：

Training set: A set of examples used for learning, which is to fit the parameters [i.e., weights] of the classifier.
Validation set: A set of examples used to tune the parameters [i.e., architecture, not weights] of a classifier, for example to choose the number of hidden units in a neural network.
Test set: A set of examples used only to assess the performance [generalization] of a fully specified classifier.

关注