Face recognition: training, validation, and testing sets, the most complete summary

Latest recommended article published 2020-08-28 20:01:22

Mute杭盖 (pinned post)

Category: Face Recognition. Tags: deep learning, neural networks, machine learning, pytorch, tensorflow

Article link: https://blog.csdn.net/HeavenerWen/article/details/106105992

Why do we need these concepts, and why distinguish them?

Note: a validation set can be used directly for testing, so the name should not be taken too literally.

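When training a supervised machine learning model, the data is usually divided into a training dataset, a validation dataset, and a testing dataset. While learning the subject, however, it is easy to confuse the "validation dataset" with the "testing dataset".
A common split ratio is 8:1:1.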
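
As a rough illustration of the 8:1:1 split, here is a minimal sketch using only the Python standard library. The function name `three_way_split`, the `ratios` and `seed` parameters, and the toy data are assumptions made for this example and do not come from any particular library.

```python
import random

def three_way_split(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle a copy of `samples` and cut it into train/validation/test parts."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left goes to the test part
    return train, val, test

# Hypothetical data: 100 sample ids standing in for images.
train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

The three parts are disjoint by construction, which is the property that matters: no sample used for fitting or tuning should reappear in the test part.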

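Note the logic behind splitting into three parts. Originally there are only training and test sets, but: If you run the same algorithm (train on training, evaluate on test) for multiple sets of hyperparameters and chose the model with the best "performance" on the test set you risk overfitting this test set. In other words, with only training and test sets, the models trained under different hyperparameters are compared and selected on the test set. Because the selection was made on the test set, the result may look good, but there is a real risk of overfitting to that set, and it is uncertain whether the model generalizes to unseen data. So the logic of the training, validation, test split is to carve one more subset out of the training data, that is, the training set is split once more into: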

actual training
validation - another subset of the training set that is used to evaluate the model performance for each run/set of hyperparameter values. So the hyperparameters determine performance on the validation set, and that performance is in turn used to adjust the hyperparameters.

To summarise, train your system using training set, optimise the system (everything, including feature extraction and training procedure) by checking performance on the validation set. Finally, when you are done, report your system’s performance on an independent test set.
As a final note, if you do not do any tuning by checking validation performance, then you do not need a test set. You can use your validation set to directly report the performance. This is why "validation" should not be read in the narrow literal sense of "validate": a validation set can end up playing the role of a testing set.

In practice, a training dataset and a testing dataset are usually provided, but a validation dataset is not.

Training Dataset

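In one sentence: the training set is used to train the model, or more precisely, to determine the model's learnable parameters, such as the weights and biases of a CNN.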
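
To make "learnable parameters" concrete, the sketch below fits a single weight and bias by gradient descent on a tiny training set. It is a toy linear model, not a CNN. The function name `fit_line`, the data, and the values of `learning_rate` and `epochs` are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # learnable parameters, updated only from training data
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical noiseless training data generated from y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [2.0 * x + 1.0 for x in train_x]

w, b = fit_line(train_x, train_y)
print(round(w, 4), round(b, 4))
```

Here `w` and `b` are determined by the training data, while `learning_rate` and `epochs` are set by hand. Those hand-set values are the kind of quantity that a validation set is meant to help choose.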

Quora Training Validation Test Difference
Python API Module for splitting dataset into Training, Validation, Testing

According to Mehmet Ufuk Dalmis's answer on Quora, you use the training dataset to train a system, but a performance measure computed on that same training dataset would not reflect the performance in a real world scenario. You need a validation set to really validate your system without any doubt of overfitting. But then, couldn't a test set equally well "really test your system without any doubt of overfitting"?

Validation Dataset

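The validation set is used to determine the hyperparameters used when training the model. Which hyperparameters? For example, the number of iterations and the learning rate. Taking K-NN as an example, the value of K (how many nearest neighbours to look for) is a hyperparameter. More specifically, the validation set does not determine the weights and biases of a CNN, because it takes no part in gradient descent.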

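This answers the question of which hyperparameters to use: not simply the ones from a paper, and not arbitrary trial and error, but the ones that perform best on the validation set.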
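
The K-NN case can be sketched as follows: try several values of K, keep the one with the best validation accuracy, and only then measure accuracy on the test set once. This is a minimal pure-Python sketch on made-up 1-D data. The helper names (`knn_predict`, `accuracy`, `make_data`), the candidate values of K, and the data distribution are all assumptions for illustration.

```python
import random
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points closest to x (1-D features)."""
    neighbours = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of (x, label) pairs in `data` that k-NN classifies correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

def make_data(n, rng):
    """Toy data: class 0 centred at 0.0, class 1 centred at 2.0, overlapping."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data

rng = random.Random(0)
train, val, test = make_data(200, rng), make_data(50, rng), make_data(50, rng)

# Hyperparameter search: the validation set decides which K to keep.
candidate_ks = [1, 3, 5, 9, 15]
val_scores = {k: accuracy(train, val, k) for k in candidate_ks}
best_k = max(val_scores, key=val_scores.get)

# The test set is touched once, after K has been fixed.
test_accuracy = accuracy(train, test, best_k)
print(best_k, val_scores[best_k], test_accuracy)
```

Odd values of K are used so that a two-class vote cannot tie.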

Ways of splitting off the Validation Dataset

In Davidsandberg's facenet.py, there are two modes:

SPLIT_IMAGES
SPLIT_CLASSES

SPLIT_IMAGES

After reading through the code, this mode can be understood as the following kind of split:

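For the class Audi A4, the first 75% of the images go to training and the remaining 25% go to validation.
For the class BMW 740, the first 75% of the images go to training and the remaining 25% go to validation.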

In short, in this case every identity in the training set also appears in the validation set. The only difference is that, for a given identity, the images in the training set and the images in the validation set do not overlap.
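
The per-identity split described above can be sketched as below. This is not the facenet implementation, only a minimal illustration of the idea. The function name `split_images_per_class`, the `train_fraction` parameter, and the file paths are hypothetical.

```python
def split_images_per_class(dataset, train_fraction=0.75):
    """For each identity, send the first part of its images to training
    and the rest to validation, so both sets contain every identity."""
    train_set, val_set = {}, {}
    for identity, image_paths in dataset.items():
        cut = int(len(image_paths) * train_fraction)
        train_set[identity] = image_paths[:cut]
        val_set[identity] = image_paths[cut:]
    return train_set, val_set

# Hypothetical dataset: identity name -> list of image paths.
dataset = {
    "audi_a4": ["audi_a4/%03d.jpg" % i for i in range(8)],
    "bmw_740": ["bmw_740/%03d.jpg" % i for i in range(4)],
}

train_set, val_set = split_images_per_class(dataset)
for identity in dataset:
    print(identity, len(train_set[identity]), len(val_set[identity]))
```

Both dictionaries share the same keys (identities), while the image lists under each key are disjoint.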

Testing Dataset

Mehmet Ufuk Dalmis's explanation of why a Testing Dataset is still needed:

Imagine that you trained your system, you measured the performance in a validation set. But you didn’t like the performance. So you tuned your system (hyper parameters) to get a better performance. You measured the performance in your validation set again to check if it improved or not. You repeat this again and again, in this way you reach an optimum system for your task. You have a good performance in validation set now. Well done!

As I understand it, each hyperparameter setting corresponds to a different model, so the validation set is used to pick the model trained under the best hyperparameters.

But there is a problem. You did all these tunings and optimisations based on the performance in your validation set. May be these tunings work well for your validation set, but is this necessarily generalisable? Maybe some worked just by chance?

Therefore, after all these optimisations, you need to report the performance of your final system using a completely unseen test set.

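The only set that is truly and completely unseen is the test set, because the model was trained on the training set and tuned to its best on the validation set.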

Because those optimisations are also a part of your training and when you optimize your system based on performance on the validation set, you are actually using the validation set like a training set. In other words, optimizing on the validation set is effectively part of training the model.
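
The "worked just by chance" effect can be illustrated with a small simulation. Each configuration below is a model that guesses labels at random, so none of them is actually better than any other. Picking the one with the best validation accuracy still yields a flattering validation score. Everything in this sketch (the set sizes, the number of configurations, the names) is an assumption made for illustration.

```python
import random

rng = random.Random(0)

def random_labels(n):
    """n binary labels drawn uniformly at random."""
    return [rng.randint(0, 1) for _ in range(n)]

def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

val_truth = random_labels(50)     # small validation set
test_truth = random_labels(1000)  # larger, untouched test set

# 200 "hyperparameter settings", each producing a model that guesses at random.
val_accs, test_accs = [], []
for _ in range(200):
    val_accs.append(accuracy(random_labels(len(val_truth)), val_truth))
    test_accs.append(accuracy(random_labels(len(test_truth)), test_truth))

# Select the configuration that looks best on the validation set.
best = max(range(len(val_accs)), key=val_accs.__getitem__)
mean_val = sum(val_accs) / len(val_accs)
print("selected validation accuracy:", val_accs[best])
print("mean validation accuracy:    ", round(mean_val, 3))
print("selected model on test set:  ", test_accs[best])
```

The selected validation accuracy is the maximum over many noisy measurements, so it is biased upward. The test accuracy of the selected model carries no such selection bias, which is why the final number should come from the test set.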