Machine learning: cross-validation

susu77477

Article link: https://blog.csdn.net/susu77477/article/details/113796849

Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter called k that refers to the number of groups that a given data sample is to be split into. As such, the procedure is often called k-fold cross-validation.

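1. Cross-validation is used to compare different machine learning methods and to assess their accuracy. It estimates how well a model would perform if it were tested on real, unseen data.
2. For example, consider a patient dataset containing sex, age, weight, and medical examination measurements.

We could build a model with logistic regression, k-nearest neighbours, or an SVM.
Estimating the model's parameters is called training the algorithm.
Evaluating how well the model works is called testing the algorithm.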

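3. How to test effectively

Split the whole dataset into training data and test data.
Suppose the data is divided into four blocks, A, B, C, and D, with 75% used for training and 25% for testing:
Train on A, B, C; test on D
Train on A, B, D; test on C
Train on A, C, D; test on B
Train on B, C, D; test on A
Ordinarily, a single split is used to test the model, giving one result a (for example, 23 correct and 10 incorrect predictions). If all four train/test pairings are used, we get four results a, b, c, and d; adding them together gives a more reliable assessment of the model, P.
Comparing the result P across different algorithms shows directly which model is better.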
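The four pairings above can be sketched in plain Python. This is only an illustration under stated assumptions: the patient data is synthetic (a single made-up "weight" feature), and the two "methods" being compared, a threshold classifier and a majority-class baseline, are hypothetical stand-ins for logistic regression, k-nearest neighbours, or an SVM. The dataset size of 132 is chosen so that each test block holds 33 samples, like the 23 + 10 example.

```python
import random

def make_patients(n, seed=0):
    """Synthetic stand-in for the patient data: (weight, label) pairs."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.random() < 0.5
        # Hypothetical rule: positive cases tend to be heavier.
        weight = rng.gauss(85.0 if label else 65.0, 8.0)
        data.append((weight, label))
    return data

def train_threshold(train):
    """'Training': place the cut-off halfway between the two class means."""
    pos = [w for w, y in train if y]
    neg = [w for w, y in train if not y]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda w: w > cut

def train_majority(train):
    """Baseline: always predict the most common training label."""
    majority = sum(y for _, y in train) * 2 >= len(train)
    return lambda w: majority

def four_fold_counts(data, train_fn, k=4):
    """Sum correct/incorrect predictions over all k train/test pairings."""
    # Four disjoint parts playing the roles of A, B, C, and D.
    folds = [data[i::k] for i in range(k)]
    correct = incorrect = 0
    for i, test in enumerate(folds):
        train = [row for j, f in enumerate(folds) if j != i for row in f]
        predict = train_fn(train)
        for w, y in test:
            if predict(w) == y:
                correct += 1
            else:
                incorrect += 1
    return correct, incorrect

patients = make_patients(132)
p_threshold = four_fold_counts(patients, train_threshold)
p_majority = four_fold_counts(patients, train_majority)
print("threshold model   (correct, incorrect):", p_threshold)
print("majority baseline (correct, incorrect):", p_majority)
```

Each sample is used for testing exactly once, so the two summed counts always add up to the dataset size, and the totals for the two methods can be compared directly.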
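4. Splitting the dataset into four parts is called "Four-Fold Cross Validation". If each individual sample (one row, e.g. one patient) is treated as its own block, so that every round tests on a single sample and trains on all the others, the procedure is "Leave One Out Cross Validation". Similarly, splitting the whole dataset into ten parts is "Ten-Fold Cross Validation".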
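The three variants differ only in the number of blocks k. A minimal sketch, assuming a hypothetical dataset of 20 samples and a helper name `k_fold_splits` invented here for illustration:

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

n = 20  # hypothetical dataset size
four_fold = list(k_fold_splits(n, 4))
ten_fold = list(k_fold_splits(n, 10))
leave_one_out = list(k_fold_splits(n, n))  # k equals the number of samples

for name, splits in [("4-fold", four_fold),
                     ("10-fold", ten_fold),
                     ("leave-one-out", leave_one_out)]:
    sizes = [len(test) for _, test in splits]
    print(name, "->", len(splits), "rounds, test sizes", sizes)
```

Leave-one-out is simply the case where k equals the number of samples, which means one round of training per sample.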
5. For methods such as Ridge Regression that have a tuning parameter (a parameter that isn't estimated from the data, but is just sort of guessed), Ten-Fold Cross Validation can be used to evaluate candidate values of that parameter.
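As a rough sketch of that idea, the code below scores a few candidate penalty values with ten-fold cross-validation and keeps the one with the lowest held-out error. Everything specific here is an assumption made for illustration: the data is synthetic, the model is a one-feature ridge regression without an intercept, and the candidate list and the name `lam` for the tuning parameter are arbitrary.

```python
import random

def fit_ridge_slope(xs, ys, lam):
    """One-feature ridge without intercept: w = sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def cv_mse(xs, ys, lam, k=10):
    """Squared error on held-out folds, averaged over all samples."""
    n = len(xs)
    total = 0.0
    for i in range(k):
        test_idx = set(range(i, n, k))  # every k-th sample forms one fold
        tr_x = [xs[j] for j in range(n) if j not in test_idx]
        tr_y = [ys[j] for j in range(n) if j not in test_idx]
        w = fit_ridge_slope(tr_x, tr_y, lam)
        total += sum((ys[j] - w * xs[j]) ** 2 for j in test_idx)
    return total / n

rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0, 0.5) for x in xs]  # synthetic data, true slope 2

candidates = [0.0, 0.1, 1.0, 10.0, 100.0]  # arbitrary grid of penalties
scores = {lam: cv_mse(xs, ys, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
print(scores)
print("selected lambda:", best_lam)
```

The penalty is never fitted on the training folds. It is fixed before each fit, and the held-out error is what decides between the candidates.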

Reference:
https://www.youtube.com/watch?v=fSytzGwwBVw
