Paper title: Improve Computer-Aided Diagnosis With Machine Learning Techniques Using Undiagnosed Samples
Paper strengths: several of its concepts were fairly new at the time.
Paper weaknesses (from today's perspective): the dataset is small and the methods are relatively simple.
It can serve as a baseline algorithm to compare against in our work.
- Why semi-supervised?
Unlabeled data can help improve the performance of the learner.
- Why co-training?
Different views have their own pros and cons.
- How to create different views?
Natural (e.g., an image and its text explanation)
Feature selection (100 -> 30, 25, 28)
What if a view is insufficient? It does not matter; it depends on your objective: theoretical support (ICML), or performance only (ICDM)?
- Why ensemble?
4.1 Why bagging? Simply construct a number of classifiers and vote.
4.2 Why boosting? Theoretical support. AdaBoost.
- Why CV data sampling?
Slightly change the data distribution. More importantly, remove some unwanted samples (outliers).
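The bagging idea in 4.1 (construct a number of classifiers and vote) can be sketched with a toy one-feature threshold learner. The stump learner and the data set below are illustrative assumptions, not the base learner used in the paper:

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """One bagging round: random sampling with replacement."""
    return [rng.choice(data) for _ in data]

def train_stump(sample):
    """Hypothetical one-feature threshold classifier: predict the
    majority label on each side of the sample mean."""
    mean = sum(x for x, _ in sample) / len(sample)
    left = Counter(y for x, y in sample if x <= mean)
    right = Counter(y for x, y in sample if x > mean)
    left_lbl = left.most_common(1)[0][0]
    right_lbl = right.most_common(1)[0][0] if right else left_lbl
    return lambda x: left_lbl if x <= mean else right_lbl

def bagging_predict(classifiers, x):
    """Simple unweighted vote over the ensemble."""
    votes = Counter(c(x) for c in classifiers)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]
ensemble = [train_stump(bootstrap_sample(data, rng)) for _ in range(11)]
```

Each classifier sees a slightly different bootstrap of the data, which is exactly what makes the vote useful.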
Basic ideas:
- From one data to multiple data.
These data sets serve as input to the classifiers.
- From one classifier to multiple classifiers.
2.1 Use different samples.
2.2 Use different views of the same samples.
- How to integrate (ensemble) different classifiers?
3.1 During training. Select and label unlabeled samples for each other.
3.2 After prediction. Simple voting or weighted voting.
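Strategy 3.2 can be sketched as one small function; uniform weights give simple voting, non-uniform weights give weighted voting. The labels and weight values below are made-up illustrations:

```python
from collections import defaultdict

def weighted_vote(predictions, weights=None):
    """Combine classifier outputs after prediction.
    `predictions` holds one label per classifier; uniform weights
    reduce this to simple majority voting."""
    if weights is None:
        weights = [1.0] * len(predictions)
    score = defaultdict(float)
    for label, w in zip(predictions, weights):
        score[label] += w
    return max(score, key=score.get)

# Simple voting: two of three classifiers say "benign".
print(weighted_vote(["benign", "benign", "malignant"]))
# Weighted voting: one highly trusted classifier can outvote the rest.
print(weighted_vote(["benign", "benign", "malignant"], [0.2, 0.2, 0.9]))
```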
Sampling strategies:
- Random sampling with replacement. Repeated enough times, it yields good results.
- Cross validation. Partition the data into 10 parts and each time use 9 of them.
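Both sampling strategies can be sketched in a few lines; the round-robin fold assignment below is one possible partition scheme, chosen for brevity:

```python
import random

def bootstrap(data, rng):
    """Random sampling with replacement: same size as the original,
    but items may repeat (and some are left out)."""
    return [rng.choice(data) for _ in data]

def cv_folds(data, k=10):
    """Cross-validation partition: split the data into k parts and,
    for each fold, use the other k - 1 parts."""
    parts = [data[i::k] for i in range(k)]
    for i in range(k):
        used = [x for j, p in enumerate(parts) if j != i for x in p]
        yield used, parts[i]

rng = random.Random(42)
data = list(range(20))
sample = bootstrap(data, rng)          # size 20, possibly with repeats
folds = list(cv_folds(data, k=10))     # 10 splits of 18 vs. 2 items
```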
Confidence of an unlabeled instance:
- If the classifier is an SVM, the distance to the classification hyperplane.
- If the classifier is a decision tree, the purity of the leaf node.
- If the classifier is kNN, the purity of neighbors.
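The kNN case can be sketched on one-dimensional toy data (the data set and function name below are illustrative); for an SVM one would instead use the distance to the hyperplane, and for a tree the purity of the leaf node:

```python
from collections import Counter

def knn_confidence(x, labeled, k=5):
    """Confidence from neighbor purity: the fraction of the k nearest
    labeled points that agree with the majority label."""
    neighbors = sorted(labeled, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(y for _, y in neighbors)
    label, count = votes.most_common(1)[0]
    return label, count / k

labeled = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]
# A point deep inside class 0 gets a pure neighborhood:
print(knn_confidence(0.15, labeled, k=3))   # high confidence
# A point between the classes gets a mixed, less pure neighborhood:
print(knn_confidence(0.55, labeled, k=3))   # lower confidence
```

High-purity instances are the ones a co-training partner would select and pseudo-label for the other classifier.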