self-training and co-training

Calm__down

Widely used semi-supervised learning methods include:

1. EM with generative mixture models
2. self-training
3. co-training
4. transductive support vector machines
5. graph-based methods

self-training:

A classifier is first trained on the small amount of labeled data. The classifier is then used to classify the unlabeled data. Typically the most confident unlabeled data points, together with their predicted labels, are added to the training set. The classifier is retrained and the procedure is repeated.
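The loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the nearest-centroid base learner, the margin-based confidence score, and the names `fit_centroids`, `predict_with_confidence`, `self_train`, `per_round` and `max_rounds` are all assumptions made for the example. Any classifier that reports a confidence could be wrapped the same way.

```python
import math
import random

def fit_centroids(pairs):
    """Assumed base learner: one mean vector per class, from (x, label) pairs."""
    sums, counts = {}, {}
    for x, label in pairs:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict_with_confidence(centroids, x):
    """Return (label, confidence). Confidence is an assumed score:
    the gap between the distances to the two nearest centroids."""
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin

def self_train(labeled, unlabeled, per_round=5, max_rounds=20):
    """Wrap the base learner: label the pool, keep the most confident
    predictions, retrain, and repeat."""
    train, pool = list(labeled), list(unlabeled)
    model = fit_centroids(train)
    for _ in range(max_rounds):
        if not pool:
            break
        scored = [(predict_with_confidence(model, x), x) for x in pool]
        scored.sort(key=lambda t: t[0][1], reverse=True)  # most confident first
        for (label, _), x in scored[:per_round]:
            train.append((x, label))  # predicted label is treated as if it were true
        pool = [x for _, x in scored[per_round:]]
        model = fit_centroids(train)  # retrain on the enlarged training set
    return model

# Toy data (made up for illustration): two well-separated 2-D clusters.
random.seed(0)
labeled = [((0.5, -0.2), 0), ((-0.3, 0.4), 0), ((5.2, 4.8), 1), ((4.7, 5.3), 1)]
unlabeled = [(random.gauss(mu, 1.0), random.gauss(mu, 1.0))
             for mu in (0.0, 5.0) for _ in range(20)]
model = self_train(labeled, unlabeled)
```

Note that the base learner is never modified; `self_train` only calls its fit and predict functions, which is what makes self-training a wrapper method. A wrong but confident prediction is added to the training set like any other, so early mistakes can reinforce themselves.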

When the existing supervised classifier is complicated and hard to modify, self-training is a practical wrapper method. It has been applied to several natural language processing tasks, such as word sense disambiguation, parsing and machine translation, as well as to object detection systems for images.

co-training:

Co-training assumes that the features can be split into two sets, that each sub-feature set is sufficient to train a good classifier, and that the two sets are conditionally independent given the class. Initially, two separate classifiers are trained with the labeled data, one on each sub-feature set. Each classifier then classifies the unlabeled data and 'teaches' the other classifier with the few unlabeled examples (and their predicted labels) about which it is most confident. Each classifier is retrained with the additional training examples given by the other classifier, and the process repeats.
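A rough Python sketch of this exchange between the two classifiers follows. It is an illustration under stated assumptions rather than the definitive algorithm: the nearest-centroid learner, the margin-based confidence, and the names `fit_centroids`, `predict_with_confidence`, `top_confident`, `co_train`, `per_round` and `max_rounds` are all hypothetical. Each example is assumed to be a pair `(view_a, view_b)` holding its two sub-feature sets.

```python
import math
import random

def fit_centroids(pairs):
    """Assumed base learner: one mean vector per class, from (x, label) pairs."""
    sums, counts = {}, {}
    for x, label in pairs:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is the gap between the
    distances to the two nearest centroids (an assumed score)."""
    dists = sorted((math.dist(x, c), label) for label, c in centroids.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin

def top_confident(model, pool, view, k):
    """(pool index, predicted label) for the k examples the model is surest about."""
    scored = [(predict_with_confidence(model, ex[view]), i) for i, ex in enumerate(pool)]
    scored.sort(key=lambda t: t[0][1], reverse=True)
    return [(i, label) for (label, _), i in scored[:k]]

def co_train(labeled, pool, per_round=3, max_rounds=20):
    """labeled: list of ((view_a, view_b), y); pool: list of (view_a, view_b)."""
    # One training set per view; each classifier only ever sees its own view.
    train = [[(ex[v], y) for ex, y in labeled] for v in (0, 1)]
    pool = list(pool)
    models = [fit_centroids(train[0]), fit_centroids(train[1])]
    for _ in range(max_rounds):
        if not pool:
            break
        used = set()
        for teacher in (0, 1):
            student = 1 - teacher
            for i, label in top_confident(models[teacher], pool, teacher, per_round):
                # The teacher's label is attached to the student's view of the example.
                train[student].append((pool[i][student], label))
                used.add(i)
        pool = [ex for i, ex in enumerate(pool) if i not in used]
        models = [fit_centroids(train[0]), fit_centroids(train[1])]
    return models

# Toy data (made up): the two views are drawn independently given the class.
random.seed(0)
def sample(label):
    mu = 5.0 * label
    return ((random.gauss(mu, 1.0),), (random.gauss(mu, 1.0),))

labeled = [(((0.2,), (-0.1,)), 0), (((4.9,), (5.3,)), 1)]
pool = [sample(label) for label in (0, 1) for _ in range(20)]
model_a, model_b = co_train(labeled, pool)
```

In this sketch each classifier keeps its own training set and receives only the examples chosen by the other one, which follows the description above. A common variant instead adds the confident examples from both classifiers to a single shared labeled set.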

When the features naturally split into two sets, co-training may be appropriate.

Reference:

Xiaojin Zhu. Semi-Supervised Learning with Graphs.
