Semi-Supervised Learning: Using Weighted Nearest Neighbor to Benefit from Unlabeled Data

Today I read a paper on semi-supervised learning: Using Weighted Nearest Neighbor to Benefit from Unlabeled Data.

Here is a summary of the paper:

I. Introduction
1. Why semi-supervised learning is needed: "where often the unlabeled examples greatly outnumber the labeled examples". Labeled examples are usually far fewer than unlabeled ones, so we should try to extract useful information from the unlabeled data to improve the classifier's accuracy.
2. The general flow: "The examples from the unlabeled set are 'pre-labeled' by an initial classifier that is built using the limited available training data. By choosing appropriate weights for this pre-labeled data, the nearest neighbor classifier consistently improves on the original classifier." In other words: first train a classifier on the labeled examples, use it to predict labels for the unlabeled examples, then give this pre-labeled data appropriate weights so that a nearest neighbor classifier improves on the initial classifier. A sketch of the pre-labeling stage follows below.
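To make the first stage concrete, here is a minimal sketch of pre-labeling, assuming scikit-learn and logistic regression as the off-the-shelf initial classifier (the paper does not fix a specific base learner; the classifier choice and the synthetic data here are my assumptions):

```python
# Stage 1 sketch: pre-label the unlabeled examples with an initial classifier.
# Assumptions: logistic regression stands in for the paper's off-the-shelf
# base classifier; the data is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_labeled, y_labeled = X[:50], y[:50]   # small labeled set
X_unlabeled = X[50:]                    # large unlabeled set (labels hidden)

# Train on the limited labeled data, then "pre-label" the unlabeled data.
initial_clf = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
y_prelabeled = initial_clf.predict(X_unlabeled)
```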
 
3. "the key to semi-supervised learning is the prior assumption of consistency, that allows for exploiting the geometric structure of the data distribution". The key to semi-supervised learning is the prior assumption of consistency: nearby points should tend to share labels, which lets us exploit the geometric structure of the data distribution.
 
4. "Close data points should belong to the same class and decision boundaries should lie in regions of low data density; this is also called the 'cluster assumption'." Points that lie close together should belong to the same class, and decision boundaries should fall in low-density regions of the data; this is the "cluster assumption".
 
5. The two-stage approach proposed in the paper: "In this paper, we introduce a very simple two-stage approach that uses the available unlabeled data to improve on the predictions made when learning only from the labeled examples. In a first stage, it uses an off-the-shelf classifier to build a model based on the small amount of available training data", and in the second stage it uses that model to pre-label the unlabeled examples, which are then added with appropriate weights to the training set of a weighted nearest neighbor classifier. A self-contained sketch of the whole procedure is given below.
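Below is a sketch of both stages end to end, assuming a plain Euclidean weighted k-NN vote and a fixed down-weight of 0.3 for pre-labeled examples (the paper chooses its weights more carefully; the value 0.3, k=5, and the synthetic data are illustrative assumptions, not the paper's actual scheme):

```python
# Two-stage sketch: (1) pre-label with an initial classifier, (2) classify
# with a nearest neighbor vote in which pre-labeled points get a lower weight.
# Assumptions: the fixed weight 0.3 and k=5 are illustrative choices only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def weighted_knn_predict(X_train, y_train, w_train, X_test, k=5):
    """k-NN where each of the k nearest neighbors votes with its weight."""
    classes = np.unique(y_train)
    preds = []
    for x in X_test:
        dist = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
        nn = np.argsort(dist)[:k]                    # k nearest neighbors
        votes = [w_train[nn][y_train[nn] == c].sum() for c in classes]
        preds.append(classes[int(np.argmax(votes))])
    return np.array(preds)

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_lab, y_lab = X[:50], y[:50]                 # small labeled set
X_unl = X[50:500]                             # unlabeled set (labels hidden)
X_test, y_test = X[500:], y[500:]             # held-out test set

# Stage 1: the initial classifier pre-labels the unlabeled set.
initial_clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
y_pre = initial_clf.predict(X_unl)

# Stage 2: weighted nearest neighbor over labeled + pre-labeled examples,
# with pre-labeled points down-weighted to reflect their uncertainty.
X_all = np.vstack([X_lab, X_unl])
y_all = np.concatenate([y_lab, y_pre])
w_all = np.concatenate([np.ones(len(y_lab)), np.full(len(y_pre), 0.3)])

y_hat = weighted_knn_predict(X_all, y_all, w_all, X_test)
print("accuracy:", (y_hat == y_test).mean())
```

The down-weighting is what connects the two stages: pre-labels may be wrong, so they should influence the vote less than genuinely labeled neighbors, while still letting the abundant unlabeled data shape the decision boundary.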