Practical Machine Learning Notes (2): Data Labeling

Latest recommended article published 2023-08-22 09:16:22

留小星


Category: Practical Machine Learning - Mu Li (李沐). Tags: machine learning, algorithms, attention mechanism

Article link: https://blog.csdn.net/jerry_liufeng/article/details/123350092


Simplify user interaction: design easy tasks, clear instructions, and a simple-to-use interface (keep the labeling task simple and unambiguous)
- Example: the user instructions and task used by the MIT Places365 dataset
Cost: active learning + self-training (keep the cost of labeling in mind)
- Active learning targets the same scenario as SSL (semi-supervised learning), but with human intervention
- Uncertainty sampling chooses the example whose prediction is most uncertain and sends it to a human labeler (sampling decides which examples get labeled)
- As in self-training, we can afford to use expensive models
  - Query by committee trains multiple models and performs majority voting
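
The two selection strategies above can be sketched in a few lines. This is a minimal illustration, not the course's implementation: it assumes predictive entropy as the uncertainty measure and vote entropy as the committee disagreement measure, and the function names and toy arrays are hypothetical.

```python
import numpy as np

def entropy(probs: np.ndarray) -> np.ndarray:
    """Row-wise Shannon entropy of class-probability vectors."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)

def uncertainty_sampling(probs: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k unlabeled examples with the highest predictive entropy."""
    return np.argsort(-entropy(probs), kind="stable")[:k]

def committee_disagreement(votes: np.ndarray, n_classes: int) -> np.ndarray:
    """Vote entropy per example; votes has shape (n_models, n_examples)."""
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)], axis=1)
    return entropy(counts / votes.shape[0])

# Toy model outputs for three unlabeled examples (made-up numbers).
probs = np.array([[0.98, 0.02], [0.55, 0.45], [0.80, 0.20]])
print(uncertainty_sampling(probs, k=1))  # [1]

# Hard predictions from a hypothetical committee of three models.
votes = np.array([[0, 1, 0], [0, 0, 0], [0, 1, 0]])
print(committee_disagreement(votes, n_classes=2).argmax())  # 1
```

In both cases the selected example (index 1 here) is the one that would be sent to a human labeler; confidently predicted examples can instead be pseudo-labeled by self-training.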
Quality control: the quality of labels produced by different labelers varies (so label quality has to be controlled)
- Simplest but most expensive: send the same task to multiple labelers, then determine the label by majority voting
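
As a rough sketch of the majority-voting approach, assuming each task has been labeled by several people (the task ids, labels, and the `majority_vote` helper are hypothetical):

```python
from collections import Counter

def majority_vote(labels):
    """Return (winning label, agreement ratio) for one task's redundant labels."""
    counts = Counter(labels)
    winner, n = counts.most_common(1)[0]  # ties go to the label seen first
    return winner, n / len(labels)

# Hypothetical annotations: each task was sent to three labelers.
annotations = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "dog", "dog"],
}
final = {task: majority_vote(lbls) for task, lbls in annotations.items()}
print(final["img_001"][0])  # cat
```

The agreement ratio is returned alongside the label so that low-agreement tasks can be routed to additional labelers, which is one way to spend the extra labeling budget only where it is needed.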
