An Introduction to LML (Lifelong Machine Learning)

I recently hit some bottlenecks while doing review analysis. After reading a pile of papers, I still find Professor Bing Liu's work the most dependable; he is a leading figure among the pragmatists. So I went through his recent publications and came across a very novel term, "lifelong", along with a KDD 2016 tutorial titled "Lifelong Machine Learning" [1].

What Is Lifelong Machine Learning?

Traditional ML (which they call ML 1.0) learns each task on its own: given a dataset, run an ML algorithm on it, with no regard for anything learned before. In other words, learning is isolated.

The limitations of ML 1.0 are obvious:
1. Learned knowledge does not accumulate.
2. There is no memory: nothing that has been learned is retained.
3. There is no prior knowledge to draw on.
4. Without knowledge accumulation and self-learning, building a truly intelligent system is impossible, because it is unthinkable to do large-scale manual labeling for every single task.

Now look back at how we humans learn:
1. Humans never learn in isolation.
2. With the help of past knowledge, we learn effectively from very few examples (nobody hands me 1,000 positive and 1,000 negative documents and asks me to build a classifier by hand).
3. When we see a new example, most of it is already familiar; very little is truly unknown.

This motivates LML:
Lifelong Machine Learning (LML) (I have not yet found a good Chinese translation)
* Learn the way humans do.
* Retain knowledge from past tasks and use it to help future learning.

They call this LML "ML 2.0".

Examples of LML

Sentiment Analysis

Sentiment analysis is a natural fit for LML:
1. A great deal of knowledge is shared across domains and tasks.
2. Sentiment expressions (sentiment words) recur everywhere, e.g. good, bad, expensive, great.
3. So do sentiment targets, as in "The screen is great but the battery dies fast".
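
To make these two ingredients concrete, here is a minimal sketch of my own (not from the tutorial) that pairs words from a tiny hand-written sentiment lexicon with the nearest noun as the sentiment target. The lexicon, the noun set, and the window size are all made-up assumptions standing in for a real lexicon and a POS tagger:

```python
# Tiny hand-written lexicon mapping sentiment words to polarity (illustrative).
SENTIMENT_LEXICON = {"good": "+", "great": "+", "bad": "-",
                     "poor": "-", "expensive": "-"}

# Hypothetical noun set standing in for a real POS tagger.
NOUNS = {"screen", "battery", "picture", "price"}

def extract_pairs(sentence, window=3):
    """Pair each sentiment word with the closest noun within `window` tokens."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    pairs = []
    for i, tok in enumerate(tokens):
        if tok in SENTIMENT_LEXICON:
            # Search outward from the sentiment word for the nearest noun.
            for dist in range(1, window + 1):
                left, right = i - dist, i + dist
                if left >= 0 and tokens[left] in NOUNS:
                    pairs.append((tokens[left], SENTIMENT_LEXICON[tok]))
                    break
                if right < len(tokens) and tokens[right] in NOUNS:
                    pairs.append((tokens[right], SENTIMENT_LEXICON[tok]))
                    break
    return pairs

print(extract_pairs("The screen is great but the battery is poor"))
# → [('screen', '+'), ('battery', '-')]
```

Real systems use dependency parses and learned lexicons rather than a fixed window, but the sketch shows why sentiment words and targets transfer so readily across domains.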

(1) Sentiment Classification [3]

Goal: classify a document or sentence as + or -.
Difficulty: large amounts of training data must be manually labeled for every domain.

Can we avoid labeling every domain, or at least label far less?

Exploiting past information
Everyone knows that a sentiment classifier trained on domain A cannot simply be applied to domain B. So what can we do?

  • The classic solution: transfer learning
    • Use labeled data from the source domain to help learning in the target domain.
    • The two domains must be very similar.

But this may not be the best approach.

Lifelong sentiment classification (Chen, Ma and Liu 2015)
Imagine we have already learned knowledge from a large number of past domains/tasks, each with its own training data, collectively D.

Do we still need data from a new domain T?

  • In most cases, no. Even a naive "LML" approach that simply pools all the past data works:
    • it improves accuracy by 19%.
  • In some cases, yes: for example, a sentiment classifier (SC) built from D performs poorly on toy reviews,
    • because of the word "toy" (presumably meaning the domain is too dissimilar).
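
A minimal sketch of that naive pooling idea (my own toy illustration, not the actual method of Chen, Ma and Liu [3]): merge the labeled reviews from all past domains and train one Naive Bayes classifier on the pooled data. The domain names, reviews, and labels below are invented for illustration:

```python
from collections import Counter
import math

# Labeled reviews from several past domains D (made-up, toy-sized data).
past_domains = {
    "camera": [("great picture quality", "+"), ("poor battery life", "-")],
    "phone":  [("great screen", "+"), ("battery dies poor", "-")],
    "laptop": [("fast and great keyboard", "+"), ("poor screen bad price", "-")],
}

def train_pooled_nb(domains):
    """Pool all past-domain data and train one multinomial Naive Bayes."""
    word_counts = {"+": Counter(), "-": Counter()}
    doc_counts = Counter()
    for reviews in domains.values():
        for text, label in reviews:
            doc_counts[label] += 1
            word_counts[label].update(text.split())
    vocab = set(word_counts["+"]) | set(word_counts["-"])
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Pick the label maximizing log-prior plus Laplace-smoothed log-likelihood."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("+", "-"):
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.split():
            if w in vocab:  # ignore words never seen in any past domain
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_pooled_nb(past_domains)
print(classify("great screen", model))  # classifies a new-domain review
```

The point is not the classifier itself but that no labeled data from the new domain is used at all; past-domain knowledge alone does the work, which is why "toy"-like domain shifts break it.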

(2) Lifelong Aspect Extraction [4] [5] [6]

“The battery life is long, but pictures are poor.”
Its aspects are: battery life, picture.

Observations:

  • Reviews from different products or domains share a large number of aspects.
    • Every product review domain has the aspect price.
    • Most electronic products have the aspect battery.
    • Many products have the aspect screen.
  • Ignoring these shared aspects would be rather silly.
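
One simple way to act on this observation (a hedged sketch of the prior-knowledge idea, not the actual topic-model machinery of [4][5][6]): count how many past domains each extracted aspect appears in, and treat aspects seen in several domains as a prior that boosts matching candidates in a new domain. All data here is made up:

```python
from collections import Counter

# Aspects already extracted from past review domains (illustrative toy data).
past_aspects = {
    "camera": {"price", "picture", "battery", "lens"},
    "phone":  {"price", "screen", "battery", "signal"},
    "laptop": {"price", "screen", "battery", "keyboard"},
}

def shared_aspects(domains, min_domains=2):
    """Aspects seen in at least `min_domains` past domains form the prior."""
    counts = Counter()
    for aspects in domains.values():
        counts.update(aspects)
    return {a for a, c in counts.items() if c >= min_domains}

def score_candidate(candidate, local_freq, prior, boost=2.0):
    """Boost a new-domain candidate aspect if past domains support it."""
    return local_freq * (boost if candidate in prior else 1.0)

prior = shared_aspects(past_aspects)
print(sorted(prior))  # → ['battery', 'price', 'screen']
```

The published models encode this support as must-link constraints inside a lifelong topic model rather than a multiplicative boost, but the cross-domain counting step is the shared intuition.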

Other LML Applications

Like traditional ML, lifelong machine learning comes in supervised, semi-supervised, unsupervised, and reinforcement-learning flavors; I will cover selected topics in future posts on this blog.

References

[1] Zhiyuan Chen, Estevam Hruschka, and Bing Liu. Lifelong Machine Learning Tutorial. KDD-2016
[2] Daniel L. Silver and Robert Mercer. 1996. The parallel transfer of task knowledge using dynamic learning rates based on a measure of relatedness. Connection Science, 8(2), 277–294.
[3] Zhiyuan Chen, Nianzu Ma and Bing Liu. Lifelong Learning for Sentiment Classification. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL-2015, short paper), July 26-31, 2015, Beijing, China.
[4] Shuai Wang, Zhiyuan Chen, and Bing Liu. Mining Aspect-Specific Opinion using a Holistic Lifelong Topic Model. Proceedings of the International World Wide Web Conference (WWW-2016), April 11-15, 2016, Montreal, Canada.
[5] Qian Liu, Bing Liu, Yuanlin Zhang, Doo Soon Kim and Zhiqiang Gao. Improving Opinion Aspect Extraction using Semantic Similarity and Aspect Associations. Proceedings of Thirtieth AAAI Conference on Artificial Intelligence (AAAI-2016), February 12–17, 2016, Phoenix, Arizona, USA.
[6] Zhiyuan Chen, Arjun Mukherjee, and Bing Liu. 2014. Aspect Extraction with Automated Prior Knowledge Learning. In Proceedings of ACL, pages 347–358.
[7] Zhiyuan Chen and Bing Liu. 2014. Mining Topics in Documents: Standing on the Shoulders of Big Data. In Proceedings of KDD, pages 1116–1125.
