▼金猿奖·2019年度征集评选▼
![640?wx_fmt=png](https://i-blog.csdnimg.cn/blog_migrate/a83b80a9b4f080d5584390d2051edd42.png)
大数据产业创新服务媒体
——聚焦数据 · 改变商业
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/31251785a10c71d8b5afb91dddcb48d9.jpeg)
作者:韩旭、高天宇、刘知远
本文约8750字,建议阅读10分钟。
本文作者总结了对实体关系抽取现状、挑战和未来发展方向的认识。
知识图谱
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/54c3f1ead060793e2281f369bebe5938.jpeg)
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/61e380531144614ee77f528086e5de81.jpeg)
神经网络关系抽取模型
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/720002971e2dde78fc309d36f97ce336.jpeg)
-
数据规模问题:人工精准地标注句子级别的数据代价十分高昂,需要耗费大量的时间和人力。在实际场景中,面向数以千计的关系、数以千万计的实体对、以及数以亿计的句子,依靠人工标注训练数据几乎是不可能完成的任务。
-
学习能力问题:在实际情况下,实体间关系和实体对的出现频率往往服从长尾分布,存在大量的样例较少的关系或实体对。神经网络模型的效果需要依赖大规模标注数据来保证,存在”举十反一“的问题。如何提高深度模型的学习能力,实现”举一反三“,是关系抽取需要解决的问题。
-
复杂语境问题:现有模型主要从单个句子中抽取实体间关系,要求句子必须同时包含两个实体。实际上,大量的实体间关系往往表现在一篇文档的多个句子中,甚至在多个文档中。如何在更复杂的语境下进行关系抽取,也是关系抽取面临的问题。
-
开放关系问题:现有任务设定一般假设有预先定义好的封闭关系集合,将任务转换为关系分类问题。这样的话,文本中蕴含的实体间的新型关系无法被有效获取。如何利用深度学习模型自动发现实体间的新型关系,实现开放关系抽取,仍然是一个”开放“问题。
更大规模的训练数据
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/4d0314dc51f6c77e0c5c2a7387602be3.jpeg)
更高效的学习能力
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/48477eb8d577543245b222dfc1c754f9.jpeg)
更复杂的文本语境
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/71e73a859aaf0e8696e1f3ba0ecfaba8.jpeg)
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/0dd7073c79f1c0ade5e2e7be8d4c6bfa.jpeg)
更开放的关系类型
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/55b7e2b00034887592a85232e1d9b4a4.jpeg)
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/89bd916494b057a305dddd3158c555f9.jpeg)
总结
![640?wx_fmt=jpeg](https://i-blog.csdnimg.cn/blog_migrate/1df02a1881afad93810975758c742826.jpeg)
作者简介:
韩旭,清华大学计算机科学与技术系博士三年级同学,主要研究方向为自然语言处理、知识图谱、信息抽取。在人工智能领域国际著名会议AAAI、ACL、EMNLP、COLING、NAACL上发表多篇论文,是OpenKE、OpenNRE等开源项目的开发者之一。
主页:https://thucsthanxu13.github.io
高天宇,清华大学计算机系大四本科生,主要研究方向为自然语言处理、知识图谱、关系抽取。在人工智能领域国际著名会议AAAI、EMNLP上发表多篇论文,是OpenNRE等开源项目的主要开发者之一。
主页:https://gaotianyu.xyz/about/
刘知远,清华大学计算机系副教授、博士生导师。主要研究方向为表示学习、知识图谱和社会计算。
主页:http://nlp.csai.tsinghua.edu.cn/~lzy/
参考文献
[1] ChunYang Liu, WenBo Sun, WenHan Chao, Wanxiang Che. Convolution Neural Network for Relation Extraction. The 9th International Conference on Advanced Data Mining and Applications (ADMA 2013).
[2] Daojian Zeng, Kang Liu, Siwei Lai, Guangyou Zhou, Jun Zhao. Relation Classification via Convolutional Deep Neural Network. The 25th International Conference on Computational Linguistics (COLING 2014).
[3] Dongxu Zhang, Dong Wang. Relation Classification via Recurrent Neural Network. arXiv preprint arXiv:1508.01006 (2015).
[4] Peng Zhou, Wei Shi, Jun Tian, Zhenyu Qi, Bingchen Li, Hongwei Hao, Bo Xu. Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification. The 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).
[5] Richard Socher, Brody Huval, Christopher D. Manning, Andrew Y. Ng. Semantic Compositionality through Recursive Matrix-Vector Spaces. The 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2012).
[6] Iris Hendrickx, Su Nam Kim, Zornitsa Kozareva, Preslav Nakov, Diarmuid Ó Séaghdha, Sebastian Padó, Marco Pennacchiotti, Lorenza Romano, Stan Szpakowicz. SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations between Pairs of Nominals. The 5th International Workshop on Semantic Evaluation (SEMEVAL 2010).
[7] Thien Huu Nguyen, Ralph Grishman. Relation Extraction: Perspective from Convolutional Neural Networks. The 1st Workshop on Vector Space Modeling for Natural Language Processing (LatentVar 2015).
[8] Cícero dos Santos, Bing Xiang, Bowen Zhou. Classifying Relations by Ranking with Convolutional Neural Networks. The 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015).
[9] Shu Zhang, Dequan Zheng, Xinchen Hu, Ming Yang. Bidirectional Long Short-Term Memory Networks for Relation Classification. The 29th Pacific Asia Conference on Language, Information and Computation (PACLIC 2015).
[10] Minguang Xiao, Cong Liu. Semantic Relation Classification via Hierarchical Recurrent Neural Network with Attention. The 26th International Conference on Computational Linguistics (COLING 2016).
[11] Kun Xu, Yansong Feng, Songfang Huang, Dongyan Zhao. Semantic Relation Classification via Convolutional Neural Networks with Simple Negative Sampling. The 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015).
[12] Yan Xu, Lili Mou, Ge Li, Yunchuan Chen, Hao Peng, Zhi Jin. Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Paths. The 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015).
[13] Yang Liu, Furu Wei, Sujian Li, Heng Ji, Ming Zhou, Houfeng Wang. A Dependency-Based Neural Network for Relation Classification. The 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015).
[14] Yan Xu, Ran Jia, Lili Mou, Ge Li, Yunchuan Chen, Yangyang Lu, Zhi Jin. Improved Relation Classification by Deep Recurrent Neural Networks with Data Augmentation. The 26th International Conference on Computational Linguistics (COLING 2016).
[15] Rui Cai, Xiaodong Zhang, Houfeng Wang. Bidirectional Recurrent Convolutional Neural Network for Relation Classification. The 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).
[16] Mike Mintz, Steven Bills, Rion Snow, Daniel Jurafsky. Distant Supervision for Relation Extraction without Labeled Data. The 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2009).
[17] Daojian Zeng, Kang Liu, Yubo Chen, Jun Zhao. Distant Supervision for Relation Extraction via Piecewise Convolutional Neural Networks. The 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015).
[18] Yankai Lin, Shiqi Shen, Zhiyuan Liu, Huanbo Luan, Maosong Sun. Neural Relation Extraction with Selective Attention over Instances. The 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).
[19] Yi Wu, David Bamman, Stuart Russell. Adversarial Training for Relation Extraction. The 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017).
[20] Jun Feng, Minlie Huang, Li Zhao, Yang Yang, Xiaoyan Zhu. Reinforcement Learning for Relation Classification from Noisy Data. The 32th AAAI Conference on Artificial Intelligence (AAAI 2018).
[21] Xu Han, Hao Zhu, Pengfei Yu, Ziyun Wang, Yuan Yao, Zhiyuan Liu, Maosong Sun. FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation. The 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018).
[22] Tianyu Gao, Xu Han, Zhiyuan Liu, Maosong Sun. Hybrid Attention-based Prototypical Networks for Noisy Few-Shot Relation Classification. The 33th AAAI Conference on Artificial Intelligence (AAAI 2019).
[23] Zhi-Xiu Ye, Zhen-Hua Ling. Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification. The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).
[24] Livio Baldini Soares, Nicholas FitzGerald, Jeffrey Ling, Tom Kwiatkowski. Matching the Blanks: Distributional Similarity for Relation Learning. The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).
[25] Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou. FewRel 2.0: Towards More Challenging Few-Shot Relation Classification. 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019).
[26] Chris Quirk, Hoifung Poon. Distant Supervision for Relation Extraction beyond the Sentence Boundary. The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017).
[27] Nanyun Peng, Hoifung Poon, Chris Quirk, Kristina Toutanova, Wen-tau Yih. Cross-Sentence N-ary Relation Extraction with Graph LSTMs. Transactions of the Association for Computational Linguistics (TACL 2017).
[28] Chih-Hsuan Wei, Yifan Peng, Robert Leaman, Allan Peter Davis, Carolyn J. Mattingly, Jiao Li, Thomas C. Wiegers, Zhiyong Lu. Overview of the BioCreative V Chemical Disease Relation (CDR) Task. The 5th BioCreative Challenge Evaluation Workshop (BioC 2015).
[29] Omer Levy, Minjoon Seo, Eunsol Choi, Luke Zettlemoyer. Zero-Shot Relation Extraction via Reading Comprehension. The 21st Conference on Computational Natural Language Learning (CoNLL 2017).
[30] Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zhenghao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, Maosong Sun. DocRED: A Large-Scale Document-Level Relation Extraction Dataset. The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).
[31] Ruidong Wu, Yuan Yao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun. Open Relation Extraction: Relational Knowledge Transfer from Supervised Data to Unsupervised Data. 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019).
[32] Tianyu Gao, Xu Han, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun. Neural Snowball for Few-Shot Relation Learning. The 34th AAAI Conference on Artificial Intelligence (AAAI 2020).
[33] Xu Han, Tianyu Gao, Yuan Yao, Deming Ye, Zhiyuan Liu, Maosong Sun. OpenNRE: An Open and Extensible Toolkit for Neural Relation Extraction. The Conference on Empirical Methods in Natural Language Processing (EMNLP 2019).
——END——