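Entity linking is the task of identifying mentions in text and linking them to entities in a knowledge base, thereby connecting unstructured data to structured data. By drawing on the rich information that a knowledge base holds about a large number of entities, entity linking enables a variety of semantic applications. For example, it is an important component of many information extraction (IE) and natural language understanding (NLU) pipelines, because it resolves the ambiguity of mentions in text and determines their correct meaning.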
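This article follows the taxonomy of mainstream methods in the 2020 survey "Neural Entity Linking: A Survey of Models Based on Deep Learning" and, together with the SOTA models on several benchmarks, compiles 47 entity linking papers from 2021 or earlier that are worth reading.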
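To make the task concrete, here is a minimal sketch of the simplest kind of linker: dictionary lookup for candidate generation, a prior probability p(entity | mention) for ranking, and a threshold for unlinkable (NIL) mentions. The anchor pairs, the function names `build_prior` and `link`, and the threshold value are all hypothetical and chosen for illustration; none of them come from the papers listed here.

```python
from collections import Counter, defaultdict

# Hypothetical (mention, entity) pairs, standing in for anchor-text
# statistics that a real system would mine from a knowledge base.
ANCHORS = [
    ("paris", "Paris"),
    ("paris", "Paris"),
    ("paris", "Paris"),
    ("paris", "Paris_Hilton"),
    ("jaguar", "Jaguar_Cars"),
    ("jaguar", "Jaguar"),
]


def build_prior(anchors):
    """Estimate p(entity | mention) from mention-entity co-occurrence counts."""
    counts = defaultdict(Counter)
    for mention, entity in anchors:
        counts[mention.lower()][entity] += 1
    return {
        mention: {e: c / sum(ents.values()) for e, c in ents.items()}
        for mention, ents in counts.items()
    }


def link(mention, prior, threshold=0.6):
    """Return the highest-prior candidate, or None (NIL) if none is confident enough."""
    candidates = prior.get(mention.lower(), {})
    if not candidates:
        return None  # no candidate was generated for this surface form
    entity, score = max(candidates.items(), key=lambda kv: kv[1])
    return entity if score >= threshold else None


PRIOR = build_prior(ANCHORS)
```

With this toy data, `link("Paris", PRIOR)` resolves to `"Paris"`, while `link("jaguar", PRIOR)` returns `None` because neither candidate clears the threshold. The neural models in the list replace the count-based score with learned encodings of the mention, its context, and the entity.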
General Architecture
Candidate Generation
surface form matching
- "Combining word and entity embeddings for entity linking" Jose G. Moreno, Romaric Besancon, Romain Beaumont (ESWC 2017) [paper] [code]
dictionary lookup
- "Personalized page rank for named entity disambiguation" Maria Pershina, Yifan He, Ralph Grishman (NAACL 2015) [paper] [code]
- "YAGO A core of semantic knowledge unifying wordnet and wikipedia" Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum (WWW 2007) [paper] [code]
prior probability
- "A Cross-Lingual Dictionary for English Wikipedia Concepts" Valentin I. Spitkovsky, Angel X. Chang (LREC 2012) [paper] [code]
- "Deep joint entity disambiguation with local neural attention" Octavian-Eugen Ganea, Thomas Hofmann (EMNLP 2017) [paper] [code]
Context-Mention Encoding
recurrent architecture
- "Entity linking via joint encoding of types, descriptions, and context" Nitish Gupta, Sameer Singh, Dan Roth (EMNLP 2017) [paper] [code]
- "End-to-end neural entity linking" Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann (CoNLL 2018) [paper] [code]
- "Neural cross-lingual entity linking" Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza (AAAI 2018) [paper] [code]
self-attention
- "Zero-shot entity linking by reading entity descriptions" Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee (ACL 2019) [paper] [code]
- "Scalable Zero-shot Entity Linking with Dense Entity Retrieval" Ledell Wu, Fabio Petroni, Martin Josifoski (EMNLP 2020) [paper] [code]
- "Global Entity Disambiguation with Pretrained Contextualized Embeddings of Words and Entities" Ikuya Yamada, Koki Washio, Hiroyuki Shindo, Yuji Matsumoto (2019) [paper] [code]
Entity Encoding
pre-trained: based on raw annotated text
- "Combining word and entity embeddings for entity linking" Jose G. Moreno, Romaric Besancon, Romain Beaumont (ESWC 2017) [paper] [code]
- "Robust and Collective Entity Disambiguation through Semantic Embeddings" Stefan Zwicklbauer, Christin Seifert, Michael Granitzer (SIGIR 2016) [paper] [code]
- "Jointly Embedding Entities and Text with Distant Supervision" Denis Newman-Griffis, Albert M Lai, Eric Fosler-Lussier (ACL 2018) [paper] [code]
- "Improving neural entity disambiguation with graph embeddings" Özge Sevgili, Alexander Panchenko, Chris Biemann (ACL 2019) [paper] [code]
joint encoding and ranking
- "Learning dense representations for entity retrieval" Daniel Gillick, Sayali Kulkarni, Larry Lansing (CoNLL 2019) [[paper]](./papers/Gillick et al. - 2019 - Learning dense representations for entity retrieval.pdf) [code]
- "Entity linking via joint encoding of types, descriptions, and context" Nitish Gupta, Sameer Singh, Dan Roth (EMNLP 2017) [paper] [code]
- "Zero-shot entity linking by reading entity descriptions" Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee (ACL 2019) [paper] [code]
Unlinkable Mention Prediction
no candidate
- "Neural cross-lingual entity linking" Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza (AAAI 2018) [paper] [code]
- "Cross-lingual Wikification Using Multilingual Embeddings" Chen-Tse Tsai, Dan Roth (NAACL 2016) [paper] [code]
threshold
- "Plato A Selective Context Model for Entity Resolution" Nevena Lazic, Amarnag Subramanya, Michael Ringgaard, Fernando Pereira (TACL 2015) [paper] [code]
- "Knowledge enhanced contextual word representations" Matthew E. Peters, Mark Neumann, Robert Logan (EMNLP 2019) [paper] [code]
NIL predictor
- "End-to-end neural entity linking" Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann (CoNLL 2018) [paper] [code]
separate model
- "Joint learning of named entity recognition and entity linking" Pedro Henrique Martins, Zita Marinho, André F. T. Martins (ACL 2019) [paper] [code]
- "Combining word and entity embeddings for entity linking" Jose G. Moreno, Romaric Besancon, Romain Beaumont (ESWC 2017) [paper] [code]
Modifications of the General Architecture
Joint Entity Recognition and Disambiguation Architecture
candidate based
- "End-to-end neural entity linking" Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann (CoNLL 2018) [paper] [code]
- "Knowledge enhanced contextual word representations" Matthew E. Peters, Mark Neumann, Robert Logan (EMNLP 2019) [paper] [code]
multitask learning
- "Joint learning of named entity recognition and entity linking" Pedro Henrique Martins, Zita Marinho, André F. T. Martins (ACL 2019) [paper] [code]
sequence labeling
- "Investigating entity knowledge in BERT with simple neural end-to-end entity linking" Samuel Broscheit (CoNLL 2019) [paper] [code]
Global Context Architecture
random walk based
- "Robust named entity disambiguation with random walks" Zhaochen Guo, Denilson Barbosa (SWJ 2016) [paper] [code]
- "Personalized page rank for named entity disambiguation" Maria Pershina, Yifan He, Ralph Grishman (NAACL 2015) [paper] [code]
- "Robust and Collective Entity Disambiguation through Semantic Embeddings" Stefan Zwicklbauer, Christin Seifert, Michael Granitzer (SIGIR 2016) [paper] [code]
maximization of CRF potentials
- "Deep joint entity disambiguation with local neural attention" Octavian-Eugen Ganea, Thomas Hofmann (EMNLP 2017) [paper] [code]
- "Improving entity linking by modeling latent relations between mentions" Phong Le, Ivan Titov (ACL 2018) [[paper]](./papers/Le, Titov - 2018 - Improving entity linking by modeling latent relations between mentions.pdf) [code]
sequential decision task
- "Joint Entity Linking with Deep Reinforcement Learning" Zheng Fang, Yanan Cao, Dongjie Zhang (2019) [paper] [code]
- "Learning dynamic context augmentation for global entity linking" Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang (EMNLP 2019) [paper] [code]
- "Global Entity Disambiguation with Pretrained Contextualized Embeddings of Words and Entities" Ikuya Yamada, Koki Washio, Hiroyuki Shindo, Yuji Matsumoto (2019) [paper] [code]
neural model component
- "Neural Collective Entity Linking" Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu (COLING 2018) [paper] [code]
- "Collective entity resolution with multi-focal attention" Amir Globerson, Nevena Lazic, Soumen Chakrabarti (ACL 2016) [paper] [code]
- "End-to-end neural entity linking" Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann (CoNLL 2018) [paper] [code]
larger context
- "Bridging text and knowledge by learning multi-prototype entity mention embedding" Yixin Cao, Lifu Huang, Heng Ji, Xu Chen, Juanzi Li (ACL 2017) [paper] [code]
- "Entity linking via joint encoding of types, descriptions, and context" Nitish Gupta, Sameer Singh, Dan Roth (EMNLP 2017) [paper] [code]
- "Knowledge enhanced contextual word representations" Matthew E. Peters, Mark Neumann, Robert Logan (EMNLP 2019) [paper] [code]
Domain Independent Architecture
distant learning
- "Distant learning for entity linking with automatic noise detection" Phong Le, Ivan Titov (ACL 2019) [paper] [code]
- "Boosting Entity Linking Performance by Leveraging Unlabeled Documents" Phong Le, Ivan Titov (ACL 2019) [paper] [code]
zero-shot
- "Learning dense representations for entity retrieval" Daniel Gillick, Sayali Kulkarni, Larry Lansing (CoNLL 2019) [[paper]](./papers/Gillick et al. - 2019 - Learning dense representations for entity retrieval.pdf) [code]
- "Zero-shot entity linking by reading entity descriptions" Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee (ACL 2019) [paper] [code]
- "Scalable Zero-shot Entity Linking with Dense Entity Retrieval" Ledell Wu, Fabio Petroni, Martin Josifoski (EMNLP 2020) [paper] [code]
Cross-lingual Architecture
representation based
- "Cross-lingual Name Tagging and Linking for 282 Languages" Xiaoman Pan, Boliang Zhang, Jonathan May (ACL 2017) [paper] [code]
- "Cross-lingual Wikification Using Multilingual Embeddings" Chen-Tse Tsai, Dan Roth (NAACL 2016) [paper] [code]
zero-shot
- "Neural cross-lingual entity linking" Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza (AAAI 2018) [paper] [code]
- "Joint multilingual supervision for cross-lingual entity linking" Shyam Upadhyay, Nitish Gupta, Dan Roth (EMNLP 2018) [paper] [code]
SOTA
AIDA-CoNLL
entity disambiguation
- "Entity-aware ELMo Learning Contextual Entity Representation for Entity Disambiguation" Hamed Shahbazi, Xiaoli Z. Fern, Reza Ghaeini (2019) [paper] [code] [Accuracy: 0.962]
- "Global Entity Disambiguation with Pretrained Contextualized Embeddings of Words and Entities" Ikuya Yamada, Koki Washio, Hiroyuki Shindo, Yuji Matsumoto (2019) [paper] [code] [Accuracy: 0.950]
- "Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation Models" Isaiah Onando Mulang’, Kuldeep Singh, Chaitali Prabhu (CIKM 2020) [paper] [code] [Accuracy: 0.9494]
- "DeepType Multilingual entity linking by neural type system evolution" Jonathan Raiman, Olivier Raiman (AAAI 2018) [paper] [code] [Accuracy: 0.9488]
- "Learning Distributed Representations of Texts and Entities from Knowledge Base" Ikuya Yamada, Hiroyuki Shindo, Hideaki Takeda (TACL 2017) [paper] [code] [Accuracy: 0.931]
entity linking
- "Autoregressive Entity Retrieval" Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni (2021) [paper] [code] [Micro-f1 strong: 0.837]
- "CHOLAN A modular approach for neural entity linking on wikipedia and wikidata" Manoj Prabhakar Kannan Ravi, Kuldeep Singh (EACL 2021) [paper] [code] [Micro-f1 strong: 0.831]
- "End-to-end neural entity linking" Nikolaos Kolitsas, Octavian-Eugen Ganea, Thomas Hofmann (CoNLL 2018) [paper] [code] [Micro-f1 strong: 0.826]
- "Investigating entity knowledge in BERT with simple neural end-to-end entity linking" Samuel Broscheit (CoNLL 2019) [paper] [code] [Micro-f1 strong: 0.793]
TAC 2010
- "Scalable Zero-shot Entity Linking with Dense Entity Retrieval" Ledell Wu, Fabio Petroni, Martin Josifoski (EMNLP 2020) [paper] [code] [Accuracy: 0.940]
- "Neural Collective Entity Linking" Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu (COLING 2018) [paper] [code] [Accuracy: 0.910]
- "DeepType Multilingual entity linking by neural type system evolution" Jonathan Raiman, Olivier Raiman (AAAI 2018) [paper] [code] [Accuracy: 0.909]
- "ELDEN Improved entity linking using densified knowledge graphs" Priya Radhakrishnan, Partha Talukdar, Vasudeva Varma (NAACL 2018) [paper] [code] [Accuracy: 0.896]
- "Entity disambiguation by knowledge and text jointly embedding" Wei Fang, Jianwen Zhang, Dilin Wang, Zheng Chen, Ming Li (CoNLL 2016) [paper] [code] [Accuracy: 0.889]
ACE 2004
- "Global Entity Disambiguation with Pretrained Contextualized Embeddings of Words and Entities" Ikuya Yamada, Koki Washio, Hiroyuki Shindo, Yuji Matsumoto (2019) [paper] [code] [Micro-f1: 0.919]
- "Entity linking via joint encoding of types, descriptions, and context" Nitish Gupta, Sameer Singh, Dan Roth (EMNLP 2017) [paper] [code]
- "Joint Entity Linking with Deep Reinforcement Learning" Zheng Fang, Yanan Cao, Dongjie Zhang (2019) [paper] [code] [Micro-f1: 0.912]
- "Autoregressive Entity Retrieval" Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni (2021) [paper] [code] [Micro-f1: 0.901]
- "Learning dynamic context augmentation for global entity linking" Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang (EMNLP 2019) [paper] [code] [Micro-f1: 0.901]
AQUAINT
- "Global Entity Disambiguation with Pretrained Contextualized Embeddings of Words and Entities" Ikuya Yamada, Koki Washio, Hiroyuki Shindo, Yuji Matsumoto (2019) [paper] [code] [Micro-f1: 0.935]
- "Boosting Entity Linking Performance by Leveraging Unlabeled Documents" Phong Le, Ivan Titov (ACL 2019) [paper] [code] [Micro-f1: 0.907]
- "Autoregressive Entity Retrieval" Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni (2021) [paper] [code] [Micro-f1: 0.899]
- "Deep joint entity disambiguation with local neural attention" Octavian-Eugen Ganea, Thomas Hofmann (EMNLP 2017) [paper] [code] [Micro-f1: 0.885]
- "Improving entity linking by modeling latent relations between mentions" Phong Le, Ivan Titov (ACL 2018) [[paper]](./papers/Le, Titov - 2018 - Improving entity linking by modeling latent relations between mentions.pdf) [code]
MSNBC
- "Global Entity Disambiguation with Pretrained Contextualized Embeddings of Words and Entities" Ikuya Yamada, Koki Washio, Hiroyuki Shindo, Yuji Matsumoto (2019) [paper] [code] [Micro-f1: 0.963]
KILT
- "Autoregressive Entity Retrieval" Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni (2021) [paper] [code]
- "Scalable Zero-shot Entity Linking with Dense Entity Retrieval" Ledell Wu, Fabio Petroni, Martin Josifoski (EMNLP 2020) [paper] [code]
- "KILT a Benchmark for Knowledge Intensive Language Tasks" Fabio Petroni, Aleksandra Piktus, Angela Fan (NAACL 2021) [paper] [code]
DuEL
- "Overview of the CCKS 2019 Knowledge Graph Evaluation Track Entity" Xianpei Han, Zhichun Wang, Jiangtao Zhang (CCKS 2019) [paper] [code]
Evaluation & Datasets
- "General entity annotator benchmarking framework" Ricardo Usbeck, Michael Röder (WWW 2015) [paper]
- "Robust disambiguation of named entities in text" Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino (EMNLP 2011) [paper] AIDA
- "Overview of the TAC 2010 Knowledge Base Population Track" Heng Ji, Ralph Grishman, Hoa Trang Dang [paper] TAC KBP 2010
- "Large-Scale Named Entity Disambiguation Based on Wikipedia Data" Silviu Cucerzan (EMNLP 2007) [paper] MSNBC
- "Learning to link with wikipedia" David Milne, Ian H. Witten (CIKM 2008) [[paper]](./papers/Milne, Witten - 2008 - Learning to link with wikipedia.pdf) AQUAINT
- "Local and Global Algorithms for Disambiguation to Wikipedia" Lev Ratinov, Dan Roth, Doug Downey, Mike Anderson (ACL 2011) [paper] ACE2004
Comments
If you find any errors in the above information, please post it in Issues. Pull requests are welcomed for adding papers.
If you would like the downloaded paper collection, please leave your email address in the comments 📮