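Preface
I have been studying event extraction (a subfield of information extraction, which is itself a small area of NLP) and have previously compiled some literature notes.
This post is a follow-up to Event Extraction Literature Review (2020-2021).
Event Extraction Literature Review (2020-2021)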
+
Event Extraction Literature Review (2019)
+
Event Extraction Literature Review (2018)
+
Event Extraction Literature Review (2008-2017)
A $ after a model name means the authors released code.
Papers
2019
Liu et al. $
Neural Cross-Lingual Event Detection with Minimal Parallel Resources (aclanthology.org)
Source code: facebookresearch/MUSE (github.com)
Cross-lingual
Doc2EDAG
Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction (tsinghua.edu.cn)
Source code: dolphin-zs/Doc2EDAG (github.com)
Uses three Transformer blocks.
Ananya et al.
Cross-lingual Structure Transfer for Relation and Event Extraction (aclanthology.org)
No official source code
Cross-lingual
Chau et al.
Open-domain Event Extraction and Embedding for Natural Gas Market Prediction (arxiv.org)
Source code: minhtriet/gas_market (github.com)
The goal of this work is to predict the natural gas market.
GAIL-ELMo
Joint Entity and Event Extraction with Generative Adversarial Imitation Learning (illinois.edu)
No official source code
GAN
ODEE-FER $
Open Domain Event Extraction Using Neural Latent Variable Models (aclanthology.org)
Source code: lx865712528/ACL2019-ODEE (github.com)
DYGIE++ $
Entity, Relation, and Event Extraction with Contextualized Span Representations (aclanthology.org)
Source code: dwadden/dygiepp (github.com)
Uses AllenNLP
BERT, multi-task, graph
HMEAE $
[HMEAE: Hierarchical Modular Event Argument Extraction (aclanthology.org)](https://aclanthology.org/D19-1584.pdf)
Source code: thunlp/HMEAE (github.com), TensorFlow
This repository also contains DMCNN; for DMBERT, see Bakser/DMBERT: A temporary repo to share the DMBERT code for Event Detection (github.com), PyTorch
An AC F1 of 59.3 is quite surprising.
Han et al.
Joint Event and Temporal Relation Extraction with Shared Representations and Structured Prediction (aclanthology.org)
No official source code
BERT + BiLSTM
PLMEE
Exploring Pre-trained Language Models for Event Extraction and Generation (aclanthology.org)
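Uses BERT
This is not the official code: boy56/PLMEE (github.com); I have not found an official release yet. The unofficial boy56/PLMEE code uses the AllenNLP library, which makes it heavily encapsulated and harder to borrow individual modules from.
The paper addresses the EE problem and proposes the PLMEE model, which consists of two parts: an event extraction model and a generation model. Both parts use pre-trained language models to bring in richer knowledge.
The loss weights are redistributed according to how important each role is to the given event type.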
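The role-importance reweighting described above can be sketched as follows. This is a minimal illustration, assuming a TF-IDF-style score (role frequency within an event type times inverse event frequency across types) normalized with a softmax. The function names `role_importance` and `weighted_loss`, the count dictionary, and the exact formula are assumptions made for illustration and may differ from the paper's definition.

```python
import math
from collections import defaultdict


def role_importance(role_counts):
    """Map {event_type: {role: count}} to {event_type: {role: weight}}.

    Assumed scoring (hypothetical, TF-IDF style):
      score(role, type) = role frequency in type * log(#types / #types containing role)
    Weights for one event type are a softmax over these scores.
    """
    # Drop zero counts so every remaining role appears in at least one type.
    cleaned = {
        event_type: {r: c for r, c in roles.items() if c > 0}
        for event_type, roles in role_counts.items()
    }
    cleaned = {t: roles for t, roles in cleaned.items() if roles}
    num_types = len(cleaned)

    # Count in how many event types each role occurs.
    types_with_role = defaultdict(int)
    for roles in cleaned.values():
        for role in roles:
            types_with_role[role] += 1

    weights = {}
    for event_type, roles in cleaned.items():
        total = sum(roles.values())
        scores = {
            role: (count / total) * math.log(num_types / types_with_role[role])
            for role, count in roles.items()
        }
        # Softmax so the weights of one event type sum to 1.
        norm = sum(math.exp(s) for s in scores.values())
        weights[event_type] = {r: math.exp(s) / norm for r, s in scores.items()}
    return weights


def weighted_loss(per_role_losses, role_weights):
    """Combine per-role losses using the importance weights of one event type."""
    return sum(role_weights[role] * loss for role, loss in per_role_losses.items())


# Made-up counts, for illustration only.
counts = {
    "Attack": {"Attacker": 30, "Target": 50, "Place": 20},
    "Transport": {"Artifact": 60, "Place": 40},
}
weights = role_importance(counts)
print(weights["Attack"])
```

In this toy setup, a role such as `Place` that appears in every event type gets the lowest weight within a type, while roles specific to one type contribute more to the loss.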
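The EE model first extracts triggers, then extracts arguments and assigns each argument its role label. This is a pipeline: the loss sits after the argument extractor and does not directly optimize trigger extraction, so error propagation may occur.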
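To make the error-propagation concern concrete, here is a toy sketch of a two-stage pipeline. It is not the PLMEE model: the lexicon-based extractors, the function names, and the example sentence are all hypothetical stand-ins for the BERT-based components, chosen only to show that the second stage never runs for a trigger the first stage misses.

```python
def extract_triggers(tokens, trigger_lexicon):
    """Stage 1: return (index, event_type) for every token found in the lexicon."""
    return [
        (i, trigger_lexicon[tok])
        for i, tok in enumerate(tokens)
        if tok in trigger_lexicon
    ]


def extract_arguments(tokens, trigger, role_rules):
    """Stage 2: label tokens with roles, conditioned on one predicted trigger."""
    _, event_type = trigger
    rules = role_rules.get(event_type, {})
    return [(i, rules[tok]) for i, tok in enumerate(tokens) if tok in rules]


def run_pipeline(tokens, trigger_lexicon, role_rules):
    """Run stage 2 once per trigger predicted by stage 1."""
    events = []
    for trigger in extract_triggers(tokens, trigger_lexicon):
        events.append(
            {
                "trigger": trigger,
                "arguments": extract_arguments(tokens, trigger, role_rules),
            }
        )
    return events


# Made-up example data, for illustration only.
tokens = ["troops", "attacked", "the", "village"]
role_rules = {"Attack": {"troops": "Attacker", "village": "Target"}}

# Stage 1 finds the trigger, so stage 2 can label its arguments.
print(run_pipeline(tokens, {"attacked": "Attack"}, role_rules))

# Stage 1 misses the trigger, so no arguments are extracted at all.
print(run_pipeline(tokens, {}, role_rules))
```

A joint model would instead share a training signal between the two stages, which is the usual motivation for moving away from a strict pipeline.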
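In addition, as the authors themselves note, the trigger extraction and argument extraction modules use the BERT embeddings directly, without modeling relations among triggers or among arguments. The generation module rewrites adjunct tokens, which may change the meaning of the original sentence, so it faces a role deviation problem.
A CSDN blog post with reading notes on this paper: 【论文解读 ACL 2019 | PLMEE】Exploring Pre-trained Language Models for Event Extraction and Generation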
AEM
Open Event Extraction from Online Text using a Generative Adversarial Network (aclanthology.org)
No official source code
GAN
JointTransition $
Extracting Entities and Events as a Single Task Using a Transition-Based Neural Model (ijcai.org)
Source code: zjcerwin/TransitionEvent (github.com)
BERT, BiLSTM
Li et al.
Biomedical Event Extraction based on Knowledge-driven Tree-LSTM (aclanthology.org)
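No official source code. I would not really recommend attempting this one, since you would also have to find Tree-LSTM code yourself.
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks (stanford.edu)
Tree-LSTM source code: stanfordnlp/treelstm (github.com), though it is not implemented in PyTorch
Tree-LSTM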
MLM-Joint
Provides Weili-NLP/EventSchemasBasedOnFrameNet (github.com), but no source code
Joint3EE
One for all: Neural joint modeling of entities and events (arXiv)
No official source code
Chan et al. $
Rapid Customization for Event Extraction (aclanthology.org)
Our system (code, UI, documentation, demonstration video) is released as open source.
Source code: BBN-E/Rapid-customization-events-acl19 (github.com)
Position embeddings (PE): $PE_t$ encodes the relative distance of each word to the trigger word, and $PE_a$ encodes the relative distance of each word to the candidate argument.
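The two position features described above can be sketched as index sequences that would then be looked up in an embedding table. This is a minimal sketch: the function name `relative_position_ids`, the clipping to `max_distance`, and the shift to non-negative indices are assumptions made for illustration, not details taken from the paper or its released code.

```python
def relative_position_ids(num_tokens, anchor_index, max_distance=10):
    """Return one embedding-table index per token.

    Each index is the signed distance from the token to the anchor
    (trigger or candidate argument), clipped to [-max_distance, max_distance]
    and shifted by max_distance so that all indices are non-negative.
    """
    ids = []
    for i in range(num_tokens):
        distance = i - anchor_index
        clipped = max(-max_distance, min(max_distance, distance))
        ids.append(clipped + max_distance)
    return ids


# Hypothetical 5-token sentence: trigger at index 1, candidate argument at index 3.
pe_t_ids = relative_position_ids(5, anchor_index=1)  # distances to the trigger
pe_a_ids = relative_position_ids(5, anchor_index=3)  # distances to the argument
print(pe_t_ids)
print(pe_a_ids)
```

With this scheme an embedding table of size `2 * max_distance + 1` covers every possible index, and the trigger-relative and argument-relative vectors can be concatenated with the word embedding of each token.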