Event Extraction Literature Review (2020-2021)

Andy Dennis

Last modified: 2022-07-25 18:25:51

Category: Literature Reading. Tags: natural language processing, machine learning, data mining, event extraction

First published: 2022-07-15 19:19:51

Article link: https://blog.csdn.net/weixin_43850253/article/details/125810944

Preface

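I previously worked on event extraction (a subfield of information extraction, which is itself a small area of NLP) and compiled notes on the literature along the way.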

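Event Extraction Literature Review (2020-2021)
Event Extraction Literature Review (2019)
Event Extraction Literature Review (2018)
Event Extraction Literature Review (2008-2017)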

Model Overview
Figure from: A Compact Survey on Event Extraction: Approaches and Applications

At the time I also read this write-up: "A Survey of Event Extraction in NLP (Part 2): Models" (NLP 事件抽取综述（中）：模型篇)

A $ after a model name means the authors have released code.

Papers

2021

Gen-arg $

Document-Level Event Argument Extraction by Conditional Generation (aclanthology.org)

Uses the BART model, but after reading the official source code I feel it is incomplete.

BRAD

Event Extraction from Historical Texts: A New Dataset for Black Rebellions (aclanthology.org)
No official source code.
The main contribution of the paper is a new dataset, although the paper gives no public link to it.
a corpus of nineteenth-century African American newspapers.
Our dataset features 5 entity types, 12 event types, and 6 argument roles that concern slavery and black movements between the eighteenth and nineteenth centuries.

TEXT2EVENT $

Paper: https://aclanthology.org/2021.acl-long.217.pdf
Code: luyaojie/Text2Event (github.com)
May be a useful reference for how to use a schema to constrain the decoding process.
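
To make the idea of schema-constrained decoding concrete, here is a minimal sketch. It is not the Text2Event implementation: the schema, the `build_trie` and `constrained_greedy_decode` helpers, and the toy scoring function are all hypothetical, and a real model would supply scores from its decoder. The point is only that at each step the candidate tokens are restricted to those the schema trie allows.

```python
def build_trie(sequences):
    """Build a prefix trie (nested dicts) over the allowed label sequences."""
    trie = {}
    for seq in sequences:
        node = trie
        for tok in seq:
            node = node.setdefault(tok, {})
        node["<end>"] = {}  # marks that a valid sequence may stop here
    return trie


def constrained_greedy_decode(score_fn, trie, max_len=10):
    """Greedy decoding that only considers tokens allowed by the trie."""
    prefix, node = [], trie
    for _ in range(max_len):
        allowed = list(node.keys())
        if not allowed:
            break
        scores = score_fn(prefix)
        # Pick the best-scoring token among the schema-valid ones only.
        best = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        if best == "<end>":
            break
        prefix.append(best)
        node = node[best]
    return prefix


# Hypothetical schema: event type followed by one of its argument roles.
schema_sequences = [
    ("Attack", "Attacker"),
    ("Attack", "Target"),
    ("Transaction", "Buyer"),
]


def toy_score_fn(prefix):
    # Stand-in for model scores; "Buyer" scores high but is invalid after "Attack".
    return {"Attack": 0.9, "Transaction": 0.5, "Buyer": 0.8,
            "Attacker": 0.3, "Target": 0.6, "<end>": 0.1}


decoded = constrained_greedy_decode(toy_score_fn, build_trie(schema_sequences))
print(decoded)
```

Without the constraint, the high-scoring but schema-invalid token "Buyer" could follow "Attack"; with the trie it is never considered.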

CasEE $

CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction (aclanthology.org)
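Code: JiaweiSheng/CasEE: Source code for ACL 2021 finding paper: CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction (github.com)
Targets Chinese text.
I tried setting up the environment and ran into no problems; the code runs.
I also skimmed through the code once.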

This paper builds on CasRel (arxiv.org), a model for relational triple extraction, and transfers that paradigm to event extraction.

CasEE architecture:

Uses CLN (Conditional Layer Normalization) and MSA (Multi-Head Self-Attention).

Uses two pointers (start position and end position). The drawback is that the thresholds have to be set by hand: "We select tokens with $\hat{t}^{sc}_i > \xi_2$ as the start positions, and those with $\hat{t}^{ec}_i > \xi_3$ as end positions, where $\xi_2, \xi_3 \in [0, 1]$ are scalar thresholds."
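
The thresholded pointer decoding quoted above can be sketched as follows. This is an illustration, not the CasEE code: the function name `decode_spans` is hypothetical, and pairing each start with the nearest end at or after it is a common heuristic that I am assuming here rather than something the quote specifies.

```python
def decode_spans(start_probs, end_probs, start_threshold=0.5, end_threshold=0.5):
    """Turn per-token start/end probabilities into (start, end) spans.

    Tokens whose probability exceeds the threshold are kept as candidate
    start or end positions; each start is then paired with the nearest
    end at or after it (assumed pairing heuristic).
    """
    starts = [i for i, p in enumerate(start_probs) if p > start_threshold]
    ends = [i for i, p in enumerate(end_probs) if p > end_threshold]
    spans = []
    for s in starts:
        candidates = [e for e in ends if e >= s]
        if candidates:
            spans.append((s, candidates[0]))
    return spans


start_probs = [0.1, 0.9, 0.2, 0.7, 0.1]
end_probs = [0.1, 0.2, 0.8, 0.1, 0.6]
print(decode_spans(start_probs, end_probs))        # thresholds of 0.5
print(decode_spans(start_probs, end_probs, 0.85, 0.85))  # stricter thresholds
```

The second call shows why the hand-set thresholds matter: raising them to 0.85 leaves one start and no end above the threshold, so no span is produced at all.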

Argument classification also applies a type_soft_constrain operation:

p_s = torch.sigmoid(self.head_cls(inp))  # [b, t, l]
p_e = torch.sigmoid(self.tail_cls(inp))

type_soft_constrain = torch.sigmoid(self.gate_linear(type_emb))  # [b, l]
type_soft_constrain = type_soft_constrain.unsqueeze(1).expand_as(p_s)
p_s = p_s * type_soft_constrain
p_e = p_e * type_soft_constrain

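Different parts of the model use different learning rates. The snippet itself uses get_linear_schedule_with_warmup; for get_cosine_schedule_with_warmup, see this example: "Sentiment analysis with the BERT family, PyTorch implementation (ing)" (情感分析bert家族 pytorch实现(ing))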

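# AdamW with correct_bias is the (now deprecated) transformers implementation, not torch.optim.AdamW
from transformers import AdamW, get_linear_schedule_with_warmup

def set_learning_setting(self, config, train_loader, dev_loader, model):
    instances_num = len(train_loader.dataset)
    # Total number of optimizer steps over all epochs
    train_steps = int(instances_num * config.epochs_num / config.batch_size) + 1

    print("Batch size: ", config.batch_size)
    print("The number of training instances:", instances_num)
    print("The number of evaluating instances:", len(dev_loader.dataset))

    # Split the parameters into the BERT encoder and everything else
    bert_params = list(map(id, model.bert.parameters()))
    other_params = filter(lambda p: id(p) not in bert_params, model.parameters())
    # BERT falls back to the default lr (config.lr_bert); the task layers use config.lr_task
    optimizer_grouped_parameters = [
        {'params': model.bert.parameters()},
        {'params': other_params, 'lr': config.lr_task},
    ]

    optimizer = AdamW(optimizer_grouped_parameters, lr=config.lr_bert, correct_bias=False)
    # Warm up for train_steps * config.warmup steps, then decay linearly
    scheduler = get_linear_schedule_with_warmup(optimizer, num_warmup_steps=train_steps * config.warmup, num_training_steps=train_steps)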
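
To see what the linear warmup schedule does to the two parameter groups, here is a small library-free sketch. The multiplier formula is, to my understanding, the one used by get_linear_schedule_with_warmup; the helper names `linear_warmup_multiplier` and `group_learning_rates` and the example learning rates are my own assumptions, not values from the CasEE config.

```python
def linear_warmup_multiplier(step, warmup_steps, total_steps):
    """Multiplier applied to each group's base lr at a given step."""
    if step < warmup_steps:
        # Linear ramp from 0 to 1 during warmup
        return step / max(1, warmup_steps)
    # Linear decay from 1 to 0 over the remaining steps
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


def group_learning_rates(step, base_lrs, warmup_steps, total_steps):
    """Effective lr of every parameter group at a given step."""
    m = linear_warmup_multiplier(step, warmup_steps, total_steps)
    return {name: lr * m for name, lr in base_lrs.items()}


# Hypothetical base rates: a small one for BERT, a larger one for task layers.
base_lrs = {"bert": 2e-5, "task": 1e-4}
for step in (0, 5, 10, 55, 100):
    print(step, group_learning_rates(step, base_lrs, warmup_steps=10, total_steps=100))
```

Both groups share the same schedule shape; only the base rate differs, so the task-specific layers always train with a proportionally larger step than the pretrained encoder.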

CLEVE $

CLEVE: Contrastive Pre-training for Event Extraction (aclanthology.org)
Code: THU-KEG/CLEVE (github.com)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Graph Isomorphism Network
Here we use a state-of-the-art GNN model, Graph Isomorphism Network (Xu et al., 2019), as our graph encoder for its strong representation ability.

FEAE

Trigger is Not Sufficient: Exploiting Frame-aware Knowledge for Implicit Event Argument Extraction (aclanthology.org)
No official source code.

MRC-based Argument Extraction
Teacher-student Framework

GIT $

Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker (aclanthology.org)
Source code: Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker (aclanthology.org)

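In their AI Drive talk on GIT, the authors also mentioned that training is not end-to-end from the start: the model is first fed gold labels, which are then gradually replaced with the model's own outputs.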
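
The gradual switch from gold labels to model outputs resembles scheduled sampling. A minimal sketch follows; the linear decay and the helper names `gold_probability` and `choose_inputs` are my assumptions, since the talk as summarized here only says the replacement is gradual.

```python
import random


def gold_probability(epoch, total_epochs):
    """Probability of feeding the gold label; decays linearly from 1 to 0 (assumed schedule)."""
    if total_epochs <= 1:
        return 0.0
    return max(0.0, 1.0 - epoch / (total_epochs - 1))


def choose_inputs(gold, predicted, gold_prob, rng):
    """Per item, feed the gold label with probability gold_prob, else the model's prediction."""
    return [g if rng.random() < gold_prob else p for g, p in zip(gold, predicted)]


rng = random.Random(0)
gold = ["EquityHolder", "FrozeShare", "StartDate"]
predicted = ["EquityHolder", "StartDate", "StartDate"]
for epoch in range(5):
    p = gold_probability(epoch, total_epochs=5)
    print(epoch, round(p, 2), choose_inputs(gold, predicted, p, rng))
```

Early epochs see only gold labels, which stabilizes the downstream modules; by the last epoch the model consumes only its own predictions, matching the inference-time setting.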

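The tracker does not run in parallel, which makes it the speed bottleneck of the model. In addition, the order in which arguments are extracted has to be defined in advance.
For example, Equity Freeze here requires manually defining Equity Holder -> FrozeShare -> StartDate…
Whether a given order is good or bad only becomes clear after training.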

The GitHub repository uses a financial dataset.

NoFPFN $

Revisiting the Evaluation of End-to-end Event Extraction (aclanthology.org)
Source code: dolphin-zs/Doc2EDAG (github.com)

Reinforcement learning: "to support diverse preferences of evaluation metrics motivated by different scenarios, we propose a new training paradigm based on reinforcement learning for a typical end-to-end EE model"

GATE $

GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction (arxiv.org)

Ahmad, W. U., Peng, N., & Chang, K.-W. (2021). GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12462-12470. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17478

Source code: wasiahmad/GATE (github.com)
Cross-lingual.

DualQA

What the Role is vs. What Plays the Role: Semi-Supervised Event Argument Extraction via Dual Question Answering | Proceedings of the AAAI Conference on Artificial Intelligence
No official source code.

GRIT $

GRIT: Generative Role-filler Transformers for Document-level Event Entity Extraction (aclanthology.org)

Source code: xinyadu/grit_doc_event_entity (github.com)

Event Entity Extraction

Partially causal masking strategy
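
As I understand the partially causal masking strategy, source (document) tokens attend to each other bidirectionally but never to target tokens, while target tokens attend to all source tokens and to earlier target tokens only. The sketch below builds such a mask with plain lists; the function name `partially_causal_mask` is hypothetical and this is not taken from the GRIT code.

```python
def partially_causal_mask(src_len, tgt_len):
    """Return an (n x n) 0/1 mask where mask[i][j] == 1 means i may attend to j.

    Positions 0..src_len-1 are source tokens, the rest are target tokens.
    """
    n = src_len + tgt_len
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j < src_len:
                # Every position may attend to every source token.
                mask[i][j] = 1
            elif i >= src_len and j <= i:
                # Target tokens attend causally to themselves and earlier targets.
                mask[i][j] = 1
    return mask


for row in partially_causal_mask(src_len=2, tgt_len=2):
    print(row)
```

The upper-right block stays zero, so the encoding of the document cannot leak information from the role fillers being generated.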

Wen et al.

Event Time Extraction and Propagation via Graph Attention Networks (aclanthology.org)
No official source code.

2020

SciBERT $

Biomedical Event Extraction as Multi-turn Question Answering (aclanthology.org)
Source code: allenai/scibert: A BERT model for scientific text. (github.com)

Biomedical event extraction
describing specific relationships between multiple molecular entities, such as genes, proteins, or cellular components

Visualization tool: BioNLP Shared Task 2011: Supporting Resources (aclanthology.org)

Model architecture diagram:

Du et al. $

Event Extraction by Answering (Almost) Natural Questions (aclanthology.org)
Source code: xinyadu/eeqa: Event Extraction by Answering (Almost) Natural Questions (github.com)
A classmate (ll) is using this paper, so I am setting it aside for now to hear what he has to say.

Min et al.

Towards Few-Shot Event Mention Retrieval: An Evaluation Framework and A Siamese Network Approach (aclanthology.org)
No official source code.

Sample pairs that are both in the query, and assign them the same class label.
Sample pairs such that one of them is in the query but the other is not, and assign this pair the not in same class label.
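
The two sampling rules above can be sketched as below. The function name `sample_pairs`, the label encoding (1 for same class, 0 for not in same class), and the number of negative pairs are my assumptions for illustration; the paper's actual sampling ratios are not given in these notes.

```python
import itertools
import random


def sample_pairs(query_mentions, other_mentions, num_negative, rng):
    """Build training pairs for a Siamese network.

    Positive pairs: both mentions come from the query (label 1).
    Negative pairs: one mention from the query, one from outside it (label 0).
    """
    positives = [(a, b, 1) for a, b in itertools.combinations(query_mentions, 2)]
    negatives = [
        (rng.choice(query_mentions), rng.choice(other_mentions), 0)
        for _ in range(num_negative)
    ]
    return positives + negatives


rng = random.Random(0)
query = ["attack on the fort", "raid at dawn", "assault on the town"]
others = ["company acquired", "minister resigned"]
for pair in sample_pairs(query, others, num_negative=4, rng=rng):
    print(pair)
```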

Chen et al.

Reading the Manual: Event Extraction as Definition Comprehension (aclanthology.org)
No official source code.

Mainly targets zero-shot and few-shot settings.
I have not yet understood the Approach section…
Trigger classification: 72.9
Argument classification: 42.4

EEGCN $

Edge-Enhanced Graph Convolution Networks for Event Detection with Syntactic Relation (aclanthology.org)

Source code: cuishiyao96/eegcned (github.com)

The graph model is somewhat unusual:
Edge-Aware Node Update Module first aggregates information from neighbors of each node through specific edge, and Node-Aware Edge Update module refines the edge representation with its connected nodes.