Entity-Relation Extraction as Multi-turn Question Answering
- Problems identified in the field
- task formalization level
- Triples themselves have limited knowledge-expression power; e.g., in the Musk case, the hierarchical dependency among time, location, position and person needs to be expressed in a higher-dimensional space
- algorithm level
- Input: a raw sentence with two marked mentions
Output: whether a relation holds between the two mentions
- hard for neural models to capture all the lexical, semantic and syntactic cues in this formalization
- (1) entities are far away;
(2) one entity is involved in **multiple triplets**;
or (3) relation spans have overlaps
- related work
- Extracting Entities and Relations
- pipelined approach
- Pros: **flexibility** of integrating different data sources and learning algorithms
- Cons: suffers significantly from error propagation
- joint approach
- through various dependencies
- constraints solved by integer linear programming
- card-pyramid parsing
- global probabilistic graphical models
- structured perceptron with efficient beamsearch
- table-filling approach
- search orders in decoding and global features
- shared parameters: end-to-end approaches that extract entities and their relations using neural network models
- neural tagging model; multi-class classification model based on tree LSTMs
- multi-level attention CNNs
- seq2seq models to generate entity-relation triples
- reinforcement learning or Minimum Risk Training
- a global loss function to jointly train the two models under the framework of Minimum Risk Training
- hierarchical reinforcement learning
- Machine Reading Comprehension, predicting answer spans given context
- Mainly extracts text spans in passages given queries
- One line of work simplifies this into two multi-class classification tasks (predicting the answer's start and end positions)
- Another line, for multi-passage MRC: directly concatenate the passages, or first rank the passages and then run single-passage MRC on the selected passage
- Also useful: pretraining methods like BERT or ELMo
- Trend: a tendency of casting non-QA NLP tasks as QA tasks
- Specific examples:
- BiDAF, QANet
- This paper's work
- Source of inspiration
- Levy et al. (2017) and McCann et al. (2018): identify the relation between two predefined entities, formalizing relation extraction as a single-turn QA task
- Idea
- model hierarchical tag dependency in multi-turn QA, identifying answer spans from the context
each entity type and relation type is characterized by a question answering template, and entities and relations are extracted by answering template questions
- the question query encodes important information for the entity/relation class we want to identify
- jointly modeling entity and relation
- exploit the well developed machine reading comprehension (MRC) models
- multi-step reasoning to construct entity dependencies
- advantages
- capture the **hierarchical dependency of tags**, progressively obtaining the entities we need for the next turn, closely akin to **multi-turn slot-filling dialogue systems**
- the question query encodes important **prior information** for the relation class we want to identify
- the QA framework provides a natural way to simultaneously extract entities and relations: most MRC models support outputting special NONE tokens, indicating that there is no answer to the question
- dataset
- ACE04, ACE05 and the CoNLL04 corpora
- a newly developed Chinese dataset, RESUME:
extracts biographical information of individuals from raw texts; constructing a structured knowledge base from RESUME requires four or five turns of QA
- Key property: one person can work for **different** companies during **different** periods of time, and can hold **different** positions in **different** periods at the **same** company
- model
- Decomposed into two sub-tasks: a multi-answer task for head-entity extraction + a single-answer task for joint relation and tail-entity extraction
- Stage 1: head-entity extraction. To extract this starting entity, **each entity type is transformed into a question** using EntityQuesTemplates
- Entities extracted at this stage are not necessarily head entities
- Stage 2: relation and tail-entity extraction. A relation chain is defined for multi-turn QA, since some extractions depend on others
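The two-stage procedure above can be sketched as a control-flow skeleton. Everything here is illustrative: `mrc_extract` stands in for whatever MRC span extractor is used, and the template wordings and relation names are assumptions, not the paper's actual templates.

```python
# Hypothetical sketch of the two-stage multi-turn QA extraction loop.
# `mrc_extract` stands in for an MRC span extractor (e.g. a BERT tagger);
# the templates below are illustrative, not the paper's exact wording.

ENTITY_QUES_TEMPLATES = {
    "person": "Who is mentioned in the text?",
}
RELATION_QUES_TEMPLATES = {
    ("person", "work_for", "company"): "Which companies did {head} work for?",
}

def extract_triples(context, mrc_extract):
    triples = []
    # Stage 1: head-entity extraction (multi-answer QA).
    for head_type, head_question in ENTITY_QUES_TEMPLATES.items():
        for head in mrc_extract(head_question, context, multi_answer=True):
            # Stage 2: joint relation + tail-entity extraction (single-answer QA).
            for (h_type, rel, tail_type), tmpl in RELATION_QUES_TEMPLATES.items():
                if h_type != head_type:
                    continue
                tail = mrc_extract(tmpl.format(head=head), context)
                if tail is not None:  # a NONE answer means the relation does not hold
                    triples.append((head, rel, tail))
    return triples
```

A toy rule-based `mrc_extract` is enough to exercise the control flow; a real system would plug in the BERT-based tagger here.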
- Generating Questions using Templates
- type-specific
- natural language questions or pseudo-questions
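A hypothetical illustration of the two question styles (the wordings are made up for illustration, not taken from the paper): a pseudo-question just glues type tags to the head entity, while a natural-language question spells the query out.

```python
# Illustrative contrast between pseudo-questions and natural language
# questions; the exact wordings here are assumptions.

NATURAL_TEMPLATES = {
    "position": "What position does {head} hold?",
}

def pseudo_question(head, tag):
    # Pseudo-question: entity mention glued to the type/relation tag.
    return f"{head}; {tag}"

def natural_question(head, tag):
    # Natural-language question: richer, more fine-grained semantics.
    return NATURAL_TEMPLATES[tag].format(head=head)
```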
- Extracting Answer Spans via MRC
- Backbone: BERT. Traditional MRC models are adapted for the multi-turn QA setting: the model predicts a BMEO (beginning, middle, ending and outside) label for each token
- Training and Test
- $\mathcal{L} = (1 - \lambda)\,\mathcal{L}(\text{head-entity}) + \lambda\,\mathcal{L}(\text{tail-entity, rel})$
- The two stages share parameters during training; at test time, head-entities and tail-entities are extracted separately; λ controls the tradeoff between the two sub-tasks
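The joint objective is a simple convex combination; a minimal sketch with plain floats (the same expression applies unchanged to framework tensors):

```python
# Sketch of the joint training objective:
# L = (1 - lambda) * L(head-entity) + lambda * L(tail-entity, rel)

def joint_loss(loss_head, loss_tail_rel, lam):
    """lam in [0, 1] trades head-entity extraction off against
    joint relation/tail-entity extraction."""
    assert 0.0 <= lam <= 1.0
    return (1.0 - lam) * loss_head + lam * loss_tail_rel
```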
- Reinforcement Learning
- Answers extracted in one turn also affect downstream turns, and therefore later accuracies
- Since reinforcement learning has produced good results for multi-turn dialogue generation, it is adopted here as well
(Mrkšić et al., 2015; Li et al., 2016; Wen et al., 2016)
- action:selecting a text span in each turn
- policy: probability of selecting a certain span given the question and the context
$p(y(w_1, \ldots, w_n) = \text{answer} \mid \text{question}, s) = p(w_1 = \mathrm{B}) \times p(w_n = \mathrm{E}) \prod_{i \in [2, n-1]} p(w_i = \mathrm{M})$
- reward: for a given sentence, the number of correctly retrieved triples serves as the reward; the objective maximizes the expected reward $E_\pi[R(w)]$, approximated by sampling from the policy $\pi$
- gradient computed via the likelihood-ratio trick:
$\nabla E(\theta) \approx [R(w) - b] \nabla \log \pi(y(w) \mid \text{question}, s)$
where $b$ is the baseline value (the average of all previous rewards); each correctly answered turn earns a reward of +1, and the final reward is the reward accumulated over all turns
- policy network initialization: the pre-trained head-entity and tail-entity extraction model
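The two RL ingredients above, the span probability under BMEO labels and the baseline-corrected reward weight, can be sketched as follows. The data layout is an assumption: `token_probs` is a list of per-token dicts over the BMEO labels.

```python
import math

# Sketch of the RL pieces: span probability under BMEO tagging and the
# likelihood-ratio weight (R(w) - b). Function names are illustrative.

def span_log_prob(token_probs):
    """log p(span) = log p(w1=B) + log p(wn=E) + sum_i log p(wi=M)."""
    logp = math.log(token_probs[0]["B"]) + math.log(token_probs[-1]["E"])
    for dist in token_probs[1:-1]:
        logp += math.log(dist["M"])
    return logp

def reinforce_weight(reward, past_rewards):
    """Weight for the log-prob gradient: (R(w) - b), with the baseline b
    set to the average of previously observed rewards."""
    b = sum(past_rewards) / len(past_rewards) if past_rewards else 0.0
    return reward - b
```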
- experience replay strategy:for each batch, half of the examples are simulated and the other half is randomly selected from previously generated examples.
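The replay strategy can be sketched as: each batch is half freshly simulated episodes and half samples from a buffer of earlier episodes (names and signatures are illustrative).

```python
import random

# Toy sketch of the experience-replay batching described above.

def make_batch(simulate, replay_buffer, batch_size, rng=random):
    """Half the batch is freshly simulated; the other half is sampled
    from previously generated examples. Fresh examples are then stored
    in the buffer for future reuse."""
    fresh = [simulate() for _ in range(batch_size // 2)]
    replayed = rng.sample(replay_buffer, batch_size - len(fresh))
    replay_buffer.extend(fresh)
    return fresh + replayed
```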
- a curriculum learning strategy is applied to the RESUME dataset: the number of turns is gradually increased from 2 to 4 during training
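A minimal curriculum schedule consistent with that description; the stage length is an assumed knob, not something the notes specify.

```python
# Hypothetical curriculum schedule: grow the number of QA turns from 2 to 4.

def turns_for_epoch(epoch, start=2, end=4, epochs_per_stage=1):
    """Number of QA turns to train with at a given epoch, capped at `end`."""
    return min(end, start + epoch // epochs_per_stage)
```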
- Experimental Results (SOTA results)
- Metrics: micro-F1 scores, precision and recall
- Results on the newly constructed RESUME dataset
- First establish baselines: tagging+relation, tagging+dependency
- Entities are extracted with BERT tagging models; relations with a CNN applied to the representations output by BERT transformers
- This task is akin to a dependency parsing task at the tag level rather than the word level
- Concretely: a BERT tagging model assigns tagging labels to each word, then the SOTA dependency parsing model Biaffine is adapted to construct dependencies between tags (jointly trained)
- Results on the commonly used ACE04, ACE05 and CoNLL04 datasets
- Ablation Studies
- Effect of Question Generation Strategy: natural language questions outperform pseudo-questions, since they provide more fine-grained semantic information
- Effect of Joint Training: λ was swept over 10 values at intervals of 0.1; entity extraction is not best at λ = 0, which shows that the relation-extraction part can improve entity extraction
- Case Study: compared with the SOTA MRT model, this approach can identify entities that are far apart, and can also handle sentences containing two pairs with the same relation
- Future directions
- could easily integrate reinforcement learning (just as in multi-turn dialogue systems)