[Paper Notes] A Symmetric Local Search Network for Emotion-Cause Pair Extraction


Abstract

Problem addressed: Emotion-cause pair extraction (ECPE)

Existing approach: a two-step method (Xia & Ding, 2019)

Limitation of Xia & Ding (2019): the correlation between emotion clauses and cause clauses is not considered

To tackle this task, a two-step method was proposed by a previous study, which first extracted emotion clauses and cause clauses individually, then paired the emotion and cause clauses, and filtered out the pairs without causality.

Core of the authors' work: local search

Symmetric Local Search Network (SLSN): perform the detection and matching simultaneously by local search

The two symmetric subnetworks of SLSN:

  • the emotion subnetwork
  • the cause subnetwork

Components of each subnetwork:

  • a clause representation learner
  • a local pair searcher (LPS) - a specially-designed cross-subnetwork component

SLSN consists of two symmetric subnetworks, namely the emotion subnetwork and the cause subnetwork.

Each subnetwork is composed of a clause representation learner and a local pair searcher. The local pair searcher is a specially-designed cross-subnetwork component which can extract the local emotion-cause pairs.

Introduction

The authors' view: to mimic human behavior, the detection and matching of emotions/causes should be considered simultaneously.

However, when humans deal with the ECPE task, they usually consider the detection and matching problems at the same time.


Advantage of local search: emotion-cause pairs whose clauses are far apart can be ruled out.

The advantage of local search is that the wrong pairs (e.g., (c4, c12)) beyond the local context scope can be avoided. Additionally, when local searching the cause clause corresponding to the target emotion clause, humans not only judge whether the clause is a cause clause, but also consider whether it matches the target emotion clause.

Specifically, the LPS introduces a local context window to limit the scope of context for local search.

Symmetric Local Search Network

Task Definition

Each dataset $D$ contains multiple documents $d$, where

$$d = [c_1, c_2, \cdots, c_n]$$

  • $c^e$ denotes an emotion clause
  • $c^c$ denotes a cause clause

Goal: extract all emotion-cause pairs

$$P = \{\cdots, (c^e, c^c), \cdots\}$$
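As a toy illustration of the task definition (a hypothetical example, not from the paper's corpus):

```python
# Hypothetical ECPE example (not from the paper's corpus).
# A document d is a list of clauses; the target output is the set P
# of (emotion clause, cause clause) index pairs.
document = [
    "I lost my wallet on the subway",  # c1: cause clause   (c^c)
    "so I felt really upset",          # c2: emotion clause (c^e)
    "and I went home early",           # c3: neither
]
expected_pairs = {(2, 1)}  # P = {(c^e, c^c)} = {(c2, c1)}, 1-based indices
```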

An Overview of SLSN


  • input - a sequence of clauses from a document
  • output - the local pair labels for these clauses

For each clause $c_i$, SLSN predicts two kinds of labels:

  • E-LC label $\hat y^{elc}_i$
    • the emotion label (E-label) $\hat y^e_i$ of the i-th clause
    • the local cause labels (LC-label) $(\hat y^c_{i-1}, \hat y^c_i, \hat y^c_{i+1})$ of the clauses near the i-th clause
  • C-LE label $\hat y^{cle}_i$
    • the cause label (C-label) $\hat y^c_i$ of the i-th clause
    • the local emotion labels (LE-label) $(\hat y^e_{i-1}, \hat y^e_i, \hat y^e_{i+1})$ of the clauses near the i-th clause

Let $P_{elc}$ be the E-C pair set derived from $\hat y^{elc}_i$ and $P_{cle}$ the E-C pair set derived from $\hat y^{cle}_i$; the final E-C pair set is $P_{elc} \cup P_{cle}$.
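A minimal sketch of how the two label sets could be decoded into the final pair set, assuming binary labels and a window of size k (the function and label encoding are assumptions, not the paper's code):

```python
def decode_pairs(e_labels, lc_labels, c_labels, le_labels, k=1):
    """e_labels[i]: predicted E-label of clause i (1 = emotion clause);
    lc_labels[i]: the 2k+1 LC-labels for clauses i-k..i+k (valid only
    when e_labels[i] == 1); c_labels/le_labels: symmetric C-net outputs."""
    n = len(e_labels)
    p_elc, p_cle = set(), set()
    for i in range(n):
        if e_labels[i] == 1:                      # emotion clause found by E-net
            for off, y in zip(range(-k, k + 1), lc_labels[i]):
                j = i + off
                if 0 <= j < n and y == 1:         # local cause inside the window
                    p_elc.add((i, j))             # pair is (emotion, cause)
        if c_labels[i] == 1:                      # cause clause found by C-net
            for off, y in zip(range(-k, k + 1), le_labels[i]):
                j = i + off
                if 0 <= j < n and y == 1:         # local emotion inside the window
                    p_cle.add((j, i))
    return p_elc | p_cle                          # final set: P_elc ∪ P_cle
```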

Components of SLSN


The two subnetworks of SLSN:

  • the emotion subnetwork (E-net)
    • for the E-LC label prediction
  • the cause subnetwork (C-net)
    • for the C-LE label prediction

E-net and C-net have similar structures in terms of word embedding, clause encoder, and hidden state learning.

Word Embedding

Input: $d = [c_1, c_2, \cdots, c_n]$ with $c_i = [w^1_i, w^2_i, \cdots, w^{l_i}_i]$ ($c_i$ contains $l_i$ words)

Output: $v_i = [v^1_i, v^2_i, \cdots, v^{l_i}_i]$
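A one-line sketch of this step (vocabulary size and embedding dimension are assumptions):

```python
import torch
import torch.nn as nn

# Word embedding lookup: maps the word ids of clause c_i to vectors v_i.
embedding = nn.Embedding(num_embeddings=10000, embedding_dim=200)
w_i = torch.tensor([[3, 17, 256, 42]])   # word ids of a clause with l_i = 4 words
v_i = embedding(w_i)                     # v_i: (1, 4, 200)
```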

Clause Encoder

Structure: word-level Bi-LSTM & attention

Purpose: learn the representation of clauses

[Taking E-net as an example]

Bi-LSTM input: $v_i = [v^1_i, v^2_i, \cdots, v^{l_i}_i]$

Bi-LSTM output & attention input: $r_i = [r^1_i, r^2_i, \cdots, r^{l_i}_i]$

The attention maps $r_i$ to the clause representation $s^e_i$ by aggregation:

$$
u^j_i = \tanh(W_w r^j_i + b_w) \\
a^j_i = \frac{\exp((u^j_i)^T u_s)}{\sum_t \exp((u^t_i)^T u_s)} \\
s^e_i = \sum_j a^j_i r^j_i
$$
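A minimal PyTorch sketch of this encoder, following the formulas above (module name, dimensions, and initialization are assumptions):

```python
import torch
import torch.nn as nn

class ClauseEncoder(nn.Module):
    """Word-level Bi-LSTM + attention (a sketch; sizes are assumptions)."""
    def __init__(self, emb_dim=200, hidden_dim=100):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        self.W_w = nn.Linear(2 * hidden_dim, 2 * hidden_dim)   # W_w and b_w
        self.u_s = nn.Parameter(torch.randn(2 * hidden_dim))   # context vector u_s

    def forward(self, v_i):                     # v_i: (batch, l_i, emb_dim)
        r_i, _ = self.bilstm(v_i)               # r_i: (batch, l_i, 2*hidden_dim)
        u = torch.tanh(self.W_w(r_i))           # u_i^j = tanh(W_w r_i^j + b_w)
        a = torch.softmax(u @ self.u_s, dim=1)  # a_i^j: attention over words
        s_i = (a.unsqueeze(-1) * r_i).sum(1)    # s_i^e = sum_j a_i^j r_i^j
        return s_i                              # clause representation
```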

Hidden State Learning

A clause-level Bi-LSTM is used.

[Taking E-net as an example]

Input: $[s^e_1, s^e_2, \cdots, s^e_n]$

Output: $[h^e_1, h^e_2, \cdots, h^e_n]$

Similarly, C-net outputs $[h^c_1, h^c_2, \cdots, h^c_n]$.
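A corresponding sketch for this step (sizes are assumptions):

```python
import torch
import torch.nn as nn

# Clause-level Bi-LSTM over the stacked clause vectors [s_1^e, ..., s_n^e];
# C-net runs the same structure on its own clause representations.
clause_bilstm = nn.LSTM(input_size=200, hidden_size=100,
                        batch_first=True, bidirectional=True)
s = torch.randn(1, 8, 200)   # a document with n = 8 clauses
h, _ = clause_bilstm(s)      # h: (1, 8, 200), i.e. [h_1^e, ..., h_n^e]
```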

Local Pair Searcher

A symmetric structure is built in E-net and C-net to predict the local pair labels of each clause.

[Taking E-net as an example]

When predicting the E-label, E-net uses only the clause's emotion hidden state, followed by a softmax:

$$\hat y^e_i = \mathrm{softmax}(W_e h^e_i + b_e)$$

When predicting the LC-label in E-net: if the current clause is predicted not to be an emotion clause, the corresponding LC-label is a zero vector; otherwise, LC-labels are predicted for all clauses within the local context window.

The LPS first computes the emotion attention ratio $\lambda_j$ of each clause:

$$
\gamma(h^e_i, h^c_j) = h^e_i h^c_j \\
\lambda_j = \frac{\exp(\gamma(h^e_i, h^c_j))}{\sum^{i+k}_{j=i-k} \exp(\gamma(h^e_i, h^c_j))}
$$

Here, $\gamma(h^e_i, h^c_j)$ is the emotion attention function measuring the correlation between a local cause and the target emotion, obtained by multiplying the emotion hidden state with the cause hidden state; a softmax over the window yields the emotion attention ratio $\lambda_j$, which is used to scale the original hidden states:

$$q^{lc}_j = \lambda_j \cdot h^c_j$$

where $q^{lc}_j$ is the scaled cause hidden state of the j-th local context clause.

The authors also use a local Bi-LSTM layer to learn the contextualized representation of each local context clause:

$$
\overrightarrow{o_j} = \overrightarrow{\mathrm{LSTM}_{lc}}(q^{lc}_j), \quad j \in [i-k, i+k] \\
\overleftarrow{o_j} = \overleftarrow{\mathrm{LSTM}_{lc}}(q^{lc}_j), \quad j \in [i-k, i+k]
$$

Finally, $\overrightarrow{o_j}$ and $\overleftarrow{o_j}$ are concatenated into $o_j$, from which the LC-label $\hat y^{lc}_j$ of the local context clause at position j is predicted:

$$\hat y^{lc}_j = \mathrm{softmax}(W_{lc}\, o_j + b_{lc})$$
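Putting the LPS equations together, a minimal PyTorch sketch of the E-net side (class name, dimensions, and window size are assumptions; the C-net side is symmetric):

```python
import torch
import torch.nn as nn

class LocalPairSearcher(nn.Module):
    """E-net side LPS (a sketch; sizes and window size are assumptions)."""
    def __init__(self, hidden=200, k=1):
        super().__init__()
        self.k = k
        self.local_bilstm = nn.LSTM(hidden, hidden // 2, batch_first=True,
                                    bidirectional=True)
        self.W_lc = nn.Linear(hidden, 2)          # softmax classifier for LC-labels

    def forward(self, h_e_i, h_c_window):
        # h_e_i: (hidden,) emotion hidden state of the target clause i;
        # h_c_window: (2k+1, hidden) cause hidden states of clauses i-k..i+k.
        # (If clause i is predicted non-emotion, the LC-label is a zero vector
        # and this routine is skipped.)
        gamma = h_c_window @ h_e_i                # gamma(h_i^e, h_j^c) = h_i^e h_j^c
        lam = torch.softmax(gamma, dim=0)         # emotion attention ratios lambda_j
        q = lam.unsqueeze(-1) * h_c_window        # q_j^lc = lambda_j * h_j^c
        o, _ = self.local_bilstm(q.unsqueeze(0))  # contextualized o_j (fwd ++ bwd)
        y_lc = torch.softmax(self.W_lc(o.squeeze(0)), dim=-1)
        return y_lc                               # (2k+1, 2): \hat y_j^lc
```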

Model Training

E-net predicts the E-LC label and C-net predicts the C-LE label, so the loss function of SLSN is a weighted average of the two:

$$
L = \alpha L^{elc} + (1 - \alpha) L^{cle} \\
L^{elc} = \beta L^e + (1 - \beta) L^{lc} \\
L^{cle} = \beta L^c + (1 - \beta) L^{le}
$$

$L^e$, $L^{lc}$, $L^c$, and $L^{le}$ are the cross-entropy losses for predicting the E-label $\hat y^e_i$, LC-label $\hat y^{lc}_i$, C-label $\hat y^c_i$, and LE-label $\hat y^{le}_i$, respectively:

$$
L^e = -\frac{1}{n} \sum^n_{i=1} \eta\, y^e_i \log(\hat y^e_i) \\
L^{lc} = -\frac{1}{p^e(2k+1)} \sum^n_{i=1} I(\hat y^e_i = 1) \sum^{i+k}_{j=i-k} y^{lc}_j \log(\hat y^{lc}_j) \\
L^c = -\frac{1}{n} \sum^n_{i=1} \eta\, y^c_i \log(\hat y^c_i) \\
L^{le} = -\frac{1}{p^c(2k+1)} \sum^n_{i=1} I(\hat y^c_i = 1) \sum^{i+k}_{j=i-k} y^{le}_j \log(\hat y^{le}_j)
$$

where $I(\cdot)$ is an indicator function.
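A sketch of assembling the overall loss from the four terms (α and β are the paper's weighting hyperparameters; the default values and the use of η as a class weight are assumptions):

```python
import torch.nn.functional as F

def slsn_loss(logits_e, y_e, logits_c, y_c, L_lc, L_le,
              alpha=0.5, beta=0.5, eta=None):
    """L_lc / L_le are assumed precomputed window-level cross-entropy terms,
    accumulated only over clauses predicted as emotion / cause (the I(.))."""
    L_e = F.cross_entropy(logits_e, y_e, weight=eta)  # eta: class-weight tensor
    L_c = F.cross_entropy(logits_c, y_c, weight=eta)
    L_elc = beta * L_e + (1 - beta) * L_lc            # E-net loss
    L_cle = beta * L_c + (1 - beta) * L_le            # C-net loss
    return alpha * L_elc + (1 - alpha) * L_cle        # weighted average
```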

References

Rui Xia and Zixiang Ding. 2019. Emotion-cause pair extraction: A new task to emotion analysis in texts. In Proceedings of the 57th Conference of the Association for Computational Linguistics, pages 1003–1012.
