[Text Matching] The RE2 Paper Explained

尽量不躺平的kayla

Categories: Text Matching, NLP, Python. Tags: python, natural language processing, deep learning

Article link: https://blog.csdn.net/skying159/article/details/124333571

RE2 - Simple and Effective Text Matching with Richer Alignment Features

This paper comes from Alibaba and was published at ACL 2019. "Simple and Effective Text Matching with Richer Alignment Features": https://arxiv.org/abs/1908.00300

Intro

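Many deep text-matching networks have only a single alignment layer. To compensate, such models end up relying on extra semantic information, hand-crafted features, complex alignment mechanisms, or post-processing.

The key idea of this paper is to use multiple alignment processes, and to keep three kinds of features available to each of them: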

R - Residual vectors: previous aligned features

E - Embedding vectors: original point-wise features

E - Encoded vectors: contextual features

Hence the name RE2.

What does each of these mean concretely? Let's read on.

Model

[Figure 1: overview of the RE2 model architecture]

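In the figure, white squares denote the embedding vectors, squares with diagonal stripes denote the augmented residual connections, and black squares denote the context vectors produced by the encoder. These three vectors are concatenated and fed into the alignment layer; the input and output of the alignment layer are then concatenated and passed to the fusion layer. One block consists of three layers (encoding, alignment, and fusion). The block is repeated N times, and each block has its own independent parameters. The output of the fusion layer goes through a pooling layer, which yields the final fixed-length vector. The fixed-length vectors from the left and right sides are used to make the prediction, and the loss is cross-entropy.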
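To make the data flow inside one block easier to follow, here is a minimal pure-Python sketch of a single block followed by pooling. It is an illustration, not the paper's implementation: the neighbour-averaging `encode`, the dot-product attention in `attend`, the concatenation-only `fuse`, and the choice of max pooling are all simplifying assumptions, and every function name is hypothetical. Learned parameters are left out entirely.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def encode(seq):
    """Stand-in encoder (assumption): average each token with its neighbours."""
    out = []
    for i in range(len(seq)):
        window = seq[max(0, i - 1): i + 2]
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out

def attend(queries, keys):
    """For every query token, return an attention-weighted sum of the key tokens."""
    aligned = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        aligned.append([sum(w * k[d] for w, k in zip(weights, keys))
                        for d in range(len(keys[0]))])
    return aligned

def fuse(layer_in, layer_out):
    """Stand-in fusion (assumption): concatenate alignment input and output."""
    return [x + y for x, y in zip(layer_in, layer_out)]

def block(a, b):
    """One block: encode, align the two sequences against each other, fuse."""
    # Alignment input = block input concatenated with the encoder output.
    a_in = [x + h for x, h in zip(a, encode(a))]
    b_in = [x + h for x, h in zip(b, encode(b))]
    a_aligned = attend(a_in, b_in)
    b_aligned = attend(b_in, a_in)
    return fuse(a_in, a_aligned), fuse(b_in, b_aligned)

def max_pool(seq):
    """Collapse a variable-length sequence into one fixed-length vector."""
    return [max(col) for col in zip(*seq)]

# Two toy "sentences" of different lengths, token dimension 2.
a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [[0.5, 0.5], [1.0, 0.0]]
a_out, b_out = block(a, b)
v_a, v_b = max_pool(a_out), max_pool(b_out)
print(len(v_a), len(v_b))
```

Although the two inputs have different lengths, the pooled vectors `v_a` and `v_b` have the same size, which is what lets a prediction layer consume them together.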

Augmented Residual Connections

To give the alignment layer (an attention layer) richer features, RE2 uses augmented residual connections to link consecutive blocks.
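The wiring between blocks can be sketched in a few lines of Python. This assumes the paper's formulation, in which the input to block n (n ≥ 2) concatenates the first block's input with the sum of the outputs of the two preceding blocks, and the "output" before the first block is taken to be all zeros. The helper names and the `toy_block` stand-in are hypothetical and exist only for illustration.

```python
def vec_add(u, v):
    return [x + y for x, y in zip(u, v)]

def augmented_input(first_input, prev_out, prev_prev_out):
    """Per token: [x^(1) ; o^(n-1) + o^(n-2)], where ';' is concatenation."""
    return [x + vec_add(o1, o2)
            for x, o1, o2 in zip(first_input, prev_out, prev_prev_out)]

def run_blocks(embeddings, blocks, hidden):
    """Chain blocks with augmented residual connections.

    Returns the list of block inputs and the list of block outputs.
    """
    zeros = [[0.0] * hidden for _ in embeddings]  # assumed o^(0)
    prev_prev, prev = zeros, zeros
    inputs, outputs = [], []
    for n, blk in enumerate(blocks, start=1):
        # The first block sees the raw embeddings; later blocks see the
        # embeddings plus the summed outputs of the two preceding blocks.
        x = embeddings if n == 1 else augmented_input(embeddings, prev, prev_prev)
        out = blk(x)
        inputs.append(x)
        outputs.append(out)
        prev_prev, prev = prev, out
    return inputs, outputs

HIDDEN = 2

def toy_block(seq):
    """Hypothetical stand-in for a real block: maps each token to HIDDEN copies of its sum."""
    return [[sum(tok)] * HIDDEN for tok in seq]

embeddings = [[1.0, 2.0], [3.0, 4.0]]
inputs, outputs = run_blocks(embeddings, [toy_block] * 3, HIDDEN)
for n, x in enumerate(inputs, start=1):
    print(n, x)
```

Note that the raw embeddings reappear unchanged at the front of every block input, so they stay directly available to every alignment layer however deep the stack is.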

The input of the $n$-th block $x^{(n)}$ ($n \geq 2$) is the concatenation of the input of the first block $x^{(1)}$ and the summation of the output of previous two blocks (denoted by rectangles with diagonal stripes in Figure 1):