Paper: https://www.aclweb.org/anthology/P15-1061 (mirror: https://pan.baidu.com/s/1qFGhrMIO31s0pVvv0eJkMQ)
Institution: IBM
Model: CR-CNN
Dataset: SemEval-2010 Task 8
Result: F1 = 84.1%
Note: uses only pretrained word embeddings as input features
Input: a sentence with two marked entities, e.g. The [car] left the [plant]
Output: a vector with one score per relation class
Process:
- Word Embeddings
The standard pipeline for obtaining word vectors:
The pretrained embedding matrix is $W^{wrd} \in \mathbb{R}^{d_w \times |V|}$, where $d_w$ is the dimension of the word vectors and $|V|$ is the vocabulary size.
Column $i$, i.e. $W^{wrd}_i \in \mathbb{R}^{d_w}$, is the vector of the $i$-th word. The vector $r_w$ of a word $w$ is obtained as $r_w = W^{wrd} v^w$, where $v^w$ is a $|V|$-dimensional one-hot vector with a 1 at position $w$ and 0 elsewhere.
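The lookup above is just column selection: multiplying by a one-hot vector picks out one column of the matrix. A minimal sketch, with made-up sizes ($d_w = 4$, $|V| = 6$) that are not from the paper:

```python
import numpy as np

# Hypothetical sizes, for illustration only.
d_w, V = 4, 6
rng = np.random.default_rng(0)
W_wrd = rng.standard_normal((d_w, V))  # pretrained embedding matrix W^{wrd}

w = 2                  # vocabulary index of the word
v_w = np.zeros(V)      # one-hot vector v^w
v_w[w] = 1.0

r_w = W_wrd @ v_w      # r_w = W^{wrd} v^w

# The matrix-vector product selects column w of W^{wrd}:
assert np.allclose(r_w, W_wrd[:, w])
```

In practice libraries skip the one-hot product entirely and index the column directly, which is equivalent.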
- Word Position Embeddings
Obtained the same way as the position features (PF) in the (COLING 2014) paper Relation Classification via Convolutional Deep Neural Network.
$wpe^w = [wp_1, wp_2]$, where $wp_1$ and $wp_2$ are $d_{wpe}$-dimensional vectors encoding the relative distances from word $w$ to the two target entities.
The sentence $x$ is thus turned into the vector representation $emb_x = \{[r^{w_1}, wpe^{w_1}], [r^{w_2}, wpe^{w_2}], \ldots, [r^{w_N}, wpe^{w_N}]\}$.
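The per-token vectors can be built by concatenating the word vector with the two position embeddings. A sketch with hypothetical sizes; the distance-clipping range (`max_dist`) and the index-shift scheme are my assumptions, not values from the paper:

```python
import numpy as np

# Hypothetical sizes (not from the paper): d_w = 4, d_wpe = 2, |V| = 6.
d_w, d_wpe, V = 4, 2, 6
max_dist = 10                    # assumed clip range for relative distances
rng = np.random.default_rng(1)
W_wrd = rng.standard_normal((d_w, V))
W_wpe = rng.standard_normal((d_wpe, 2 * max_dist + 1))  # one column per distance

tokens = [0, 3, 1, 5, 2]         # word indices of "The car left the plant"
e1, e2 = 1, 4                    # token positions of the two marked entities

def wpe(i, e):
    """Position embedding for the relative distance of token i to entity e."""
    d = int(np.clip(i - e, -max_dist, max_dist))
    return W_wpe[:, d + max_dist]  # shift so distances index columns 0..2*max_dist

# emb_x: one (d_w + 2*d_wpe)-dim row per token: [r^{w_i}, wp_1, wp_2]
emb_x = np.stack([
    np.concatenate([W_wrd[:, w], wpe(i, e1), wpe(i, e2)])
    for i, w in enumerate(tokens)
])
print(emb_x.shape)  # (5, 8)
```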
(Note: in the paper's example figure each word has dimension $d_w$; position embeddings are not included.)
- Sentence Representation
A CNN is used to extract the feature vector $r_x$ of sentence $x$.
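The convolutional layer can be sketched as a sliding window over the token vectors followed by a max over positions. Window size, filter count, and the tanh nonlinearity below are illustrative choices, not necessarily the paper's exact configuration:

```python
import numpy as np

# Illustrative sizes: N tokens, d_in input dim, d_c filters, window k.
rng = np.random.default_rng(2)
N, d_in, d_c, k = 5, 8, 10, 3
emb_x = rng.standard_normal((N, d_in))

W1 = rng.standard_normal((d_c, k * d_in))  # convolution filters
b1 = np.zeros(d_c)

# Zero-pad so every token has a full window, slide the window, then max-pool.
pad = (k - 1) // 2
padded = np.vstack([np.zeros((pad, d_in)), emb_x, np.zeros((pad, d_in))])
windows = np.stack([padded[i:i + k].ravel() for i in range(N)])  # (N, k*d_in)
conv = np.tanh(windows @ W1.T + b1)                              # (N, d_c)
r_x = conv.max(axis=0)   # element-wise max over positions -> fixed-size r_x
print(r_x.shape)  # (10,)
```

The max-pooling step is what makes $r_x$ independent of the sentence length $N$.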
- Class Embeddings and Scoring [the paper's main novelty]
The common approach feeds the feature vector into a softmax classifier to obtain the final prediction.
This paper's novelty: the model learns a vector representation for each relation class.
$W^{classes}$ is the embedding matrix of the relation classes; each column is the vector representation of one relation.
The vector for relation $c$ is $[W^{classes}]_c$, which has the same dimension as the sentence feature vector $r_x$.
The dot product $r_x^T [W^{classes}]_c$ then yields a scalar.
So the model assigns a score to each pair of sentence $x$ and relation $c$: $s_\theta(x)_c = r_x^T [W^{classes}]_c$
Here $\theta$ denotes all model parameters. During training, each sentence $x$ is paired with one positive class $y^+$ and one negative class $c^-$.
$y^+$ is the true relation class of the sentence; $c^-$ is one of the other relation classes.
The positive score is $s_\theta(x)_{y^+}$ and the negative score is $s_\theta(x)_{c^-}$.
The loss function is $L = \log(1 + \exp(\gamma(m^+ - s_\theta(x)_{y^+}))) + \log(1 + \exp(\gamma(m^- + s_\theta(x)_{c^-})))$
$m^+$ and $m^-$ are margin values and $\gamma$ is a scaling factor.
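The scoring and ranking loss can be sketched as follows. The hyperparameters $m^+ = 2.5$, $m^- = 0.5$, $\gamma = 2$ are the values reported in the paper; the sizes and the choice of the highest-scoring wrong class as $c^-$ follow the paper's training procedure, while the random vectors are placeholders:

```python
import numpy as np

# Placeholder feature vector and class embedding matrix (19 SemEval classes).
rng = np.random.default_rng(3)
d_c, n_classes = 10, 19
r_x = rng.standard_normal(d_c)
W_classes = rng.standard_normal((d_c, n_classes))

scores = r_x @ W_classes                 # s_theta(x)_c for every class c

y_pos = 4                                # true class y+
others = np.delete(np.arange(n_classes), y_pos)
c_neg = others[np.argmax(scores[others])]  # highest-scoring wrong class as c-

m_pos, m_neg, gamma = 2.5, 0.5, 2.0      # paper's hyperparameters
L = (np.log1p(np.exp(gamma * (m_pos - scores[y_pos])))
     + np.log1p(np.exp(gamma * (m_neg + scores[c_neg]))))
print(float(L))
```

Both terms are of the form $\log(1 + e^z)$, so the loss is always positive and shrinks as the positive score rises above $m^+$ and the negative score falls below $-m^-$.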
Training gradually pushes $s_\theta(x)_{y^+}$ above $m^+$ and $s_\theta(x)_{c^-}$ below $-m^-$.
Quotes from the paper:
- The proposed network learns a distributed vector representation for each relation class.
- Given an input text segment, the network uses a convolutional layer to produce a distributed vector representation of the text and compares it to the class representations in order to produce a score for each class.
- We propose a new pairwise ranking loss function that makes it easy to reduce the impact of artificial classes.
BibTeX:
@inproceedings{dos-santos-etal-2015-classifying,
title = "Classifying Relations by Ranking with Convolutional Neural Networks",
author = "dos Santos, C{\'\i}cero and
Xiang, Bing and
Zhou, Bowen",
booktitle = "Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
month = jul,
year = "2015",
address = "Beijing, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/P15-1061",
doi = "10.3115/v1/P15-1061",
pages = "626--634",
}