Recurrent Convolutional Neural Networks for Text Classification

you_jinpeng

Article link: https://blog.csdn.net/you_jinpeng/article/details/102853089

1.Abstract

Traditional approach: Traditional text classifiers often rely on many human-designed features, such as dictionaries, knowledge bases and special tree kernels.

Proposed:
a recurrent structure -> captures contextual information
a max-pooling layer -> captures the key components in texts

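Results:
The paper reports that the proposed method outperforms state-of-the-art methods on several datasets, particularly on document-level datasets.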

2.Introduction

Traditional methods

feature representation:
bag-of-words: where unigrams, bigrams, n-grams or some exquisitely designed patterns are typically extracted as features.

several feature selection methods:
frequency, MI, pLSA, LDA

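Drawbacks: traditional feature representations often ignore contextual information, word order, and semantics.
High-order n-grams and tree kernels have also been used as feature representations, but they suffer from data sparsity, which hurts accuracy.
Word embeddings: word2vec can capture more syntactic and semantic features.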
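
The word-order drawback can be shown with a small sketch. The helper `ngram_counts` and the two toy sentences are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

a = "dog bites man".split()
b = "man bites dog".split()

# Unigram bag-of-words: both sentences get identical counts, so word order is lost.
print(ngram_counts(a, 1) == ngram_counts(b, 1))  # True

# Bigrams recover some order, but the number of distinct features grows quickly,
# which is where the sparsity problem comes from.
print(ngram_counts(a, 2) == ngram_counts(b, 2))  # False
```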

Improvements

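Recursive Neural Network (RecursiveNN)
Advantage: effective at constructing sentence representations.
Drawback: performance depends heavily on how the textual tree is built, and building the tree takes at least O(n^2) time. The relationship between two sentences also can hardly be represented by a single tree, so the model is unsuitable for long sentences or documents.

Recurrent Neural Network (RecurrentNN)
Advantage: captures contextual information, with O(n) time complexity.
Drawback: it is a biased model in which later words are more dominant than earlier ones. This is undesirable because key words can appear anywhere in the text.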

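Convolutional Neural Network (CNN)
Advantages: ① Time complexity: O(n)
② An unbiased model that can pick out the most important features through max-pooling.
Drawback: the convolution window size is fixed. If it is too small, information is lost; if it is too large, the parameter space becomes enormous.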

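Proposed:

Recurrent Convolutional Neural Network (RCNN)
Recurrent structure -> captures contextual information
Max-pooling layer -> extracts the most salient features, i.e. which word plays the key role for which feature
(the paper's wording: which words play key roles)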

3.Model

[Figure: model structure]

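1.Building the concatenated word representation

For each word $w_i$, $c_l(w_i)$ is the vector of its left context and $c_r(w_i)$ is the vector of its right context. Both are computed with Equations (1) and (2):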
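$$c_l(w_i) = f\left(W^{(l)} c_l(w_{i-1}) + W^{(sl)} e(w_{i-1})\right) \tag{1}$$
$$c_r(w_i) = f\left(W^{(r)} c_r(w_{i+1}) + W^{(sr)} e(w_{i+1})\right) \tag{2}$$
Here $e(w_i)$ is the word embedding of $w_i$ and $f$ is a non-linear activation function.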
(For every input document, the left context of the first word uses the same shared parameter $c_l(w_1)$. In the paper's words: The left-side context for the first word in any document uses the same shared parameters $c_l(w_1)$.)
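Then, for each word, the left context, the word embedding, and the right context are concatenated with Equation (3), $x_i = [c_l(w_i); e(w_i); c_r(w_i)]$:
![Equation (3)](https://img-blog.csdnimg.cn/20191101111605657.png)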
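
A minimal pure-Python sketch of this step follows, under stated assumptions: `tanh` stands in for the activation $f$, the weights are random, and the shared boundary contexts $c_l(w_1)$ and $c_r(w_n)$ are set to zero, whereas in the paper they are trained parameters. The helper names (`matvec`, `step`, `word_representations`, `rand_matrix`) and the dimensions are hypothetical. The sketch assumes a non-empty document.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def step(W_ctx, W_emb, ctx, emb, f=math.tanh):
    """One recurrence step: f(W_ctx @ ctx + W_emb @ emb). tanh is an assumed choice of f."""
    return [f(a + b) for a, b in zip(matvec(W_ctx, ctx), matvec(W_emb, emb))]

def word_representations(E, W_l, W_sl, W_r, W_sr, c_l1, c_rn):
    """Return x_i = [c_l(w_i); e(w_i); c_r(w_i)] for every word of one document."""
    n = len(E)
    c_l = [c_l1]                       # shared left context of the first word
    for i in range(1, n):              # left-to-right scan, Equation (1)
        c_l.append(step(W_l, W_sl, c_l[i - 1], E[i - 1]))
    c_r = [None] * n
    c_r[n - 1] = c_rn                  # shared right context of the last word
    for i in range(n - 2, -1, -1):     # right-to-left scan, Equation (2)
        c_r[i] = step(W_r, W_sr, c_r[i + 1], E[i + 1])
    # List concatenation plays the role of Equation (3).
    return [c_l[i] + E[i] + c_r[i] for i in range(n)]

def rand_matrix(rows, cols, rng):
    """Small random matrix, used here only as placeholder weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_words, emb_dim, ctx_dim = 5, 4, 3    # illustrative sizes
E = rand_matrix(n_words, emb_dim, rng)             # one embedding per word
W_l, W_r = rand_matrix(ctx_dim, ctx_dim, rng), rand_matrix(ctx_dim, ctx_dim, rng)
W_sl, W_sr = rand_matrix(ctx_dim, emb_dim, rng), rand_matrix(ctx_dim, emb_dim, rng)
c_l1, c_rn = [0.0] * ctx_dim, [0.0] * ctx_dim      # assumed zero; trained in the paper

x = word_representations(E, W_l, W_sl, W_r, W_sr, c_l1, c_rn)
print(len(x), len(x[0]))  # 5 10: one vector per word, of size emb_dim + 2 * ctx_dim
```

Each word needs one step per direction, so the cost is linear in the document length.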

2.Compressing the concatenated vector

Each $x_i$ obtained from Equation (3) is compressed with the following formula, giving the part marked 2 in the model figure:
$$y_i^{(2)} = \tanh\left(W^{(2)} x_i + b^{(2)}\right) \tag{4}$$
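
As a sketch of the compression step, assuming it is an affine map followed by `tanh` as in the paper's Equation (4). The function name `project` and the toy weights are hypothetical.

```python
import math

def project(x_i, W2, b2):
    """Compress one concatenated vector x_i into a latent vector: tanh(W2 @ x_i + b2)."""
    return [math.tanh(sum(w * x for w, x in zip(row, x_i)) + b)
            for row, b in zip(W2, b2)]

x_i = [1.0, -1.0]                 # toy concatenated representation
W2 = [[0.5, 0.5], [1.0, 0.0]]     # toy weights, one row per latent feature
b2 = [0.0, 0.0]

y2_i = project(x_i, W2, b2)
print([round(v, 4) for v in y2_i])  # [0.0, 0.7616]
```

The output has one entry per row of `W2`, so the number of rows sets the size of the latent feature vector.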

3.Max-pooling layer

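Stack the $y_i^{(2)}$ vectors as rows, one per word, and take the maximum of each column; each column corresponds to one feature. These maxima form $y^{(3)}$.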

Why not average pooling?
Because we want to find which word in the text best represents each feature, rather than an averaged feature value. From the paper: We do not use average pooling here because only a few words and their combination are useful for capturing the meaning of the document. The max-pooling layer attempts to find the most important latent semantic factors in the document.

Max-pooling formula, Equation (5):
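$$y^{(3)} = \max_{i=1}^{n} y_i^{(2)} \tag{5}$$
The max is taken element-wise: the $k$-th element of $y^{(3)}$ is the maximum of the $k$-th elements of all $y_i^{(2)}$.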
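
A small sketch of element-wise max-pooling over the per-word vectors, with average pooling defined alongside for contrast. The matrix `Y` and the helper names are made up for illustration.

```python
def max_pool(Y):
    """Element-wise max over words: one value per feature (column)."""
    return [max(col) for col in zip(*Y)]

def avg_pool(Y):
    """Element-wise mean over words, shown only for comparison."""
    return [sum(col) / len(col) for col in zip(*Y)]

def key_words(Y):
    """For each feature, the index of the word that attains the maximum."""
    return [max(range(len(col)), key=lambda i: col[i]) for col in zip(*Y)]

# Rows are words, columns are latent features (toy values).
Y = [
    [0.1, 0.9, -0.2],
    [0.8, 0.0, -0.1],
    [0.2, 0.1, 0.7],
]

print(max_pool(Y))   # [0.8, 0.9, 0.7]
print(key_words(Y))  # [1, 0, 2]
```

The output length equals the number of features, not the number of words, so documents of any length map to a fixed-size vector. `key_words` makes the "key role" reading concrete: a different word is responsible for each feature here.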

4.Feature weighting and classification

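Feature weighting:
$$y^{(4)} = W^{(4)} y^{(3)} + b^{(4)} \tag{6}$$
Softmax classification:
$$p_i = \frac{\exp\left(y_i^{(4)}\right)}{\sum_{k} \exp\left(y_k^{(4)}\right)} \tag{7}$$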
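
The output layer can be sketched as a linear map followed by softmax. The weights below are toy values, the helper names are hypothetical, and subtracting the maximum before exponentiating is a standard numerical-stability step rather than something taken from the paper.

```python
import math

def linear(y3, W4, b4):
    """Feature weighting: one score per class, W4 @ y3 + b4."""
    return [sum(w * v for w, v in zip(row, y3)) + b for row, b in zip(W4, b4)]

def softmax(scores):
    """Turn class scores into probabilities; shifting by the max avoids overflow."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

y3 = [0.8, 0.9, 0.7]                        # pooled document vector (toy values)
W4 = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]     # two classes, three features
b4 = [0.0, 0.0]

p = softmax(linear(y3, W4, b4))
print(max(range(len(p)), key=lambda i: p[i]))  # 1: the class with the higher score
```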

5.Training

All parameters to be trained:
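$$\theta = \{E, b^{(2)}, b^{(4)}, c_l(w_1), c_r(w_n), W^{(2)}, W^{(4)}, W^{(l)}, W^{(r)}, W^{(sl)}, W^{(sr)}\}$$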
Here $E$ is the original embedding. (Word vectors are pre-trained with skip-gram before this model is trained, which is where $E$ comes from.)

Training objective: maximize the following expression
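$$\theta \mapsto \sum_{D \in \mathbb{D}} \log p(\mathrm{class}_D \mid D, \theta)$$
where $\mathbb{D}$ is the set of training documents and $\mathrm{class}_D$ is the correct class of document $D$.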
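
A sketch of how such an objective is evaluated, assuming it is the sum of log-probabilities assigned to the correct class of each training document. The function name and the toy probabilities are illustrative.

```python
import math

def log_likelihood(probs, labels):
    """Sum over documents of log p(correct class | document)."""
    return sum(math.log(p[y]) for p, y in zip(probs, labels))

# Softmax outputs for two toy documents, and their correct class indices.
probs = [[0.7, 0.3], [0.2, 0.8]]
labels = [0, 1]

ll = log_likelihood(probs, labels)
print(ll < 0)  # True: the objective is a sum of logs of values below 1
```

The value moves toward 0 as the model puts more probability on the correct classes, so maximizing it is the same as minimizing the usual cross-entropy loss.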

The paper was published in 2015 (Lai et al., AAAI 2015).
