MindSpore 25-Day Learning Check-in Camp: Day 19

Latest recommended article published on 2024-07-29 15:54:17

NewtoPhone


Article link: https://blog.csdn.net/NewtoPhone/article/details/140579663


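Today's topic is sentiment classification with an RNN.

Overview: sentiment classification is a classic task in natural language processing and a typical classification problem.

Explanation: sentiment classification identifies the sentiment polarity of subjective text, generally as one of three classes: positive, neutral, or negative. It has important applications in areas such as analyzing consumer behavior and monitoring public opinion during crises. The IMDB samples used here carry only two labels, Positive and Negative.

Before starting, we need to prepare the data. Each sample is a movie review paired with a label: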

Review	Label
"Quitting" may be as much about exiting a pre-ordained identity as about drug withdrawal. As a rural guy coming to Beijing, class and success must have struck this young artist face on as an appeal to separate from his roots and far surpass his peasant parents' acting success. Troubles arise, however, when the new man is too new, when it demands too big a departure from family, history, nature, and personal identity. The ensuing splits, and confusion between the imaginary and the real and the dissonance between the ordinary and the heroic are the stuff of a gut check on the one hand or a complete escape from self on the other.	Negative
This movie is amazing because the fact that the real people portray themselves and their real life experience and do such a good job it's like they're almost living the past over again. Jia Hongsheng plays himself an actor who quit everything except music and drugs struggling with depression and searching for the meaning of life while being angry at everyone especially the people who care for him most.	Positive

Designing the data download module
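The post does not show the download code itself, so the following is only a minimal sketch of what such a module might look like. The function name `download`, the `cache_dir` parameter, and the `.part` temporary suffix are all assumptions made for illustration, and the dataset URL is deliberately left as an argument rather than guessed.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url, cache_dir="./data"):
    """Download `url` into `cache_dir` unless a copy already exists.

    Hypothetical helper for illustration; returns the local file path.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    # Use the last path component of the URL as the local file name.
    target = cache / url.rstrip("/").split("/")[-1]
    if target.exists():
        return str(target)  # reuse the cached copy
    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file under the final name.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return str(target)
```

The returned path could then be passed as the `path` argument of the loader class shown in the next section.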

Loading the IMDB dataset

The code is as follows:

import re
import six
import string
import tarfile

class IMDBData():
    """IMDB dataset loader.

    Loads the IMDB dataset and processes it into a Python iterable object.

    """
    label_map = {
        "pos": 1,
        "neg": 0
    }
    def __init__(self, path, mode="train"):
        self.mode = mode
        self.path = path
        self.docs, self.labels = [], []

        self._load("pos")
        self._load("neg")

    def _load(self, label):
        pattern = re.compile(r"aclImdb/{}/{}/.*\.txt$".format(self.mode, label))
        # Load the data into memory
        with tarfile.open(self.path) as tarf:
            tf = tarf.next()
            while tf is not None:
                if bool(pattern.match(tf.name)):
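                    # Strip trailing newlines, remove punctuation, and lowercase the raw bytes
                    raw = tarf.extractfile(tf).read().rstrip(six.b("\n\r"))
                    raw = raw.translate(None, six.b(string.punctuation)).lower()
                    # Decode before splitting: str() on bytes would yield the "b'...'" repr
                    self.docs.append(raw.decode("utf-8", errors="ignore").split())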
                    self.labels.append([self.label_map[label]])
                tf = tarf.next()

    def __getitem__(self, idx):
        return self.docs[idx], self.labels[idx]

    def __len__(self):
        return len(self.docs)
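To make the preprocessing step inside `_load` easier to follow, here is a small standalone sketch that applies the same idea to a single review held in memory. The helper name `tokenize` and the sample sentence are assumptions for illustration only. Note that in Python 3, calling `str()` on a `bytes` object returns its repr (a string beginning with `b'`), which is why this sketch decodes the bytes instead.

```python
import string


def tokenize(raw: bytes) -> list:
    """Strip trailing newlines, delete ASCII punctuation, lowercase, and split."""
    punctuation = string.punctuation.encode("ascii")
    cleaned = raw.rstrip(b"\n\r").translate(None, punctuation).lower()
    # Decode to text before splitting on whitespace.
    return cleaned.decode("utf-8", errors="ignore").split()


sample = b"This movie is amazing; it's like living the past over again!\n"
tokens = tokenize(sample)
print(tokens)
# ['this', 'movie', 'is', 'amazing', 'its', 'like', 'living', 'the', 'past', 'over', 'again']
```

Removing punctuation this way also merges contractions ("it's" becomes "its"), a simplification that keeps the vocabulary small at the cost of some detail.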

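Loading pre-trained word vectors

Pre-trained word vectors are numeric representations of the input words. The nn.Embedding layer works as a lookup table: each input word is mapped to its index in the vocabulary, and that index selects the corresponding embedding vector.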
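The lookup described above can be sketched in plain Python, without any framework. This is not the nn.Embedding implementation; the tiny vocabulary, the `<unk>` and `<pad>` tokens, the random table values, and the `lookup` helper are all assumptions chosen to illustrate the word-to-index-to-vector path.

```python
import random

random.seed(0)

# Hypothetical toy vocabulary; index 0 is reserved for unknown words.
vocab = ["<unk>", "<pad>", "movie", "amazing", "boring"]
word_to_idx = {word: idx for idx, word in enumerate(vocab)}

# Embedding table: one row (vector) per vocabulary entry.
embedding_dim = 4
table = [[random.uniform(-1, 1) for _ in range(embedding_dim)] for _ in vocab]


def lookup(tokens):
    """Map tokens to vocabulary indices, then to rows of the table."""
    unk = word_to_idx["<unk>"]
    indices = [word_to_idx.get(tok, unk) for tok in tokens]
    return indices, [table[i] for i in indices]


indices, vectors = lookup(["movie", "amazing", "zzz"])
print(indices)  # [2, 3, 0]
```

With pre-trained vectors, the rows of the table would be filled from the pre-trained file instead of random values, while the lookup itself stays the same.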

[Figure: looking up a word's index in the embedding table]

The check-in timestamp is attached at the end of the post.
