Aspect level Sentiment Classification with HEAT ( HiErarchical ATtention )

colorful_-_

Article link: https://blog.csdn.net/weixin_43589681/article/details/104779066

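This post records my understanding of the paper **Aspect level Sentiment Classification with HEAT ( HiErarchical ATtention )**, focusing on its model.
The paper proposes a two-layer attention network that classifies sentiment with respect to a given aspect word. The first attention layer learns aspect information from the sentence; the second layer then uses both the aspect and that extracted aspect information to attend to the aspect-specific sentiment information.

For example, given the aspect word "food", the two-layer attention model first attends to the word "tastes" (the aspect term) based on "food", and then uses both "food" and "tastes" to find the word "great". Working from the aspect terms in this way makes it easier to determine the sentiment polarity toward the given aspect.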

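1 Model

1.1 HEAT network architecture

[Figure: HEAT network architecture]
The network consists of three modules:
Input Model: encodes the sentence and the aspect word as vectors.
Hierarchical Attention Model: uses two attention layers to capture aspect information (the aspect attention layer) and aspect-specific sentiment information (the sentiment attention layer).
Sentiment Classification Model: predicts the sentiment class.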

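1.2 Input Model

A bidirectional GRU learns the vector representation of the sentence: the word embeddings are read by a forward and a backward GRU, and the two hidden states at each position are concatenated to form that word's feature vector.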
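For reference, one common textbook formulation of the GRU and of the bidirectional concatenation is sketched below. The symbol names ($W_z$, $U_z$, $H$, and so on) are assumptions chosen for illustration and may not match the paper's notation; some libraries also swap the roles of $z_t$ and $1 - z_t$ in the last GRU equation.

$$
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) \\
\tilde{h}_t &= \tanh\big(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\big) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
$$

$$
h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}], \qquad H = [h_1, h_2, \dots, h_n]
$$

Here $x_t$ is the embedding of the $t$-th word, $n$ is the sentence length, and $H$ collects the per-word feature vectors that the attention layers consume.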

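1.3 Hierarchical Attention Model

Aspect Attention
Aspect attention locates the likely aspect terms in the sentence. Its input is the output of a BiGRU run over the sentence.
The attention mechanism computes a weight for each word from the given aspect representation and that word's feature vector.

The aspect information of the sentence is then the weighted sum of the word features.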
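The aspect attention step can be sketched in plain Python as follows. This is a minimal illustration, not the paper's exact formulation: it mirrors the scoring used in the code later in this post (a linear layer applied to `tanh` of the concatenated word feature and aspect embedding), and the names `aspect_score`, `aspect_attention`, `w`, and `b` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aspect_score(h_i, v_a, w, b=0.0):
    """Score one word as w . tanh([h_i; v_a]) + b (w, b are hypothetical parameters)."""
    concat = h_i + v_a  # list concatenation of word feature and aspect embedding
    return sum(w_k * math.tanh(x_k) for w_k, x_k in zip(w, concat)) + b

def aspect_attention(H, v_a, w, b=0.0):
    """Return (weights, r_a): one softmax weight per word and the weighted sum of H."""
    weights = softmax([aspect_score(h_i, v_a, w, b) for h_i in H])
    dim = len(H[0])
    r_a = [sum(a * h_i[d] for a, h_i in zip(weights, H)) for d in range(dim)]
    return weights, r_a

# Toy input: three words with 2-dimensional features and a 1-dimensional aspect embedding.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v_a = [0.5]
w = [0.0, 0.0, 0.0]  # all-zero parameters give every word the same score
weights, r_a = aspect_attention(H, v_a, w)
```

With all-zero parameters every word gets the same weight, so `r_a` is simply the mean of the word features; trained parameters would instead concentrate the weight on the aspect terms.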

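Sentiment attention
Sentiment attention extracts the sentiment features of the sentence based on the aspect word and the aspect information. As with aspect attention, its input is the output of a BiGRU.
Because aspect information and sentiment information rely on different features, the two GRUs do not share parameters.
The attention score of each word is then computed from the word's feature vector, the aspect embedding, and the sentence-level aspect information.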
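As a sketch of this step, the code later in this post scores each word by concatenating the three inputs, applying `tanh`, and projecting to a scalar. Written out, with symbol names that are assumptions for illustration ($h_i^{s}$ for the sentiment-BiGRU feature of word $i$, $r_a$ for the aspect information, $v_a$ for the aspect embedding, $w$ and $b$ for the projection parameters):

$$
g_i = w^{\top} \tanh\big([\,h_i^{s};\ r_a;\ v_a\,]\big) + b, \qquad
\alpha_i^{s} = \frac{\exp(g_i)}{\sum_{j=1}^{n} \exp(g_j)}
$$

This follows the implementation shown in this post, which applies no learned projection before the `tanh`; the paper's own formula may differ in that respect.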

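To compute better attention weights, the paper takes the locality of aspect terms into account: sentiment words closer to an aspect term matter more than distant ones. A location mask layer, implemented with a location matrix, focuses the attention on the neighborhood of the aspect terms.
As a result, words closer to the aspect term receive larger weights when the sentiment attention scores are computed.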
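The location mask idea can be sketched as below. The concrete form is an assumption, since the formula did not survive in this post: the location matrix is taken to be `M[i][j] = 1 - |i - j| / n`, the per-word location weight is the aspect-attention-weighted sum of its rows, and the mask is combined with the sentiment scores by multiplying before normalization. All function names are hypothetical.

```python
import math

def location_matrix(n):
    """Assumed location matrix: entries decay linearly with the distance |i - j|."""
    return [[1.0 - abs(i - j) / n for j in range(n)] for i in range(n)]

def location_weights(aspect_weights):
    """Location weight of word j: sum over i of aspect_weights[i] * M[i][j]."""
    n = len(aspect_weights)
    M = location_matrix(n)
    return [sum(aspect_weights[i] * M[i][j] for i in range(n)) for j in range(n)]

def masked_sentiment_weights(scores, aspect_weights):
    """Scale exp(score) by the location weight, then normalize to sum to 1."""
    loc = location_weights(aspect_weights)
    peak = max(scores)
    unnorm = [l * math.exp(s - peak) for l, s in zip(loc, scores)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy case: aspect attention puts all its weight on word 1 of a 4-word sentence,
# and every word has the same raw sentiment score.
aspect_weights = [0.0, 1.0, 0.0, 0.0]
scores = [0.0, 0.0, 0.0, 0.0]
weights = masked_sentiment_weights(scores, aspect_weights)
```

With equal raw scores, the resulting weights depend only on distance from the aspect term: word 1 gets the most weight, its neighbors less, and word 3 the least.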

The aspect-specific sentiment feature of the sentence is the weighted sum of the word features.

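1.4 Sentiment Classification Model

The sentiment feature is concatenated with the aspect embedding and passed through a linear layer that outputs a score for each sentiment class.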
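As a sketch, and assuming the usual softmax output layer, the classifier can be written as follows. The symbols $r_s$ (sentiment feature), $v_a$ (aspect embedding), $W_c$, and $b_c$ are names chosen here for illustration.

$$
\hat{y} = \operatorname{softmax}\big(W_c\,[\,r_s;\ v_a\,] + b_c\big)
$$

The code in this post returns the pre-softmax scores $W_c\,[r_s; v_a] + b_c$ directly, which is the usual arrangement when the softmax is folded into a cross-entropy loss during training.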

2 Core Code

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# WordRep is not defined in this post. It is assumed to map the input tensors
# to word representations of shape [length, 1, input_size].


class HEAT(nn.Module):
    def __init__(self, word_embed_dim, output_size, vocab_size, aspect_size, args=None):
        super(HEAT, self).__init__()
        # Note: args must be provided; use_elmo, n_hidden and dropout are read from it.
        self.input_size = word_embed_dim if (args.use_elmo == 0) else (
            word_embed_dim + 1024 if args.use_elmo == 1 else 1024)
        self.hidden_size = args.n_hidden
        self.output_size = output_size
        self.max_length = 1
        self.lr = 0.0005

        self.word_rep = WordRep(vocab_size, word_embed_dim, None, args)
        # BiGRU for aspect attention
        self.rnn_a = nn.GRU(self.input_size, self.hidden_size // 2, bidirectional=True)
        # Aspect embedding table
        self.AE = nn.Embedding(aspect_size, word_embed_dim)

        # Of the layers below, forward() uses only w_a, w and decoder_p.
        self.W_h_a = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_v_a = nn.Linear(word_embed_dim, self.input_size)
        self.w_a = nn.Linear(self.hidden_size + word_embed_dim, 1)
        self.W_p_a = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_x_a = nn.Linear(self.hidden_size, self.hidden_size)

        # BiGRU for sentiment attention (parameters not shared with rnn_a)
        self.rnn_p = nn.GRU(self.input_size, self.hidden_size // 2, bidirectional=True)

        self.W_h = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_v = nn.Linear(word_embed_dim + self.hidden_size, word_embed_dim + self.hidden_size)
        self.w = nn.Linear(2 * self.hidden_size + word_embed_dim, 1)
        self.W_p = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_x = nn.Linear(self.hidden_size, self.hidden_size)

        self.decoder_p = nn.Linear(self.hidden_size + word_embed_dim, output_size)
        self.dropout = nn.Dropout(args.dropout)
        self.optimizer = torch.optim.Adam(self.parameters(), lr=self.lr)

    def forward(self, input_tensors):
        # Shape comments assume hidden_size = 128 and word_embed_dim = 200.
        assert len(input_tensors) == 3
        aspect_i = input_tensors[2]
        # Word representations of the sentence
        sentence = self.word_rep(input_tensors)
        # Sentence length
        length = sentence.size()[0]
        # Two GRUs: one for aspect attention, one for sentiment attention
        output_a, _ = self.rnn_a(sentence)
        output_p, _ = self.rnn_p(sentence)
        # [length, 128]
        output_a = output_a.view(length, -1)
        output_p = output_p.view(length, -1)

        # Aspect word embedding, [1, 200]
        aspect_e = self.AE(aspect_i).view(1, -1)
        # [length, 200]: repeat the aspect vector for every word in the sentence
        aspect_embedding = aspect_e.expand(length, -1)

        # Aspect attention
        # [length, 328]
        M_a = torch.tanh(torch.cat((output_a, aspect_embedding), dim=1))
        # Attention weight of each word with respect to the aspect, [1, length]
        weights_a = F.softmax(self.w_a(M_a), dim=0).t()
        # Aspect information of the sentence, [1, 128]
        r_a = torch.matmul(weights_a, output_a)

        # Sentiment attention
        # [length, 128]
        r_a_expand = r_a.expand(length, -1)
        # [length, 328]
        query4PA = torch.cat((r_a_expand, aspect_embedding), dim=1)
        # [length, 456]
        M_p = torch.tanh(torch.cat((output_p, query4PA), dim=1))
        # [length, 1]
        g_p = self.w(M_p)
        # [1, length]
        weights_p = F.softmax(g_p, dim=0).t()

        # Sentiment feature, [1, 128], concatenated with the aspect embedding, [1, 328]
        r_p = torch.matmul(weights_p, output_p)
        r = torch.cat((r_p, aspect_e), dim=1)

        # Output class scores, [1, output_size]
        output = self.decoder_p(r)
        return output
```
