This is my last post before the New Year, so I'll simply walk through the workflow I follow when using BERT. Happy New Year in advance, everyone!
## STEP 1: Build the model
import torch
import torch.nn as nn
from tqdm import tqdm
# BertModel/BertTokenizer come from the older pytorch_pretrained_bert package,
# whose API matches the output_all_encoded_layers argument used further below.
from pytorch_pretrained_bert import BertModel, BertTokenizer

class Config(object):
    """Configuration parameters"""
    def __init__(self, dataset):
        self.model_name = 'bert'
        self.train_path = dataset + '/data/train.txt'    # training set
        self.dev_path = dataset + '/data/dev.txt'        # validation set
        self.test_path = dataset + '/data/test.txt'      # test set
        self.class_list = [x.strip() for x in open(
            dataset + '/data/class.txt', encoding='UTF-8').readlines()]   # one class name per line
        self.save_path = dataset + '/saved_dict/' + self.model_name + '.ckpt'
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.require_improvement = 1000                   # stop early if no improvement after 1000 batches
        self.num_classes = len(self.class_list)
        self.num_epochs = 3
        self.batch_size = 128
        self.pad_size = 32                                # every sentence is padded/truncated to this length
        self.learning_rate = 5e-5
        self.bert_path = './bert_pretrain'
        self.tokenizer = BertTokenizer.from_pretrained(self.bert_path)
        self.hidden_size = 768
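Before wiring up the model, it helps to sanity-check the config. The sketch below assumes a dataset folder (here called THUCNews, just a placeholder name) laid out the way the paths above expect, plus a ./bert_pretrain directory holding vocab.txt and the pretrained weights.

# Minimal sanity check; 'THUCNews' is a placeholder dataset folder, not part of the code above
#   THUCNews/data/{train.txt, dev.txt, test.txt, class.txt}
#   THUCNews/saved_dict/            (checkpoints land here)
#   ./bert_pretrain/                (vocab.txt + pretrained BERT weights)
config = Config('THUCNews')
print(config.num_classes)           # number of lines in class.txt
print(config.device)                # cuda if available, otherwise cpu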
class BERT(nn.Module):
    def __init__(self, config):
        super(BERT, self).__init__()
        self.bert = BertModel.from_pretrained(config.bert_path)
        for param in self.bert.parameters():
            param.requires_grad = True                    # fine-tune all BERT layers
        self.fc = nn.Linear(config.hidden_size, config.num_classes)

    def forward(self, x):
        context = x[0]   # input token ids, shape [batch_size, pad_size]
        mask = x[2]      # attention mask, same size as the input; padding positions are 0
        _, pooled = self.bert(context, attention_mask=mask, output_all_encoded_layers=False)
        out = self.fc(pooled)
        return out
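To make sure the pieces fit together, here is a quick smoke test of the forward pass, reusing the config object from the sketch above. The (token_ids, seq_len, mask) tuple mirrors what the data iterator will feed the model later; the all-ones tensors are made up purely for illustration.

# Smoke test with made-up inputs (not real data)
model = BERT(config).to(config.device)
ids = torch.ones((2, config.pad_size), dtype=torch.long).to(config.device)    # fake token ids
seq_len = torch.tensor([config.pad_size, config.pad_size]).to(config.device)  # unused by forward, kept for shape
mask = torch.ones((2, config.pad_size), dtype=torch.long).to(config.device)   # no padding in this fake batch
logits = model((ids, seq_len, mask))
print(logits.shape)                  # torch.Size([2, num_classes])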
## STEP 2: Build the dataset
def build_dataset(config):
    def load_dataset(path, pad_size=32):
        contents = []
        with open(path, 'r', encoding='UTF-8') as f:
            for line in tqdm(f):
                lin = line.strip()
                if not lin:
                    continue
                content, label = lin.split('\t')   # each line: raw text, a tab, then the label index
                token = config.tokenizer.tokenize(content)