This section puts the previous section into practice with Keras. Keras provides several recurrent layers (see https://keras.io/zh/layers/recurrent/); the models built here are:
- SimpleRNN
- LSTM+CNN
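Samples
The data is the user review set from AI Challenger (ai挑战赛). Only the validation set is used here, because it is smaller and runs faster.
The label is taken from the field that records how the user felt after the visit. It has four values: 1 (positive sentiment), 0 (neutral sentiment), -1 (negative sentiment), -2 (sentiment not mentioned).
Review text | Label |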
---|---|
趁着国庆节,一家人在白天在山里玩耍之后,晚上决定吃李记搅团。 | 1 |
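The four label values described above include negative numbers, while one-hot helpers such as `to_categorical` expect class indices in the range 0 to num_classes - 1. A minimal sketch of one way to handle this, assuming the labels are shifted by 2 before encoding (the names `LABEL_OFFSET`, `encode_labels` and `decode_labels` are hypothetical and not from the original post):

```python
import numpy as np

# Raw label values as described in the text: 1, 0, -1, -2
LABEL_NAMES = {1: "positive", 0: "neutral", -1: "negative", -2: "not mentioned"}
LABEL_OFFSET = 2   # assumption: shift -2..1 to class ids 0..3
NUM_CLASSES = 4


def encode_labels(raw_labels):
    """Turn raw labels (-2..1) into one-hot rows of width NUM_CLASSES."""
    class_ids = np.asarray(raw_labels, dtype=int) + LABEL_OFFSET
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(NUM_CLASSES, dtype="float32")[class_ids]


def decode_labels(one_hot):
    """Map one-hot (or probability) rows back to the raw label values."""
    return np.argmax(one_hot, axis=1) - LABEL_OFFSET


y = encode_labels([1, 0, -1, -2])
print(y.shape)           # one row per sample, one column per class
print(decode_labels(y))  # recovers the raw labels
```

Applying `to_categorical` to the shifted class ids with `num_classes=4` should give the same matrix; the NumPy version is used here only to keep the sketch free of a Keras dependency.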
Model
1. Import libraries
import pandas as pd
import numpy as np
import json
import os
from keras.preprocessing.text import Tokenizer
from keras.preprocessing import sequence
from keras.models import Sequential
# Embedding is exported from keras.layers, so the keras.layers.embeddings submodule path is not needed
from keras.layers import Embedding, Dropout, Conv1D, MaxPooling1D, LSTM, Dense, Bidirectional, SimpleRNN
from keras.utils import to_categorical
2. Data preprocessing
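Preprocessing turns each review into a fixed-length sequence of integer ids. The sketch below is a plain-Python stand-in for that idea, not the Keras `Tokenizer` itself. It assumes character-level tokenization (which the single-character keys in the Tokenizer dictionary suggest), indices assigned by descending frequency starting at 1, and zero-padding at the front of each sequence. The function names `build_char_index` and `encode` and the two sample strings are hypothetical.

```python
from collections import Counter


def build_char_index(texts):
    """Rank characters by frequency; id 0 is left free for padding."""
    counts = Counter()
    for text in texts:
        counts.update(text)  # iterating a str yields single characters
    # most_common() sorts by descending count; ties keep first-seen order
    return {ch: rank
            for rank, (ch, _) in enumerate(counts.most_common(), start=1)}


def encode(texts, char_index, maxlen):
    """Map characters to ids, then truncate/pad each sequence to maxlen."""
    rows = []
    for text in texts:
        # Characters missing from the index are skipped
        ids = [char_index[ch] for ch in text if ch in char_index]
        ids = ids[-maxlen:]                       # keep the last maxlen ids
        rows.append([0] * (maxlen - len(ids)) + ids)  # pad with 0 at the front
    return rows


texts = ["好吃,好吃", "不好吃。"]
char_index = build_char_index(texts)
print(char_index)
print(encode(texts, char_index, maxlen=6))
```

With the real library, the corresponding calls would be `Tokenizer(char_level=True)`, `fit_on_texts`, `texts_to_sequences` and `sequence.pad_sequences`; whether the original post used exactly these settings is an assumption.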
The dictionary generated by Tokenizer looks like this (excerpt):
{
",": 1, "的": 2, "。": 3, "是": 4, "不": 5, " ": 6, "了": 7, "一": 8, "有": 9, "很": 10, "吃": 11, "\n": 12, "好": 13, "点": 14, "还": 15, "个": 16, "味": 17, "菜": 18, "就": 19, "来": 20, "我": 21, "这": 22, "也": 23, "\"": 24, "人": 25, "!": 26, "大": 27,