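Task: use different deep learning frameworks to perform sentiment analysis on short Weibo texts, classifying each text into one of three sentiment classes: positive, negative, or neutral.
Language and packages: Python 3.6, Keras 2.2.4, Torch 1.0.1
Data distribution: {'pos': 712, 'neu': 768, 'neg': 521}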
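To make the class balance concrete, the counts can be converted into proportions. This is only a small sketch; the variable names are illustrative and do not come from the original script.

```python
# Class counts as reported in the data distribution.
counts = {'pos': 712, 'neu': 768, 'neg': 521}

total = sum(counts.values())
# Fraction of the corpus that falls into each class.
proportions = {label: n / total for label, n in counts.items()}

for label, share in proportions.items():
    print(f"{label}: {counts[label]} samples, {share:.1%}")
print("total:", total)
```

The corpus holds 2001 samples and is mildly imbalanced: the negative class is the smallest at roughly 26%, while the neutral class is the largest at roughly 38%.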
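Technical approach:
In this experiment, each text is represented with word vectors; a single text has shape [50, 100], presumably 50 token positions with a 100-dimensional vector at each.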
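The notes do not say how every text is brought to the fixed shape [50, 100]. One common way is to truncate long texts and zero-pad short ones. The sketch below assumes that approach; `to_fixed_shape`, `MAX_LEN`, `EMB_DIM`, and the random stand-in vectors are hypothetical names and data, not part of the original preprocessing.

```python
import numpy as np

MAX_LEN = 50   # sequence length, taken from the stated shape [50, 100]
EMB_DIM = 100  # word-vector dimensionality, taken from the stated shape [50, 100]

def to_fixed_shape(token_vectors, max_len=MAX_LEN, emb_dim=EMB_DIM):
    """Truncate or zero-pad a list of word vectors to shape (max_len, emb_dim)."""
    out = np.zeros((max_len, emb_dim), dtype=np.float32)
    kept = token_vectors[:max_len]  # drop tokens beyond max_len
    if len(kept) > 0:
        out[:len(kept)] = np.asarray(kept, dtype=np.float32)
    return out

# Assumption: random vectors stand in for real word2vec lookups.
rng = np.random.RandomState(0)
short_text = [rng.rand(EMB_DIM) for _ in range(7)]   # shorter than MAX_LEN
long_text = [rng.rand(EMB_DIM) for _ in range(80)]   # longer than MAX_LEN

print(to_fixed_shape(short_text).shape)  # (50, 100)
print(to_fixed_shape(long_text).shape)   # (50, 100)
```

Padding at the end with zeros is only one option; padding at the front or masking the padded positions are alternatives that can affect how a recurrent layer treats short texts.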
Sentiment recognition on the preprocessed texts with Keras:
import pickle
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Bidirectional, LSTM, GRU
from keras.callbacks import TensorBoard
def load_f(path):
    # Load a pickled object from disk.
    with open(path, 'rb') as f:
        data = pickle.load(f)
    return data
path_1 = r'G:/Multimodal/nlp/w2v_weibo_data.pickle'
path_2 = r'G:/Multimodal/labels.pickle'
txts = load_f(path_1)
labels = load_f(path_2)
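# Hold out 20% of the samples for testing; the fixed seed keeps the split reproducible.
train_X, test_X, train_Y, test_Y = train_test_split(txts, labels, test_size=0.2, random_state=46)

# Build the model.
tensorboard = TensorBoard(log_dir=r'G:\pytorch')
model = Sequential()
# Each time step is a 100-dimensional word vector; None leaves the sequence length unspecified.
model.add(LSTM(128, input_shape=(None, 100)))
# Alternative recurrent layer: use a GRU in place of the LSTM.
# model.add(GRU(128, input_shape=(None, 100)))
# Three softmax outputs, one per sentiment class.
model.add(Dense(3, activation='softmax'))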
model.compile(loss =