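I happened to try PaddleHub's pretrained sentiment analysis model emotion_detection_textcnn (its only three-class model) and found that its accuracy on English seems lower than on Chinese?
Here is a test of the same sentence in Chinese and in English: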
import paddlehub as hub

module = hub.Module(name="emotion_detection_textcnn")  # load the model
test_text = ["你真丑", "You're so ugly"]  # the same sentence in Chinese and English
input_dict = {"text": test_text}  # text input
results = module.emotion_classify(data=input_dict)  # run prediction
for result in results:
    print(result['text'])
    print(result['emotion_label'])
    print(result['emotion_key'])
    # the probability is stored under "<emotion_key>_probs", e.g. "negative_probs"
    probs_name = result['emotion_key'] + "_probs"
    #print(result['negative_probs'])
    #print(probs_name)
    print(result[probs_name])
#你真丑
#0
#negative
#0.9627
#You're so ugly
#1
#neutral
#0.9766
The English sentence is clearly derogatory, yet it was misclassified as neutral with high confidence.
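To see how far off a misclassification like this is, it can help to look at all class probabilities rather than only the winning one. The sketch below assumes each result dict carries one "<label>_probs" entry per class (the test code above only confirms negative_probs and the emotion_key + "_probs" pattern). The helper names all_probs and margin are hypothetical, and in the example dict only the 0.9766 value comes from the run above; the other two numbers are made up for illustration.

```python
def all_probs(result):
    """Collect every '<label>_probs' entry of one prediction result."""
    suffix = "_probs"
    return {
        key[:-len(suffix)]: value
        for key, value in result.items()
        if key.endswith(suffix)
    }


def margin(result):
    """Gap between the highest and second-highest class probability."""
    probs = sorted(all_probs(result).values(), reverse=True)
    if not probs:
        raise ValueError("result contains no '*_probs' entries")
    return probs[0] - probs[1] if len(probs) > 1 else probs[0]


# Illustrative result dict: only neutral_probs (0.9766) is taken from the
# test above; negative_probs and positive_probs are invented placeholders.
example = {
    "text": "You're so ugly",
    "emotion_label": 1,
    "emotion_key": "neutral",
    "neutral_probs": 0.9766,
    "negative_probs": 0.0200,
    "positive_probs": 0.0034,
}

for label, prob in sorted(all_probs(example).items(), key=lambda kv: -kv[1]):
    print(label, prob)
print("margin:", round(margin(example), 4))
```

A small margin would suggest a borderline case, while a large one (as here) suggests the model is confidently wrong on the English input.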
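The Google Play dataset on Kaggle contains user reviews of apps, with the following fields: "APP": the name of the app; "Translated_Review": the user review (already preprocessed and translated into English); "Sentiment": the sentiment class, one of positive / negative / neutral; "Sentiment_Polarity": the sentiment polarity score; "Sentiment_Subjectivity": the sentiment subjectivity score.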
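Since this dataset has English reviews with three-way sentiment labels, it could serve as a larger check of the model's English accuracy than a single sentence. The following is a minimal sketch under several assumptions: the reviews are in a CSV file with the "Translated_Review" and "Sentiment" columns described above, the labels match the model's emotion_key values once lowercased, and missing values appear as empty cells or the string "nan". The function names load_reviews, accuracy, and paddlehub_predict are hypothetical helpers, not part of PaddleHub.

```python
import csv


def load_reviews(path, limit=None):
    """Read (review, sentiment) pairs from the CSV, skipping incomplete rows."""
    pairs = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            review = (row.get("Translated_Review") or "").strip()
            label = (row.get("Sentiment") or "").strip().lower()
            # Assumption: missing values show up as empty cells or "nan".
            if not review or review.lower() == "nan":
                continue
            if not label or label == "nan":
                continue
            pairs.append((review, label))
            if limit is not None and len(pairs) >= limit:
                break
    return pairs


def accuracy(pairs, predict):
    """Fraction of reviews whose predicted label equals the dataset label.

    predict is any callable mapping a list of texts to a list of labels.
    """
    if not pairs:
        return 0.0
    texts = [text for text, _ in pairs]
    predicted = predict(texts)
    correct = sum(p == gold for p, (_, gold) in zip(predicted, pairs))
    return correct / len(pairs)


def paddlehub_predict(texts):
    """Predict labels with emotion_detection_textcnn, as in the test above."""
    import paddlehub as hub  # imported lazily so the helpers work without it

    module = hub.Module(name="emotion_detection_textcnn")
    results = module.emotion_classify(data={"text": texts})
    return [result["emotion_key"] for result in results]
```

With the dataset downloaded, a call such as accuracy(load_reviews("reviews.csv", limit=500), paddlehub_predict) would give a rough English accuracy figure; "reviews.csv" is a placeholder for wherever the file is saved. Passing the predictor as a callable keeps the scoring logic independent of PaddleHub.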