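The data consists of reviews of products and hotels. Use the methods below to perform sentiment analysis on these product and hotel reviews, and evaluate each model's performance.
(1) Using TF-IDF feature engineering, apply logistic regression, a support vector machine, and multinomial naive Bayes.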
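As a rough orientation for task (1), the following is a minimal sketch of how TF-IDF features can feed the three classifiers using scikit-learn pipelines. It is not the assignment's solution. The toy reviews, the labels, and the helper names `build_models` and `fit_and_predict` are hypothetical. The reviews are assumed to be already segmented into space-separated words, and `LinearSVC` is an assumed choice of support vector machine.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy reviews, already segmented into space-separated words.
# They stand in for the real data, which is not reproduced here.
train_docs = [
    "房间 干净 服务 很好",
    "质量 很好 非常 满意",
    "酒店 位置 方便 满意",
    "房间 很脏 服务 很差",
    "质量 很差 非常 失望",
    "酒店 位置 偏僻 失望",
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative
test_docs = ["服务 很好 满意", "很差 失望"]


def build_models():
    """Return the three classifiers named in the task (hypothetical helper)."""
    return {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "linear_svm": LinearSVC(),  # assumed SVM variant
        "multinomial_nb": MultinomialNB(),
    }


def fit_and_predict(train_docs, train_labels, test_docs):
    """Fit one TF-IDF + classifier pipeline per model and predict test labels."""
    predictions = {}
    for name, clf in build_models().items():
        # The token pattern keeps single-character words, which the
        # default pattern (two or more characters) would drop.
        vectorizer = TfidfVectorizer(token_pattern=r"(?u)\b\w+\b")
        pipeline = make_pipeline(vectorizer, clf)
        pipeline.fit(train_docs, train_labels)
        predictions[name] = pipeline.predict(test_docs).tolist()
    return predictions


predictions = fit_and_predict(train_docs, train_labels, test_docs)
for name, labels in predictions.items():
    print(name, labels)
```

Wrapping the vectorizer and classifier in one pipeline means the TF-IDF vocabulary is learned only from the training documents, which avoids leaking test-set statistics into the features.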
import pandas as pd
import jieba
import re
from sklearn.model_selection import train_test_split
import model_evaluation_utils as meu
dataset = pd.read_csv(r'E:\python\python文本挖掘\作业7\DataSet.csv')
dataset.info()
dataset.dropna(inplace=True)
dataset.info()
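# Read the stop-word file into a single string.
with open(r"E:\python\python文本挖掘\作业7\stop_words.txt", encoding="utf8") as f:
    stop_words = f.read()

def remove_special_characters(text):
    # Keep only Chinese characters (U+4E00 to U+9FA5); remove everything else.
    pattern = re.compile(u'[^\u4E00-\u9FA5]')
    text = pattern.sub('', text)
    return text
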
def normalize_document(doc):