A sample of the data is shown below:
UserName,ScreenName,Location,TweetAt,OriginalTweet,Sentiment
3799,48751,London,16-03-2020,@MeNyrbie @Phil_Gahan @Chrisitv https://t.co/iFz9FAn2Pa and https://t.co/xX6ghGFzCC and https://t.co/I2NlzdxNo8,Neutral
3800,48752,UK,16-03-2020,advice Talk to your neighbours family to exchange phone numbers create contact list with phone numbers of neighbours schools employer chemist GP set up online shopping accounts if poss adequate supplies of regular meds but not over order,Positive
3801,48753,Vagabonds,16-03-2020,"Coronavirus Australia: Woolworths to give elderly, disabled dedicated shopping hours amid COVID-19 outbreak https://t.co/bInCA9Vp8P",Positive
3802,48754,,16-03-2020,"My food stock is not the only one which is empty...
... ...
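Some tweets in the sample above are wrapped in double quotes because they contain commas or line breaks. The following is a minimal sketch of how `pd.read_csv` handles such rows, using an in-memory string instead of the real file. The first two data rows are copied from the sample; the third row (ID 9999) is a made-up, hypothetical row added only to show an empty Location and a line break inside a quoted field.

```python
import io

import pandas as pd

# The first two data rows come from the sample above.
# The last row is hypothetical: it only illustrates an empty Location
# and a quoted field that spans two lines.
sample_csv = (
    "UserName,ScreenName,Location,TweetAt,OriginalTweet,Sentiment\n"
    "3799,48751,London,16-03-2020,@MeNyrbie @Phil_Gahan @Chrisitv "
    "https://t.co/iFz9FAn2Pa and https://t.co/xX6ghGFzCC and "
    "https://t.co/I2NlzdxNo8,Neutral\n"
    '3801,48753,Vagabonds,16-03-2020,"Coronavirus Australia: Woolworths to '
    "give elderly, disabled dedicated shopping hours amid COVID-19 outbreak "
    'https://t.co/bInCA9Vp8P",Positive\n'
    '9999,99999,,16-03-2020,"line one\nline two",Neutral\n'
)

sample_df = pd.read_csv(io.StringIO(sample_csv))

# The comma inside the quoted tweet does not split the field.
print(sample_df.loc[1, "OriginalTweet"])

# An empty Location is read as a missing value (NaN).
print(sample_df["Location"].isna().tolist())

# A line break inside quotes stays within a single field.
print(repr(sample_df.loc[2, "OriginalTweet"]))
```

With default settings the parser yields three rows and six columns here, so quoted tweets should not need any manual pre-processing before loading.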
The data is read as follows:
import pandas as pd

train_path = r'./train.csv'
test_path = r'./test.csv'
# Read the test set and keep only the tweet text column
test_data = pd.read_csv(test_path)
test_data = test_data.OriginalTweet
# Read the training set data
# Convert the sentiment labels to numeric labels
def label_numeric_transfer(labels):
    label_names = ['Neutral', 'Positive', 'Negative', 'Extremely Positive', 'Extremely Negative']
    numeric_labels = []
    for label in labels:
        if label not in label_names:
            print('WARNING! Invalid label named {}'.format(label))
        if label == label_names[0]:
            numeric_labels.append(0)
        elif label == label_names[1]:
            numeric_labels.append(1)
        elif label == label_names[2]:
            numeric_labels.append(2)
        elif label == label_names[3]:
            numeric_labels.append(3)
        elif