Automatically classify news articles with naive Bayes.
Import the required packages.
import numpy as np
import pandas as pd
import jieba
Read in the data and drop rows with missing values.
df_techology=pd.read_csv(r'bayes/technology_news.csv')
df_techology=df_techology.dropna()
df_car=pd.read_csv(r'bayes/car_news.csv')
df_car=df_car.dropna()
df_entertainment=pd.read_csv(r'bayes/entertainment_news.csv')
df_entertainment=df_entertainment.dropna()
df_military=pd.read_csv(r'bayes/military_news.csv')
df_military=df_military.dropna()
df_sports=pd.read_csv(r'bayes/sports_news.csv')
df_sports=df_sports.dropna()
Collect every article in each news category into a list.
technology=df_techology.content.values.tolist()
car=df_car.content.values.tolist()
entertainment=df_entertainment.content.values.tolist()
military=df_military.content.values.tolist()
sports=df_sports.content.values.tolist()
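The five categories all follow the same read, drop, and collect pattern, so it could be wrapped in one small helper. The sketch below is only an illustration: the helper name `load_contents`, the extra `title` column, and the in-memory sample rows are assumptions made for the demo, not part of the original data. Only the `content` column name comes from the code above.

```python
import io
import pandas as pd

def load_contents(source):
    """Read a news CSV, drop rows with missing values, and
    return the `content` column as a plain Python list."""
    df = pd.read_csv(source)
    df = df.dropna()
    return df["content"].tolist()

# Hypothetical in-memory stand-in for a file such as bayes/sports_news.csv.
# The second row has an empty `content` field, so dropna() removes it.
sample_csv = io.StringIO(
    "title,content\n"
    "A,first article\n"
    "B,\n"
    "C,second article\n"
)

sample_contents = load_contents(sample_csv)
print(sample_contents)
```

With the real files, the same helper would be called once per category, for example `sports = load_contents(r'bayes/sports_news.csv')`.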