1. Load the dataset directly in Python
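from sklearn.datasets import fetch_20newsgroups  # import the dataset loader
from sklearn.model_selection import train_test_split  # import the splitting helper
news_data = fetch_20newsgroups(subset="all")  # load the full dataset (downloaded and cached on first use)
x_train, x_test, y_train, y_test = train_test_split(news_data.data, news_data.target, test_size=0.25)  # hold out 25% of the documents as the test set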
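The split above is random, so it changes from run to run. A minimal sketch of a repeatable version follows, assuming a fixed seed and class-stratified sampling are wanted. The helper name `split_newsgroups` and the seed value 42 are illustrative choices, not part of the original code.

```python
from sklearn.model_selection import train_test_split


def split_newsgroups(data, target, test_size=0.25, seed=42):
    """Split documents and labels into train and test parts, reproducibly.

    `split_newsgroups` and the default seed are hypothetical names and values
    chosen for this sketch.
    """
    return train_test_split(
        data,
        target,
        test_size=test_size,
        random_state=seed,  # fixed seed: the same split on every run
        stratify=target,    # keep class proportions similar in both parts
    )
```

Calling `split_newsgroups(news_data.data, news_data.target)` returns the same four values, in the same order, as the `train_test_split` call above.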
2. Download the dataset manually
Download it directly from the website:
http://qwone.com/~jason/20Newsgroups/
Click a link on that page to download the corresponding dataset archive.
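Once an archive has been downloaded and extracted, the documents still have to be read into Python. The sketch below uses only the standard library and makes two assumptions: the extracted directory contains one subfolder per newsgroup with one file per message, and the files can be decoded as latin-1. The function name `read_newsgroups_dir` is hypothetical.

```python
from pathlib import Path


def read_newsgroups_dir(root, encoding="latin-1"):
    """Read a directory laid out as root/<newsgroup name>/<message file>.

    Returns (texts, labels, label_names); labels[i] is an index into label_names.
    The layout and the latin-1 encoding are assumptions of this sketch.
    """
    root = Path(root)
    # Each subdirectory is treated as one category; sorting keeps label ids stable.
    label_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    texts, labels = [], []
    for label_id, name in enumerate(label_names):
        for path in sorted((root / name).iterdir()):
            if path.is_file():
                texts.append(path.read_text(encoding=encoding))
                labels.append(label_id)
    return texts, labels, label_names
```

The returned `texts` and `labels` can be passed to `train_test_split` in place of `news_data.data` and `news_data.target`. scikit-learn's `sklearn.datasets.load_files` reads the same kind of folder-per-category layout and may be more convenient if scikit-learn is already in use.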