sklearn: Naive Bayes Text Classification 4

panghaomingme

Category: Scikit Learn

Article link: https://blog.csdn.net/panghaomingme/article/details/55510089

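After stripping 'headers', 'footers', and 'quotes' from the data, the accuracy actually went down. A likely reason is that headers and signatures carry metadata (sender addresses, organizations, subject lines) that correlates strongly with the newsgroup label, so removing them takes away easy cues the classifier was relying on.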

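# Load 20 Newsgroups with headers, footers, and quoted replies stripped
from sklearn.datasets import fetch_20newsgroups
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

news = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))
X_train, X_test, Y_train, Y_test = train_test_split(news.data, news.target, test_size=0.25)

# Fit TF-IDF on the training text only, then reuse its vocabulary for the test text
tfidf = TfidfVectorizer()
X_tfidf_train = tfidf.fit_transform(X_train)
X_tfidf_test = tfidf.transform(X_test)

# Train multinomial naive Bayes and print the accuracy on the test split
mnb_tfidf = MultinomialNB()
mnb_tfidf.fit(X_tfidf_train, Y_train)
print(mnb_tfidf.score(X_tfidf_test, Y_test))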
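
To check the claim that stripping lowers accuracy, both settings have to be scored on the same split. The following is only a sketch of how that comparison could be set up. The helper names `tfidf_nb_accuracy` and `compare_remove_settings` and the fixed `seed` are my own assumptions, not part of the original post, and the second helper needs network access to download the dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def tfidf_nb_accuracy(texts, labels, test_size=0.25, seed=0):
    """Train TF-IDF + MultinomialNB on one split and return test accuracy.

    Hypothetical helper: the fixed seed makes the split repeatable.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    # The pipeline fits the vectorizer on training text only
    model = make_pipeline(TfidfVectorizer(), MultinomialNB())
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


def compare_remove_settings(seed=0):
    """Score the raw and the stripped dataset (downloads 20 Newsgroups)."""
    from sklearn.datasets import fetch_20newsgroups

    settings = [("raw", ()), ("stripped", ("headers", "footers", "quotes"))]
    scores = {}
    for name, remove in settings:
        news = fetch_20newsgroups(subset="all", remove=remove)
        scores[name] = tfidf_nb_accuracy(news.data, news.target, seed=seed)
    return scores
```

Calling `compare_remove_settings()` returns a dict with one accuracy per setting. Because both runs use the same seed and the dataset has the same number of documents in the same order either way, the two scores are computed on the same train/test partition, which the unseeded split in the original code does not guarantee.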

After removing 'headers', 'footers', and 'quotes', a document in the dataset looks like this:

A "moment of silence" doesn't mean much unless *everyone*
participates.  Otherwise it's not silent, now is it?

Non-religious reasons for having a "moment of silence" for a dead
classmate: (1) to comfort the friends by showing respect to the
deceased , (2) to give the classmates a moment to grieve together, (3)
to give the friends a moment to remember their classmate *in the
context of the school*, (4) to deal with the fact that the classmate
is gone so that it's not disruptive later.

Blindly opposing everything with a flavor of religion in it is
utterly idiotic.
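
To make concrete what kind of text disappears, here is a rough, simplified sketch of the three stripping steps on a made-up message. It is not scikit-learn's actual implementation: the regular expression, the `-- ` signature rule, the function names, and the sample message are all assumptions chosen for illustration.

```python
import re

# Assumed heuristic: a quoted line starts with '>' or '|', or is an
# attribution line ending in "writes:" / "wrote:"
QUOTE_RE = re.compile(r"^\s*[>|]|(writes|wrote):\s*$")


def strip_header(text):
    """Drop everything up to the first blank line (the header block)."""
    _head, _sep, body = text.partition("\n\n")
    return body


def strip_quotes(text):
    """Drop lines that look like quoted replies or attributions."""
    return "\n".join(
        line for line in text.split("\n") if not QUOTE_RE.search(line)
    )


def strip_footer(text):
    """Drop everything from a '--' signature delimiter onward (assumed rule)."""
    lines = text.split("\n")
    for i, line in enumerate(lines):
        if line.rstrip() == "--":
            return "\n".join(lines[:i])
    return text


def strip_all(text):
    """Apply the three steps in order and trim surrounding whitespace."""
    return strip_footer(strip_quotes(strip_header(text))).strip()


# Made-up message for illustration only
raw = (
    "From: alice@example.org\n"
    "Subject: Re: moment of silence\n"
    "Organization: Example University\n"
    "\n"
    "bob@example.org writes:\n"
    "> Why hold a moment of silence at all?\n"
    "\n"
    "A moment of silence doesn't mean much unless everyone participates.\n"
    "\n"
    "-- \n"
    "Alice, Example University\n"
)

clean = strip_all(raw)
print(clean)
```

Only the author's own sentence survives. The sender address, subject line, organization, and quoted question are gone, and those are the parts most likely to give away the newsgroup.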

Result: