Sklearn ValueError: empty vocabulary; perhaps the documents only contain stop words

whieper

Category: NLP

Article link: https://blog.csdn.net/qq_42208267/article/details/102603526

Chinese corpus:

A list of documents, each split into single characters:

荣耀内幕我不多
华为用户如果发现续航不足一天的请凭余总微博进行合理维权
便宜了 5 0 0 多 g

Fitting CountVectorizer() with its default settings on this corpus raises:

Sklearn ValueError: empty vocabulary; perhaps the documents only contain stop words
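A minimal sketch for reproducing the error. It assumes the documents have already been split into space-separated single characters, as the corpus description says; the `docs` list and the `fit_default` helper are hypothetical names introduced only for this illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Assumed input format: each document is a string of
# space-separated single characters.
docs = [
    "荣 耀 内 幕 我 不 多",
    "便 宜 了 5 0 0 多 g",
]


def fit_default(documents):
    """Fit a default CountVectorizer; return the ValueError text, or None."""
    try:
        CountVectorizer().fit(documents)
    except ValueError as exc:
        return str(exc)
    return None


error_message = fit_default(docs)
print(error_message)
```

If the fit succeeds instead, the documents probably contain at least one token of two or more characters, so this particular failure does not apply.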

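Cause: with the default analyzer='word', tokens are extracted using token_pattern=r"(?u)\b\w\w+\b", which only matches runs of two or more word characters. When every token is a single character, all of them are discarded and the vocabulary ends up empty. The defaults are visible in the constructor signature: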

def __init__(self, input='content', encoding='utf-8',
             decode_error='strict', strip_accents=None,
             lowercase=True, preprocessor=None, tokenizer=None,
             stop_words=None, token_pattern=r"(?u)\b\w\w+\b",
             ngram_range=(1, 1), analyzer='word',
             max_df=1.0, min_df=1, max_features=None,
             vocabulary=None, binary=False, dtype=np.int64):
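The effect of the default `token_pattern` can be checked with the standard `re` module alone. The sketch below applies the pattern from the signature above to one document, assuming it is space-separated into single characters; `SINGLE_CHAR_PATTERN` is a hypothetical relaxed pattern added only for comparison and does not come from the original post.

```python
import re

# Pattern copied from the constructor signature above.
DEFAULT_TOKEN_PATTERN = r"(?u)\b\w\w+\b"
# Hypothetical relaxed pattern that also accepts one-character tokens.
SINGLE_CHAR_PATTERN = r"(?u)\b\w+\b"

# Assumed input: one document, space-separated into single characters.
doc = "便 宜 了 5 0 0 多 g"

default_tokens = re.findall(DEFAULT_TOKEN_PATTERN, doc)
relaxed_tokens = re.findall(SINGLE_CHAR_PATTERN, doc)

print(default_tokens)   # no token has two or more characters
print(relaxed_tokens)   # every single character survives
```

Because the default pattern requires at least two consecutive word characters, nothing in such a document qualifies as a token.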

Solution

CountVectorizer() defaults to analyzer='word'. Changing the call to CountVectorizer(analyzer='char', lowercase=False) resolves the error.
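A sketch of the character-level fix, again assuming space-separated single-character documents (the `docs` list is hypothetical). Note that the keyword argument is spelled `analyzer`. The second vectorizer is an alternative that is not from the original post: it keeps the word analyzer but relaxes `token_pattern` so that one-character tokens are accepted.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Assumed input format: space-separated single characters.
docs = [
    "荣 耀 内 幕 我 不 多",
    "便 宜 了 5 0 0 多 g",
]

# Fix from the post: analyze at character level (keyword is `analyzer`).
char_vectorizer = CountVectorizer(analyzer="char", lowercase=False)
char_counts = char_vectorizer.fit_transform(docs)
char_vocab = char_vectorizer.vocabulary_

# Alternative (an assumption, not from the post): stay with the word
# analyzer but allow tokens that are a single character long.
word_vectorizer = CountVectorizer(token_pattern=r"(?u)\b\w+\b", lowercase=False)
word_counts = word_vectorizer.fit_transform(docs)
word_vocab = word_vectorizer.vocabulary_

print(sorted(char_vocab))
print(sorted(word_vocab))
```

One difference worth keeping in mind: the character analyzer treats the separating space as a feature of its own, while the relaxed word pattern does not. Which behavior is preferable depends on how the corpus was split.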
