了解更多关注微信公众号“木下学Python”吧~
import jieba.analyse

# Extract the top `num` keywords from `content` (a text string) by TF-IDF.
# withWeight=True makes extract_tags return (keyword, weight) tuples
# instead of bare keyword strings.
tags = jieba.analyse.extract_tags(content, topK=num, withWeight=True)
for keyword, weight in tags:
    # Raw TF-IDF weights are small fractions; scale by 1000 for readability.
    print(keyword + '\t' + str(weight * 1000))
content为待分析的文本字符串,topK为提取关键词的个数,withWeight为是否同时返回关键词的权重(为True时返回(关键词, 权重)元组)
分词常用函数
import jieba
import wordcloud

# Read the source text, segment it with jieba, and render a word-cloud image.
with open('liulangdiqiu.txt', 'r', encoding='utf-8') as f:
    txt = f.read()

# WordCloud tokenizes on whitespace, so join the segmented Chinese words
# with spaces before handing the text over.
words = jieba.lcut(txt)
txt = ' '.join(words)

# font_path must point to a font with CJK glyphs (msyh.ttc = Microsoft YaHei),
# otherwise Chinese characters render as empty boxes.
w = wordcloud.WordCloud(width=1000, height=700, font_path='msyh.ttc')
w.generate(txt)
w.to_file('comments.png')