wordcloud:
- Install the module:
pip install wordcloud
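- Basic usage:
WordCloud(font_path, collocations, background_color, width, height, max_words).generate(xxx)
font_path: path to the font file used to draw the words (Chinese text needs a font that contains Chinese glyphs)
collocations: whether to include two-word collocations (bigrams); defaults to True, which can make the same word show up more than once
background_color: background color
width: width of the canvas
height: height of the canvas
max_words: maximum number of words to display
generate: builds the word cloud from the text passed in (a string, not a file path)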
- Example:
from wordcloud import WordCloud
# Read the whole source text into a single string
with open("xxx.txt", encoding="utf-8") as r:
    txt = r.read()
wordcloud = WordCloud(font_path="xxx.ttf", collocations=False, background_color="black", width=800, height=600, max_words=50).generate(txt)
img = wordcloud.to_image()
img.show()
wordcloud.to_file("xxx.jpg")
jieba:
- Install the module:
pip install jieba
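- Basic format:
jieba.analyse.extract_tags(xxx, topK, withWeight, allowPOS)
xxx: the text to process
topK: number of keywords to return, ordered from most to least important
withWeight: whether to also return the weight of each keyword
allowPOS: parts of speech to keep, e.g. n for nouns and v for verbs; pass the values as a tuple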
- Example:
import jieba.analyse
from wordcloud import WordCloud
text = ""  # placeholder: put the text to analyze here (an empty string yields no keywords)
seg_list = jieba.analyse.extract_tags(text, allowPOS=("n", "v"))
txt_str = " ".join(seg_list)
wordcloud = WordCloud(font_path="xxx.ttf", collocations=False, background_color="black", width=800, height=600, max_words=50).generate(txt_str)
img = wordcloud.to_image()
img.show()
wordcloud.to_file("xxx.jpg")
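- Example with weights (sketch):
With withWeight=True, extract_tags returns (keyword, weight) pairs instead of plain strings. The following is a minimal sketch, assuming the weights should decide how large each word is drawn: the pairs are turned into a dict and handed to WordCloud.generate_from_frequencies. The helper names weights_to_frequencies and build_weighted_cloud, the top_k parameter, and the "xxx.ttf" font path are illustrative only and not part of either library.

```python
def weights_to_frequencies(pairs):
    """Convert (word, weight) pairs into a {word: weight} dict."""
    return {word: float(weight) for word, weight in pairs}


def build_weighted_cloud(text, font_path, top_k=50):
    """Hypothetical helper: size words by their jieba keyword weight."""
    # Imported here so weights_to_frequencies works without these packages.
    import jieba.analyse
    from wordcloud import WordCloud

    # withWeight=True makes extract_tags return (word, weight) tuples.
    pairs = jieba.analyse.extract_tags(
        text, topK=top_k, withWeight=True, allowPOS=("n", "v")
    )
    frequencies = weights_to_frequencies(pairs)
    if not frequencies:
        raise ValueError("no keywords were extracted from the text")

    cloud = WordCloud(
        font_path=font_path,
        background_color="black",
        width=800,
        height=600,
        max_words=top_k,
    )
    # Word sizes follow the supplied weights instead of raw word counts.
    return cloud.generate_from_frequencies(frequencies)
```

Usage would look like build_weighted_cloud(txt, "xxx.ttf").to_file("xxx.jpg"), where txt is non-empty text. collocations is left out here because the input is already a word-to-weight mapping rather than raw text.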