Why do all the word-cloud counters I found online split everything into single characters? Luckily I worked around it with a clumsy trick: adding up counts in a dictionary.
Non-programmers who need this for work, take a look; like me, you can find most of it by searching Baidu (I'm using Anaconda).
Trying to write a UI so colleagues can upload and download files themselves; it's hard.
Not sure which category this belongs in...
The script segments a passage from rreason.txt and saves the results to rrcut.txt and rrcut.jpg.
from wordcloud import WordCloud
import jieba
import matplotlib.pyplot as plt
#import numpy as np
#from PIL import Image
import time
datapath = "E:\\临时处理\\"  # double the backslashes so they are not treated as escape sequences
with open(datapath + "rreason.txt", 'r', encoding='utf-8') as f:
    string_data = f.read()
text = " ".join(jieba.cut(string_data,cut_all=True))
llist = text.split()
remove_words = [u'了', u'\n', u' ', u',', u',']
# filter with a comprehension; calling llist.remove() while iterating over llist skips elements
llist = [word for word in llist if word not in remove_words]
# tally each word; a dict comprehension avoids shadowing the built-in dict
word_counts = {item: llist.count(item) for item in set(llist)}
with open(datapath + "rrcut%s.txt" % time.strftime("%m%d %H", time.localtime()), 'w', encoding='utf-8') as f:
    for word, count in word_counts.items():
        f.write(word + "," + str(count) + '\n')
llist = " “.join(llist)
#mask = np.array(Image.open("E:\临时处\4a0e74e7ly1g2muafkv0tj21o02yo7wh.jpg"))
cloud = WordCloud(font_path=".\\fonts\\simhei.ttf",  # \f would otherwise be a form-feed escape
                  collocations=False,
                  background_color='white',
                  width=400,
                  height=300,
                  #mask=mask,
                  max_words=100,
                  max_font_size=50,
                  scale=8)
wcloud = cloud.generate(llist)
wcloud.to_file(datapath + "rrcut%s.jpg" % time.strftime("%m-%d %H", time.localtime()))
plt.imshow(wcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
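The set-then-count loop above rescans the whole list once per unique word. The standard library's collections.Counter does the same tally in a single pass. A minimal sketch with sample tokens (no jieba needed; the tokens are made up for illustration):

```python
from collections import Counter

# sample tokens standing in for jieba's output
tokens = ["原因", "了", "原因", "问题", " ", "问题", "原因"]
remove_words = ["了", "\n", " ", ",", ","]

# same stop-word filtering as in the script, then a one-pass count
filtered = [w for w in tokens if w not in remove_words]
word_counts = Counter(filtered)

print(word_counts.most_common(2))  # [('原因', 3), ('问题', 2)]
```

Counter is a dict subclass, so word_counts.items() can be written to the txt file exactly as above.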
Why does each string in the remove_words list carry a u prefix? In Python 3 the prefix is redundant (all str literals are already Unicode); it's a leftover from Python 2, where it marked Unicode literals.
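You can check this directly: in Python 3 the u prefix changes nothing about the literal.

```python
# In Python 3, str literals are Unicode by default,
# so u'了' is the same type and value as '了'.
print(u'了' == '了')       # True
print(type(u'了') is str)  # True
```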
The commented-out lines are there so you can pick your own background image as a mask.
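With a mask, WordCloud treats pure white (255) pixels as off-limits and places words on the darker areas. You don't strictly need an image file; a sketch that builds a circular mask in memory (assumes numpy is installed, as it already is in Anaconda):

```python
import numpy as np

# 300x300 mask: 255 (white) outside the circle is skipped by WordCloud,
# 0 (dark) inside the circle is where words get placed
h = w = 300
y, x = np.ogrid[:h, :w]
inside = (x - w // 2) ** 2 + (y - h // 2) ** 2 <= (w // 2) ** 2
mask = np.full((h, w), 255, dtype=np.uint8)
mask[inside] = 0

# then pass it in: WordCloud(mask=mask, ...)
print(mask.shape, mask[h // 2, w // 2], mask[0, 0])  # (300, 300) 0 255
```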