Creating a word cloud with Python

日本PLAYISM

Published 2021-03-26 10:52:52

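This article shows how to generate a word cloud in Python after cleaning a text file, including how to exclude stop words and use a custom font. The author shares code examples covering reading a text file, handling stop words, generating the word cloud, and using a specific font for finer control over the visualization.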

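I am trying to create a word cloud in Python after cleaning a text file.

I get the result I need, namely the most frequently used words in the text file, but I cannot plot them.

My code: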

import collections

import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Load the stop words, one per line
with open('stopwords') as f:
    stopwords = set(line.strip() for line in f)

wordcount = {}

with open('example.txt', encoding='utf8') as file:
    for word in file.read().split():
        # Normalize case and remove punctuation
        word = word.lower()
        word = word.replace(".", "")
        word = word.replace(",", "")
        word = word.replace("\"", "")
        word = word.replace("“", "")
        # Count every word that is not a stop word
        if word not in stopwords:
            if word not in wordcount:
                wordcount[word] = 1
            else:
                wordcount[word] += 1

d = collections.Counter(wordcount)

# Print the ten most common words
for word, count in d.most_common(10):
    print(word, ":", count)

#wordcloud = WordCloud().generate(text)
#fig = plt.figure()
#fig.set_figwidth(14)
#fig.set_figheight(18)
#plt.imshow(wordcloud.recolor(color_func=grey_color, random_state=3))
#plt.title(title, color=fontcolor, size=30, y=1.01)
#plt.annotate(footer, xy=(0, -.025), xycoords='axes fraction', fontsize=infosize, color=fontcolor)
#plt.axis('off')
#plt.show()
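For comparison, the counting loop can be condensed with `collections.Counter`. This is only a minimal sketch, not the code from the question: the helper name `count_words` and the sample sentence are made up for illustration, and it strips punctuation from the ends of each token only, whereas the chained `replace` calls remove those characters anywhere in the word.

```python
import string
from collections import Counter


def count_words(text, stopwords=frozenset()):
    """Count lowercase words, stripping surrounding punctuation and skipping stop words."""
    # Characters to strip from both ends of each token; the curly quotes are
    # added because string.punctuation only covers ASCII.
    strip_chars = string.punctuation + "“”‘’"
    counts = Counter()
    for token in text.lower().split():
        word = token.strip(strip_chars)
        if word and word not in stopwords:
            counts[word] += 1
    return counts


# Hypothetical sample text, used here instead of example.txt
sample = 'The cat sat. "The cat," she said, “sat on the mat.”'
counts = count_words(sample, stopwords={"the", "on"})
print(counts.most_common(2))
```

The returned `Counter` supports `most_common(10)` exactly like `d` in the code above.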

Edit:

I plotted the word cloud with the following code:

wordcloud = WordCloud(background_color='white',
                      width=1200,
                      height=1000
                      ).generate(d.most_common(10))

plt.imshow(wordcloud)
plt.axis('off')
plt.show()

But I get TypeError: expected string or buffer

When I try the above code with .generate(str(d.most_common(10))), the resulting word cloud shows an apostrophe (') after several of the words.

Environment: Jupyter Notebook | Python 3 | IPython

Solution:

First, download the file Symbola.ttf into the same folder as the script below.

File layout:

file.txt Symbola.ttf my_word_cloud.py

file.txt:

foo buzz bizz foo buzz bizz foo buzz bizz foo buzz bizz foo buzz bizz

foo foo foo foo foo foo foo foo foo foo bizz bizz bizz bizz foo foo

my_word_cloud.py:

import io
from collections import Counter
from os import path

import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Folder containing this script (__file__ is not defined inside a notebook cell)
d = path.dirname(__file__)

# It is important to pass the encoding so that io.open loads the file as UTF-8
with io.open(path.join(d, 'file.txt'), encoding='utf-8') as f:
    text = f.read()

words = text.split()
print(Counter(words))

# Generate a word cloud image
# The Symbola font includes most emoji
font_path = path.join(d, 'Symbola.ttf')
word_cloud = WordCloud(font_path=font_path).generate(text)

# Display the generated image:
plt.imshow(word_cloud)
plt.axis("off")
plt.show()

Result:

Counter({'foo': 17, 'bizz': 9, 'buzz': 5})
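The two failures described in the question can also be addressed without going back to the raw text. `generate()` expects a string, so passing the list from `most_common()` raises the TypeError, and wrapping that list in `str()` feeds the tuple and quote characters into the tokenizer, which is most likely where the stray apostrophes come from. The sketch below assumes the counts are already in a `Counter` and passes them to `generate_from_frequencies()`, which takes a word-to-count mapping. The helper name `render_from_counts` is made up for illustration, and the counts are the ones printed above.

```python
from collections import Counter

counts = Counter({"foo": 17, "bizz": 9, "buzz": 5})
top = counts.most_common(2)

# str() of the list keeps the tuple and quote syntax, so the "text" that
# generate() would tokenize contains stray apostrophes.
as_text = str(top)

# A plain word -> count mapping is what generate_from_frequencies() expects.
frequencies = dict(top)


def render_from_counts(frequencies):
    """Draw a word cloud directly from a word -> count mapping."""
    # Imported lazily so the counting part runs without these packages.
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    cloud = WordCloud(background_color="white", width=1200, height=1000)
    cloud.generate_from_frequencies(frequencies)
    plt.imshow(cloud)
    plt.axis("off")
    plt.show()


print(as_text)
print(frequencies)
```

Calling `render_from_counts(frequencies)` then draws the cloud, with each word sized by its count rather than by how often it appears in a string.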

See the many other examples as well; my_word_cloud.py is a simple one I created for you.

Tags: word-cloud, python, matplotlib, plot

Source: https://codeday.me/bug/20191013/1905740.html
