Creating a WordCloud with Python

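This article describes how to generate a WordCloud in Python after cleaning a text file, and shows how to exclude stop words and use a custom font. The author shares code examples covering reading a text file, handling stop words, generating the word cloud, and using a specific font for finer control over the visualization.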

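I am trying to create a word cloud in Python after cleaning a text file.

I get the result I need, i.e. the words used most often in the text file, but I am unable to plot them.

My code: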

import collections

from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Load the stop words, one per line
with open('stopwords') as f:
    stopwords = set(line.strip() for line in f)

wordcount = {}

with open('example.txt', encoding='utf8') as file:
    for word in file.read().split():
        # Normalise case and strip common punctuation
        word = word.lower()
        word = word.replace(".", "")
        word = word.replace(",", "")
        word = word.replace("\"", "")
        word = word.replace("“", "")
        # Count every word that is not a stop word
        if word not in stopwords:
            if word not in wordcount:
                wordcount[word] = 1
            else:
                wordcount[word] += 1

d = collections.Counter(wordcount)

# Print the ten most frequent words
for word, count in d.most_common(10):
    print(word, ":", count)

#wordcloud = WordCloud().generate(text)
#fig = plt.figure()
#fig.set_figwidth(14)
#fig.set_figheight(18)
#plt.imshow(wordcloud.recolor(color_func=grey_color, random_state=3))
#plt.title(title, color=fontcolor, size=30, y=1.01)
#plt.annotate(footer, xy=(0, -.025), xycoords='axes fraction', fontsize=infosize, color=fontcolor)
#plt.axis('off')
#plt.show()

Edit:

I tried to plot the word cloud with the following code:

wordcloud = WordCloud(background_color='white',
                      width=1200,
                      height=1000
                      ).generate((d.most_common(10)))
plt.imshow(wordcloud)
plt.axis('off')
plt.show()

But I get TypeError: expected string or buffer.

When I try the above code with .generate(str(d.most_common(10))), the resulting word cloud shows an apostrophe (') symbol after several words.

I am using Jupyter Notebook | Python 3 | IPython.

Solution:

First download the Symbola.ttf font file into the same folder as the script below.

File layout:

file.txt Symbola.ttf my_word_cloud.py

file.txt:

foo buzz bizz foo buzz bizz foo buzz bizz foo buzz bizz foo buzz bizz

foo foo foo foo foo foo foo foo foo foo bizz bizz bizz bizz foo foo

my_word_cloud.py:

import io
from collections import Counter
from os import path

import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Folder containing this script, file.txt and Symbola.ttf
d = path.dirname(__file__)

# Pass the encoding explicitly so the file is loaded as UTF-8
text = io.open(path.join(d, 'file.txt'), encoding='utf-8').read()

words = text.split()
print(Counter(words))

# Generate a word cloud image
# The Symbola font includes most emoji
font_path = path.join(d, 'Symbola.ttf')
word_cloud = WordCloud(font_path=font_path).generate(text)

# Display the generated image:
plt.imshow(word_cloud)
plt.axis("off")
plt.show()

Result:

Counter({'foo': 17, 'bizz': 9, 'buzz': 5})

[Image: the generated word cloud]
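The TypeError reported in the question most likely comes from passing a list of (word, count) tuples to generate, which expects a single string. Wrapping the list in str() avoids the error but keeps the quote characters around each word, which would explain the stray apostrophes. A minimal sketch of one alternative, assuming the installed wordcloud version provides generate_from_frequencies: build a {word: count} dict and pass it directly. The sample counts and the draw_cloud helper are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample counts standing in for the question's `d`
d = Counter({"foo": 17, "bizz": 9, "buzz": 5})

# str() of the list of tuples keeps the quotes around each word;
# the tokenizer can then pick up a trailing apostrophe with the word.
as_text = str(d.most_common(2))
print(as_text)  # [('foo', 17), ('bizz', 9)]

# A plain {word: count} mapping avoids the string round trip entirely.
frequencies = dict(d.most_common(10))


def draw_cloud(freqs):
    """Render a word cloud from a {word: count} mapping (illustrative helper)."""
    # Imported here so the counting code above runs without the plotting stack
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    cloud = WordCloud(background_color="white", width=1200, height=1000)
    cloud.generate_from_frequencies(freqs)
    plt.imshow(cloud)
    plt.axis("off")
    plt.show()
```

Calling draw_cloud(frequencies) in a notebook cell should display the cloud sized by the counts, without re-tokenizing the words.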

See the many other examples available; I created the simple one above for you.
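The cleaning step from the question (lower-casing, stripping punctuation, dropping stop words) can also be sketched with a regular expression instead of chained replace() calls. This is a swapped-in technique, not the asker's method: \w+ removes all punctuation, so it also splits contractions such as "don't" into two tokens. The function name count_words and the sample text are assumptions made for illustration.

```python
import re
from collections import Counter


def count_words(text, stopwords):
    """Count lower-cased words in `text`, skipping any in `stopwords`."""
    # \w+ keeps runs of letters, digits and underscores, so . , " and
    # curly quotes are dropped without a replace() call per character
    words = re.findall(r"\w+", text.lower())
    return Counter(w for w in words if w not in stopwords)


# Hypothetical sample text and stop words
sample = 'The cat sat. "The cat," she said, “sat on the mat.”'
counts = count_words(sample, stopwords={"the", "on", "she"})
print(counts.most_common(2))  # [('cat', 2), ('sat', 2)]
```

The resulting Counter can be printed with most_common(10) as in the question, or converted to a dict for plotting.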

Tags: word-cloud, python, matplotlib, plot

Source: https://codeday.me/bug/20191013/1905740.html
