The Simplest Way to Generate a Word Cloud in Python

Author: 李挺老师

Article link: https://blog.csdn.net/andyleo0111/article/details/104634997

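How do you generate a word cloud with Python? There are plenty of tutorials online; this post walks through one approach that is fairly simple and easy to follow.

First, install three libraries yourself: wordcloud, jieba, and imageio.

For tips on installing them reliably, see: https://blog.csdn.net/andyleo0111/article/details/104532885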

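I. The wordcloud library

1. As the name suggests, wordcloud is the core library for making word clouds, and nothing in this post works without it.

2. Common parameters when creating a WordCloud object. The ones used in this post are background_color, font_path, width, height, max_words, and mask.

3. Common methods of the WordCloud class

1) generate(text): builds the word cloud from the string text

2) to_file(filename): saves the word cloud image to a file named filename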
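
The parameters and methods listed above can be put together roughly as follows. This is a minimal sketch rather than a full reference: the `settings` dictionary and the `make_cloud` helper are illustrative names introduced here, and the sizes are arbitrary. `wordcloud` is imported inside the helper so that the snippet still loads on a machine where the library is not installed yet.

```python
# Common WordCloud constructor parameters, collected in one place.
settings = {
    "background_color": "white",  # canvas color
    "width": 800,                 # image width in pixels
    "height": 600,                # image height in pixels
    "max_words": 50,              # upper limit on the number of words drawn
}


def make_cloud(text, out_path, **overrides):
    """Build a word cloud from `text` and save it to `out_path`."""
    import wordcloud  # third-party library, imported lazily

    # Per-call overrides (for example font_path or mask) win over the defaults.
    params = {**settings, **overrides}
    cloud = wordcloud.WordCloud(**params).generate(text)
    cloud.to_file(out_path)
    return cloud
```

A call such as `make_cloud(text, "cloud.png", max_words=20)` would then produce the image; for Chinese text, a suitable `font_path` would also have to be passed.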

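II. The jieba library

jieba is the library used here to segment Chinese text into words.

III. imageio

imageio is the library used to read the image that defines the word cloud's fill shape, so the cloud can take whatever shape we want.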

Hands-on practice

1. Generate an ordinary word cloud from English text.

# Import the wordcloud library
import wordcloud

# Create a WordCloud object with the desired settings, then call generate()
# to build the cloud from a string
a = wordcloud.WordCloud(
    background_color='white',  # white background
    font_path='msyh.ttc',      # 'msyh.ttc' is the Microsoft YaHei font
    width=2000,                # image width: 2000 pixels
    height=1500,               # image height: 1500 pixels
    max_words=50,              # show at most 50 words
).generate('My house is perfect. \
                             By great good fortune I have found a housekeeper \
                             light-footed woman of discreet age, strong and deft enough \
                             to render me all the service I require, and not \
                             She rises very early. By my breakfast-time there \
                             remains little to be done under the roof save dress\
                             Very rarely do I hear even a clink of crockery; never\
                             Oh, blessed silence! My house is perfect. \
                             Just large enough to allow the grace of order in domestic')
# Save the generated word cloud to an image file
a.to_file('英文状态下的词云.jpg')

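Note: a backslash (\) at the end of a line is a line-continuation character; here it joins the wrapped lines of the string into a single string.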
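
To make the note on `\` concrete, here is a small sketch comparing it with an alternative that needs no backslash: adjacent string literals inside parentheses. The variable names and the shortened sample text are made up for the illustration.

```python
# Style 1: backslash continuation inside a string literal.
# The newline is dropped, but the next line's indentation stays in the string.
text_backslash = 'My house is perfect. \
    By great good fortune I have found a housekeeper.'

# Style 2: adjacent string literals inside parentheses are concatenated
# by the parser, so no backslash is needed and no extra spaces sneak in.
text_parens = (
    'My house is perfect. '
    'By great good fortune I have found a housekeeper.'
)

# Collapsing runs of whitespace shows that both contain the same words.
normalized = ' '.join(text_backslash.split())
print(normalized == text_parens)  # True
```

The extra spaces left by the first style do no harm when the text is only being split into words, but the second style is usually easier to read and edit.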

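The generated word cloud:

2. Generate a word cloud from Chinese content. Here I use a txt file as the example.

This is a txt file named “中国”. Below we turn it into a word cloud.

Important: words in Chinese text are not separated by spaces or punctuation, so the text must be segmented first, splitting each sentence into individual Chinese words.

The default font cannot render Chinese characters, so a Chinese word cloud only works when a suitable font is specified.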
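
The need for segmentation can be seen with a quick experiment. The sketch below uses made-up sample sentences; `segment` is a hypothetical helper name, and `jieba` is imported inside it so the first part runs even without the library installed.

```python
english = "My house is perfect"
chinese = "我爱北京天安门"

# Splitting on whitespace finds the English words...
print(english.split())   # ['My', 'house', 'is', 'perfect']
# ...but leaves the Chinese sentence as one unbroken token.
print(chinese.split())   # ['我爱北京天安门']


def segment(text):
    """Return `text` with its words separated by spaces, using jieba."""
    import jieba  # third-party library, imported lazily

    # jieba.lcut returns the segmented words as a list
    return " ".join(jieba.lcut(text))
```

The string returned by `segment(chinese)` has spaces between words, which is the form that `generate()` can work with.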

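# Import jieba for word segmentation
import jieba
# Open the file for both reading and writing
with open('中国.txt', 'r+', encoding='gbk') as file:
    text = file.read()           # read the file contents
    word001 = jieba.lcut(text)   # segment the text into a list of words
    word002 = " ".join(word001)  # join the words with spaces
    file.seek(0)                 # move back to the start of the file
    file.truncate()              # clear the original contents
    file.write(word002)          # write the segmented text
# The with block closes the file automatically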
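
One detail of overwriting a file in 'r+' mode is worth a closer look: truncate() called without an argument cuts the file at the current position. Right after read() that position is the end of the file, so nothing is removed unless the position is moved back to 0 first. The sketch below shows the pattern on a throwaway temporary file; the file and its contents are invented for the demonstration.

```python
import os
import tempfile

# Create a small throwaway file to experiment on.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w", encoding="utf-8") as f:
    f.write("old content")

with open(path, "r+", encoding="utf-8") as f:
    f.read()       # the position is now at the end of the file
    f.seek(0)      # go back to the start
    f.truncate()   # cut at position 0, which clears the whole file
    f.write("new")

with open(path, encoding="utf-8") as f:
    result = f.read()
print(result)      # new
os.remove(path)
```

Writing the segmented text to a second file instead would avoid the issue entirely and keep the original text intact.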

The segmented content:

Next, generate the word cloud from the segmented content, using the same approach as before.

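import wordcloud
# generate() expects a string, so re-open the segmented file and read it
with open('中国.txt', 'r', encoding='gbk') as file:
    text = file.read()
a = wordcloud.WordCloud(
    background_color='white',
    font_path='msyh.ttc',  # a font that can render Chinese
    width=2000,
    height=1500,
    max_words=50,
).generate(text)
a.to_file('中文状态下的词云.jpg')

# Note: include the file's path where needed. To avoid errors, open the file again and read its contents with read() before generating the cloud.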

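The generated word cloud:

3. Generate a word cloud in a specified shape.

First, a look at how scipy.misc and imageio differ and how they relate.

Some tutorials use the imread function from scipy.misc. That approach no longer works for generating a shaped word cloud, because the function has been removed from SciPy. Images are now read through the imageio module instead, i.e. from imageio import imread.

Also, make sure the shape image has a clear outline, and ideally a pure white or transparent background (that is, an image whose subject has been cut out in Photoshop). This gives the best results.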
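
The advice about a pure white background matters because wordcloud treats pure white pixels in the mask as areas to leave empty. A background that is only nearly white (which lossy formats such as JPEG may produce) can therefore end up with words drawn on it. The following is a rough sketch of an optional clean-up step, assuming NumPy is available; the tiny array stands in for a grayscale image loaded with imread, and the threshold of 240 is an arbitrary assumption to tune per image.

```python
import numpy as np

# A tiny stand-in for a grayscale image: dark pixels form the shape,
# light pixels are background that is not quite pure white.
image = np.array(
    [[250, 252, 255],
     [30, 40, 248],
     [20, 251, 255]],
    dtype=np.uint8,
)

THRESHOLD = 240  # assumed cut-off for "nearly white"

# Push near-white pixels to pure white (255) so they are left empty.
mask = image.copy()
mask[mask > THRESHOLD] = 255

print(mask.tolist())  # [[255, 255, 255], [30, 40, 255], [20, 255, 255]]
```

The cleaned array can then be passed as the `mask` argument in place of the raw image.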

Below, I use this image as the fill shape.

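import wordcloud
from imageio import imread

# Read the image whose shape the word cloud should fill
image = imread(r'C:\Users\Y520\Desktop\map.jpg')
# Read the segmented Chinese text
with open(r'C:\Users\Y520\Desktop\中国.txt', 'r', encoding='gbk') as file:
    text = file.read()
wd = wordcloud.WordCloud(
    background_color='white',
    font_path='msyh.ttc',
    width=2000,    # width and height are ignored when a mask is given;
    height=1500,   # the size of the mask image is used instead
    mask=image,    # pure white areas of the image are left empty
    max_words=150,
).generate(text)
wd.to_file('China_Map.jpg')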

The generated image:
