Not an original work; reposted from: http://blog.csdn.net/fyuanfena/article/details/52038984
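Python is really, really fun. Whether you are writing a web crawler or doing data mining, it is great fun.
Today, let's talk about a fun Python module: wordcloud.
There are many ways to build a word cloud, but in my opinion Python's wordcloud package is the most powerful, and it lets you shape the cloud with a custom image.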
Official site: https://amueller.github.io/word_cloud/
GitHub: https://github.com/amueller/word_cloud
Example:
![](https://img-blog.csdn.net/20160722141254820?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center)
The font used is cabin-sketch.bold.
Installation
Method 1
pip install wordcloud
Method 2
Download the source from GitHub and unzip it
- wget https://github.com/amueller/word_cloud/archive/master.zip
- unzip master.zip
- rm master.zip
- cd word_cloud-master
Install the dependencies
- sudo pip install -r requirements.txt
Install wordcloud
Method 3
Download the .whl file from http://www.lfd.uci.edu/~gohlke/pythonlibs/#wordcloud
Use cd to change to the directory that contains the .whl file
Run this command:
- python -m pip install <filename>
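
Whichever method you use, it can help to confirm that the packages the example needs can be found before running it. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of wordcloud.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Packages imported by the example script in this post.
for name in ("wordcloud", "jieba", "matplotlib"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Any package reported as missing can be installed with pip in the same way as wordcloud.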
Here is an example:

    from os import path

    import jieba
    import matplotlib.pyplot as plt
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud, ImageColorGenerator

    stopwords = set()


    def importStopword(filename=''):
        """Load stop words, one per line, into the global set."""
        with open(filename, 'r', encoding='utf-8') as f:
            for line in f:
                word = line.strip()
                if word:
                    stopwords.add(word)


    def processChinese(text):
        """Segment Chinese text with jieba and drop stop words and blanks."""
        seg_list = [i for i in jieba.cut(text) if i not in stopwords]
        seg_list = [i for i in seg_list if i.strip()]
        return ' '.join(seg_list)


    importStopword(filename='./stopwords.txt')

    d = path.dirname(__file__)

    with open(path.join(d, 'love.txt'), encoding='utf-8') as f:
        text = f.read()
    # For Chinese text, segment it first (and use a font with CJK glyphs):
    # text = processChinese(text)

    # scipy.misc.imread was removed from SciPy, so load the mask with Pillow.
    back_coloring = np.array(Image.open(path.join(d, 'image/love.jpg')))

    wc = WordCloud(font_path='./font/cabin-sketch.bold.ttf',
                   background_color="black",
                   max_words=2000,
                   mask=back_coloring,
                   max_font_size=100,
                   random_state=42)

    wc.generate(text)

    # Colors sampled from the mask image. To apply them, call
    # wc.recolor(color_func=image_colors) before plotting.
    image_colors = ImageColorGenerator(back_coloring)

    plt.figure()
    plt.imshow(wc)
    plt.axis("off")
    plt.show()

    # Save the image; "名称" ("name") is a placeholder file name.
    wc.to_file(path.join(d, "名称.png"))
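
The stop-word step in the script can be tried on its own, without jieba or any files on disk. Below is a minimal sketch that assumes the text has already been split into tokens; `load_stopwords` and `filter_tokens` are hypothetical helper names, and the sample words are made up for illustration.

```python
import io


def load_stopwords(lines):
    """Build a set of stop words from an iterable of lines, skipping blanks."""
    return {line.strip() for line in lines if line.strip()}


def filter_tokens(tokens, stopwords):
    """Drop stop words and whitespace-only tokens, then join with spaces."""
    kept = [t for t in tokens if t.strip() and t not in stopwords]
    return " ".join(kept)


# An in-memory stand-in for stopwords.txt (one word per line, one blank line).
stopword_file = io.StringIO("the\n\nof\nand\n")
stopwords = load_stopwords(stopword_file)

# Stand-in for the output of a tokenizer such as jieba.cut().
tokens = ["the", "joy", " ", "of", "python", "and", "word", "clouds"]
print(filter_tokens(tokens, stopwords))  # joy python word clouds
```

A set is used here because membership tests are all the filter needs. The space-joined result is the kind of string that `WordCloud.generate` expects, since it splits text on word boundaries.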
Also attached