Happy holidays in advance, everyone! Let's draw a word cloud with wordcloud.
Original image:
Final image:
-
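font_path: string. Path to the font file to use (OTF or TTF).
-
width: int. Width of the canvas, default 400.
-
height: int. Height of the canvas, default 200.
-
prefer_horizontal: float. Ratio of the number of times to try horizontal fitting versus vertical fitting. If prefer_horizontal < 1, the algorithm tries rotating a word when it does not fit. Default 0.9.
-
mask: nd-array or None. A mask that defines where words may be drawn, which is how a background image shapes the cloud. If a mask is given, width and height are ignored and the shape of the mask is used; pure white areas are masked out and words are drawn everywhere else. Default None.
-
contour_width: float. If mask is not None and contour_width > 0, the outline of the mask is drawn. Default 0.
-
contour_color: color value. Color of the mask outline, default "black". When setting this, remember to switch mode to "RGB".
-
scale: float. Scaling between computation and drawing. For large word cloud images, using scale instead of a larger canvas size is significantly faster, but may give a coarser fit for the words. Default 1.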
-
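min_font_size: int. Smallest font size used in the word cloud, default 4.
-
max_font_size: int or None. Largest font size used in the word cloud. Default None, in which case the height of the image is used.
-
font_step: int. Step size for the font. font_step > 1 may speed up computation but give a worse fit. Default 1.
-
max_words: number. Maximum number of words, default 200.
-
stopwords: set of strings or None. Words to exclude. If None, the built-in STOPWORDS list is used. Ignored when using generate_from_frequencies.
-
background_color: color value. Background color of the word cloud image, default "black".
-
mode: string. Default "RGB". When mode is "RGBA" and background_color is None, a transparent background is generated.
-
relative_scaling: float. Importance of relative word frequencies for font size. With relative_scaling=0, only word ranks are considered. With relative_scaling=1, a word that is twice as frequent is drawn twice as large. If you want to account for word frequencies and not only their rank, a value around 0.5 often looks good. If "auto" (the default), it is set to 0.5 unless repeat is True, in which case it is set to 0.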
-
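color_func: callable, default None. A callable with parameters word, font_size, position, orientation, font_path, random_state that returns a PIL color for each word. Overrides colormap. See colormap for specifying a matplotlib colormap instead. To create a single-color word cloud, use
color_func=lambda *args, **kwargs: "white"
A single color can also be given as an RGB tuple. For example,
color_func=lambda *args, **kwargs: (255, 0, 0)
sets the color to red.
-
regexp: string or None. Regular expression used to split the input text into tokens in process_text. If None, r"\w[\w']+" is used. Ignored when using generate_from_frequencies.
-
collocations: bool. Whether to include collocations (bigrams) of two words. Ignored when using generate_from_frequencies. Default True.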
-
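colormap: string or matplotlib colormap, default "viridis". Matplotlib colormap from which a color is drawn at random for each word. Ignored if color_func is specified.
-
normalize_plurals: bool, default True. Whether to remove a trailing "s" from words. If True and a word appears both with and without a trailing "s", the form with the trailing "s" is removed and its count is added to the form without it, unless the word ends in "ss". Ignored when using generate_from_frequencies.
-
repeat: bool. Whether to repeat words and phrases until max_words or min_font_size is reached. Default False.
-
include_numbers: bool. Whether to include numbers as phrases. Default False.
-
min_word_length: int. Minimum number of letters a word must have to be included. Default 0.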
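To make the color_func entry more concrete, here is a minimal sketch of a custom color function. The name red_shades and the lightness range are made up for illustration; the only things taken from the parameter list are the parameter names and the fact that the callable returns a PIL color, here an "hsl(...)" string.

```python
import random


def red_shades(word, font_size, position, orientation,
               font_path=None, random_state=None):
    """Return a PIL color string: pure red hue with a random lightness.

    random_state is expected to behave like random.Random; a fresh
    generator is used when none is passed (an assumption of this sketch).
    """
    rng = random_state if random_state is not None else random.Random()
    lightness = rng.randint(30, 70)  # percent, hypothetical range
    return "hsl(0, 100%, {}%)".format(lightness)


# Calling it directly shows the kind of value the word cloud would receive.
print(red_shades("holiday", 40, (0, 0), None, random_state=random.Random(0)))
```

Passing it as WordCloud(color_func=red_shades) should then take precedence over whatever colormap is set, as the colormap entry describes.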
Named colors accepted by contour_color and background_color:
"aliceblue": "#f0f8ff",
"antiquewhite": "#faebd7",
"aqua": "#00ffff",
"aquamarine": "#7fffd4",
"azure": "#f0ffff",
"beige": "#f5f5dc",
"bisque": "#ffe4c4",
"black": "#000000",
"blanchedalmond": "#ffebcd",
"blue": "#0000ff",
"blueviolet": "#8a2be2",
"brown": "#a52a2a",
"burlywood": "#deb887",
"cadetblue": "#5f9ea0",
"chartreuse": "#7fff00",
"chocolate": "#d2691e",
"coral": "#ff7f50",
"cornflowerblue": "#6495ed",
"cornsilk": "#fff8dc",
"crimson": "#dc143c",
"cyan": "#00ffff",
"darkblue": "#00008b",
"darkcyan": "#008b8b",
"darkgoldenrod": "#b8860b",
"darkgray": "#a9a9a9",
"darkgrey": "#a9a9a9",
"darkgreen": "#006400",
"darkkhaki": "#bdb76b",
"darkmagenta": "#8b008b",
"darkolivegreen": "#556b2f",
"darkorange": "#ff8c00",
"darkorchid": "#9932cc",
"darkred": "#8b0000",
"darksalmon": "#e9967a",
"darkseagreen": "#8fbc8f",
"darkslateblue": "#483d8b",
"darkslategray": "#2f4f4f",
"darkslategrey": "#2f4f4f",
"darkturquoise": "#00ced1",
"darkviolet": "#9400d3",
"deeppink": "#ff1493",
"deepskyblue": "#00bfff",
"dimgray": "#696969",
"dimgrey": "#696969",
"dodgerblue": "#1e90ff",
"firebrick": "#b22222",
"floralwhite": "#fffaf0",
"forestgreen": "#228b22",
"fuchsia": "#ff00ff",
"gainsboro": "#dcdcdc",
"ghostwhite": "#f8f8ff",
"gold": "#ffd700",
"goldenrod": "#daa520",
"gray": "#808080",
"grey": "#808080",
"green": "#008000",
"greenyellow": "#adff2f",
"honeydew": "#f0fff0",
"hotpink": "#ff69b4",
"indianred": "#cd5c5c",
"indigo": "#4b0082",
"ivory": "#fffff0",
"khaki": "#f0e68c",
"lavender": "#e6e6fa",
"lavenderblush": "#fff0f5",
"lawngreen": "#7cfc00",
"lemonchiffon": "#fffacd",
"lightblue": "#add8e6",
"lightcoral": "#f08080",
"lightcyan": "#e0ffff",
"lightgoldenrodyellow": "#fafad2",
"lightgreen": "#90ee90",
"lightgray": "#d3d3d3",
"lightgrey": "#d3d3d3",
"lightpink": "#ffb6c1",
"lightsalmon": "#ffa07a",
"lightseagreen": "#20b2aa",
"lightskyblue": "#87cefa",
"lightslategray": "#778899",
"lightslategrey": "#778899",
"lightsteelblue": "#b0c4de",
"lightyellow": "#ffffe0",
"lime": "#00ff00",
"limegreen": "#32cd32",
"linen": "#faf0e6",
"magenta": "#ff00ff",
"maroon": "#800000",
"mediumaquamarine": "#66cdaa",
"mediumblue": "#0000cd",
"mediumorchid": "#ba55d3",
"mediumpurple": "#9370db",
"mediumseagreen": "#3cb371",
"mediumslateblue": "#7b68ee",
"mediumspringgreen": "#00fa9a",
"mediumturquoise": "#48d1cc",
"mediumvioletred": "#c71585",
"midnightblue": "#191970",
"mintcream": "#f5fffa",
"mistyrose": "#ffe4e1",
"moccasin": "#ffe4b5",
"navajowhite": "#ffdead",
"navy": "#000080",
"oldlace": "#fdf5e6",
"olive": "#808000",
"olivedrab": "#6b8e23",
"orange": "#ffa500",
"orangered": "#ff4500",
"orchid": "#da70d6",
"palegoldenrod": "#eee8aa",
"palegreen": "#98fb98",
"paleturquoise": "#afeeee",
"palevioletred": "#db7093",
"papayawhip": "#ffefd5",
"peachpuff": "#ffdab9",
"peru": "#cd853f",
"pink": "#ffc0cb",
"plum": "#dda0dd",
"powderblue": "#b0e0e6",
"purple": "#800080",
"rebeccapurple": "#663399",
"red": "#ff0000",
"rosybrown": "#bc8f8f",
"royalblue": "#4169e1",
"saddlebrown": "#8b4513",
"salmon": "#fa8072",
"sandybrown": "#f4a460",
"seagreen": "#2e8b57",
"seashell": "#fff5ee",
"sienna": "#a0522d",
"silver": "#c0c0c0",
"skyblue": "#87ceeb",
"slateblue": "#6a5acd",
"slategray": "#708090",
"slategrey": "#708090",
"snow": "#fffafa",
"springgreen": "#00ff7f",
"steelblue": "#4682b4",
"tan": "#d2b48c",
"teal": "#008080",
"thistle": "#d8bfd8",
"tomato": "#ff6347",
"turquoise": "#40e0d0",
"violet": "#ee82ee",
"wheat": "#f5deb3",
"white": "#ffffff",
"whitesmoke": "#f5f5f5",
"yellow": "#ffff00",
"yellowgreen": "#9acd32",
Available colormap names:
'Accent', 'Accent_r', 'Blues', 'Blues_r', 'BrBG', 'BrBG_r', 'BuGn', 'BuGn_r', 'BuPu', 'BuPu_r', 'CMRmap', 'CMRmap_r', 'Dark2',
'Dark2_r', 'GnBu', 'GnBu_r', 'Greens', 'Greens_r', 'Greys', 'Greys_r', 'OrRd', 'OrRd_r', 'Oranges', 'Oranges_r', 'PRGn', 'PRGn_r',
'Paired', 'Paired_r', 'Pastel1', 'Pastel1_r', 'Pastel2', 'Pastel2_r', 'PiYG', 'PiYG_r', 'PuBu', 'PuBuGn', 'PuBuGn_r', 'PuBu_r', 'PuOr',
'PuOr_r', 'PuRd', 'PuRd_r', 'Purples', 'Purples_r', 'RdBu', 'RdBu_r', 'RdGy', 'RdGy_r', 'RdPu', 'RdPu_r', 'RdYlBu', 'RdYlBu_r', 'RdYlGn',
'RdYlGn_r', 'Reds', 'Reds_r', 'Set1', 'Set1_r', 'Set2', 'Set2_r', 'Set3', 'Set3_r', 'Spectral', 'Spectral_r', 'Wistia', 'Wistia_r', 'YlGn', 'YlGnBu',
'YlGnBu_r', 'YlGn_r', 'YlOrBr', 'YlOrBr_r', 'YlOrRd', 'YlOrRd_r', 'afmhot', 'afmhot_r', 'autumn', 'autumn_r', 'binary', 'binary_r', 'bone',
'bone_r', 'brg', 'brg_r', 'bwr', 'bwr_r', 'cividis', 'cividis_r', 'cool', 'cool_r', 'coolwarm', 'coolwarm_r', 'copper', 'copper_r', 'cubehelix',
'cubehelix_r', 'flag', 'flag_r', 'gist_earth', 'gist_earth_r', 'gist_gray', 'gist_gray_r', 'gist_heat', 'gist_heat_r', 'gist_ncar', 'gist_ncar_r',
'gist_rainbow', 'gist_rainbow_r', 'gist_stern', 'gist_stern_r', 'gist_yarg', 'gist_yarg_r', 'gnuplot', 'gnuplot2', 'gnuplot2_r', 'gnuplot_r',
'gray', 'gray_r', 'hot', 'hot_r', 'hsv', 'hsv_r', 'inferno', 'inferno_r', 'jet', 'jet_r', 'magma', 'magma_r', 'nipy_spectral', 'nipy_spectral_r',
'ocean', 'ocean_r', 'pink', 'pink_r', 'plasma', 'plasma_r', 'prism', 'prism_r', 'rainbow', 'rainbow_r', 'seismic', 'seismic_r', 'spring',
'spring_r', 'summer', 'summer_r', 'tab10', 'tab10_r', 'tab20', 'tab20_r', 'tab20b', 'tab20b_r', 'tab20c', 'tab20c_r', 'terrain', 'terrain_r',
'turbo', 'turbo_r', 'twilight', 'twilight_r', 'twilight_shifted', 'twilight_shifted_r', 'viridis', 'viridis_r', 'winter', 'winter_r'
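The exact set of names depends on the installed matplotlib version, so it may be easier to list them yourself. A minimal sketch, assuming matplotlib is installed; every name ending in _r is the reversed version of the colormap without the suffix.

```python
import matplotlib.pyplot as plt

# Names of all colormaps registered in the installed matplotlib version.
names = plt.colormaps()
reversed_names = [name for name in names if name.endswith("_r")]
print(len(names), "colormaps,", len(reversed_names), "of them reversed variants")

# A colormap maps a number in [0, 1] to an RGBA tuple of floats in [0, 1].
hot = plt.get_cmap("hot")
r, g, b, a = hot(0.5)
print("hot(0.5) ->", (r, g, b, a))
```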
Source code:
import re

import imageio
import jieba
import matplotlib.pyplot as plt
from wordcloud import WordCloud


def gen_word_cloud(words, new_path, contour_color="navajowhite", colormap="hot"):
    """
    Generate a word cloud, display it, and save it to disk.
    :param words: string of space-separated words
    :param new_path: full path of the image to save
    :param contour_color: color of the mask outline
    :param colormap: name of the matplotlib colormap used to color the words
    :return: None
    """
    # Mask image: white areas are masked out, words are drawn everywhere else.
    image = imageio.imread('./data/img/img.png')
    # A font that contains Chinese glyphs (STKaiti on Windows).
    font = r'C:\Windows\Fonts\STKAITI.TTF'
    # height and width are ignored here because a mask is given.
    my_wordcloud = WordCloud(height=100, width=100, font_path=font, mask=image,
                             mode="RGB", background_color="white",
                             max_words=2000, max_font_size=300, min_font_size=10,
                             repeat=True, contour_color=contour_color, contour_width=3,
                             prefer_horizontal=0.6, relative_scaling="auto",
                             colormap=colormap).generate(words)
    plt.imshow(my_wordcloud)
    plt.axis("off")
    plt.show()
    my_wordcloud.to_file(new_path)


# Read the source text; the with block closes the file automatically.
with open('./data/a.txt', encoding='utf-8') as file:
    long_sentence = " ".join(file.readlines())
# Remove all digits.
long_sentence = re.sub(r"\d", "", long_sentence)
# Segment the Chinese text and keep only words longer than one character.
words = jieba.lcut(long_sentence)
new_words = [word for word in words if len(word) > 1]
# Each pair is [contour_color, colormap].
style = [["navajowhite", 'hot'], ["gold", "gist_heat"],
         ["turquoise", "twilight_shifted"], ["palevioletred", "nipy_spectral"]]
for i, (contour_color, colormap) in enumerate(style, start=1):
    gen_word_cloud(' '.join(new_words), './result/bbt' + str(i) + '.png',
                   contour_color=contour_color, colormap=colormap)
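Why segment with jieba and then join the words with spaces? A rough illustration, using only the default regexp quoted in the parameter list and Python's re module. The real process_text does more than this single step, and tokenize is a hypothetical helper written for this sketch.

```python
import re

# Default pattern as quoted in the regexp entry of the parameter list.
DEFAULT_REGEXP = r"\w[\w']+"


def tokenize(text, pattern=DEFAULT_REGEXP):
    """Split text into tokens with a regular expression (sketch only)."""
    return re.findall(pattern, text)


# English: single-letter words such as "I" and "a" do not match the pattern.
print(tokenize("I don't like a word cloud"))
# -> ["don't", 'like', 'word', 'cloud']

# Chinese without spaces: the whole sentence is matched as one token.
print(tokenize("提前祝各位节日快乐"))
# -> ['提前祝各位节日快乐']

# After segmentation and joining with spaces, each word is its own token.
print(tokenize("提前 祝 各位 节日 快乐"))
# -> ['提前', '各位', '节日', '快乐']
```

The single-character word is dropped by the pattern in the last call, which lines up with the script's own len(word) > 1 filter.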