Natural Language Processing
HeatDeath
Learn by doing!
A first experiment with jieba ("stammer") Chinese word segmentation
In [1]: import jieba
In [2]: a = jieba.cut("我来到了清华大学", cut_all=True)
In [3]: a
Out[3]: <generator object Tokenizer.cut at 0x000001E8E9CBFDB0>
In [4]: list(a)
Building prefix dict from the default dictionar…
Original post · 2017-04-08 01:38:06 · 5706 reads · 1 comment
An experiment using wordcloud, jieba, PIL, matplotlib, and numpy to segment text, count word frequencies, and draw a word cloud
#coding=utf-8
from wordcloud import WordCloud
import jieba
import PIL
import matplotlib.pyplot as plt
import numpy as np

def wordcloudplot(txt):
    path = r'ancient_style.ttf'
    # path = unicode(pat…
Original post · 2017-04-08 15:36:45 · 3101 reads · 1 comment