Word segmentation / keyword extraction
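import jieba
import jieba.analyse
# content: the input text (str), defined elsewhere
seg = jieba.cut(content)  # generator of tokens
jieba.analyse.set_stop_words('stopword.txt')  # words to exclude from keywords
# Top 20 keywords as (word, weight) pairs; allowPOS=() disables POS filtering
keyWord = jieba.analyse.extract_tags(
    '|'.join(seg), topK=20, withWeight=True, allowPOS=())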
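By default, jieba's extract_tags ranks words by TF-IDF. The following is only a minimal sketch of that scoring idea under stated assumptions, not jieba's actual implementation: the function name rank_keywords, the sample tokens, and the IDF values are all hypothetical and chosen for illustration.

```python
from collections import Counter


def rank_keywords(tokens, idf, top_k=20, default_idf=1.0):
    """Rank tokens by term frequency times inverse document frequency.

    Hypothetical helper for illustration; not part of jieba.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return []
    # Score = (count / total tokens) * IDF; unknown words get default_idf.
    scores = {w: (c / total) * idf.get(w, default_idf)
              for w, c in counts.items()}
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Made-up tokens and IDF values, for illustration only.
tokens = ["北京", "天安门", "北京", "的", "的", "的"]
idf = {"北京": 4.0, "天安门": 9.0, "的": 0.1}
result = rank_keywords(tokens, idf, top_k=2)
print(result)
```

With these made-up values, the frequent but uninformative 的 drops out of the top two, which is the effect a low IDF weight (or a stop-word list) is meant to have.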
Part-of-speech tagging
>>> import jieba.posseg as pseg
>>> words = pseg.cut("我爱北京天安门")
>>> for w in words:
...     print(w.word, w.flag)
...
我 r
爱 v
北京 ns
天安门 ns