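jieba ships with mature TF-IDF and TextRank keyword-extraction modules that can be called directly.
Before using any of the functions below, remember to import the module first: import jieba.analyse
Then set the stop words (a UTF-8 text file with one word per line): jieba.analyse.set_stop_words('stop_words.txt')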
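1. Getting weights with jieba's built-in TF-IDF
# sentence is the string to process; topK is the number of keywords to return (default 20; 0 returns all of them); withWeight controls whether the return value includes the weights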
tfidf_values = jieba.analyse.tfidf(sentence, topK=0, withWeight=True)
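2. Getting weights with jieba's built-in TextRank
# sentence is the string to process; topK is the number of keywords to return (default 20; 0 returns all of them); withWeight controls whether the return value includes the weights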
textrank_values = jieba.analyse.textrank(sentence, topK=0, withWeight=True)