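【Question】Given a long article, I want to grasp its main idea through keywords. How can I extract Chinese keywords with Python?
【Solution】Two entry-level algorithms for Chinese keyword extraction are TF-IDF and TextRank, both available in jieba.analyse. The code for each is as follows: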
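TF-IDF:
from jieba.analyse import extract_tags
# Read the whole file (UTF-8 encoding is assumed here)
with open('usercontent.txt', encoding='utf-8') as f:
    data = f.read()
# withWeight=True makes extract_tags return (keyword, weight) pairs
for keyword, weight in extract_tags(data, withWeight=True):
    print('%s %s' % (keyword, weight))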
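To show where TF-IDF weights come from, here is a minimal sketch of the scoring idea in plain Python. It is not jieba's implementation: jieba looks up IDF values in a prebuilt table, whereas this sketch computes them from a tiny sample corpus. The helper name tfidf_keywords, the sample documents, and the smoothed IDF formula are all assumptions made for illustration, and the text is assumed to be tokenized already (for Chinese, a segmenter such as jieba would do that step).

```python
import math
from collections import Counter

def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the tokens of docs[doc_index] by TF-IDF (hypothetical helper)."""
    n_docs = len(docs)
    # Document frequency: the number of documents each token appears in.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    counts = Counter(docs[doc_index])
    total = len(docs[doc_index])
    scores = {}
    for token, count in counts.items():
        tf = count / total  # term frequency within the chosen document
        # Smoothed IDF; one common variant, chosen here as an assumption.
        idf = math.log((1 + n_docs) / (1 + df[token])) + 1
        scores[token] = tf * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Sample pre-tokenized documents (made up for this sketch).
docs = [
    ["python", "keyword", "extraction", "python"],
    ["python", "web", "framework"],
    ["keyword", "search", "engine"],
]
for word, weight in tfidf_keywords(docs, 0):
    print(word, round(weight, 3))
```

A token scores high when it is frequent in the chosen document and rare in the others, which is why "extraction" outranks "keyword" in the first sample document even though both occur once.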
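TextRank:
from jieba.analyse import textrank
# Read the whole file (UTF-8 encoding is assumed here)
with open('usercontent.txt', encoding='utf-8') as f:
    data = f.read()
# withWeight=True makes textrank return (keyword, weight) pairs
for keyword, weight in textrank(data, withWeight=True):
    print('%s %s' % (keyword, weight))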
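For intuition about what TextRank does, the following is a rough sketch of the idea: link words that occur near each other, then run a PageRank-style iteration over that graph. It is a simplified illustration, not jieba's code, which also filters candidates by part of speech and differs in other details. The helper name textrank_keywords, the sample tokens, and the default window, damping, and iteration values are assumptions.

```python
from collections import defaultdict

def textrank_keywords(tokens, window=2, damping=0.85, iterations=30, top_k=3):
    """Rank tokens by a PageRank-style score on a co-occurrence graph (hypothetical helper)."""
    # Link every pair of distinct tokens that appear within `window` positions.
    weights = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window + 1]:
            if other != word:
                weights[word][other] += 1.0
                weights[other][word] += 1.0
    # Total edge weight leaving each node, used to normalize what it passes on.
    out_weight = {word: sum(nbrs.values()) for word, nbrs in weights.items()}
    scores = {word: 1.0 for word in weights}
    for _ in range(iterations):
        new_scores = {}
        for word, nbrs in weights.items():
            # Each neighbor passes on its score in proportion to the edge weight.
            incoming = sum(w / out_weight[nb] * scores[nb] for nb, w in nbrs.items())
            new_scores[word] = (1 - damping) + damping * incoming
        scores = new_scores
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Sample pre-tokenized text (made up for this sketch).
tokens = ["keyword", "extraction", "ranks", "keyword",
          "candidates", "and", "keyword", "graphs"]
for word, score in textrank_keywords(tokens):
    print(word, round(score, 3))
```

Unlike TF-IDF, this needs no reference corpus: a word ranks high because it co-occurs with many other well-connected words in the same text.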