Keyword extraction from English text with Python

A detailed tutorial on English keyword extraction in Python:

https://opensourcelibs.com/lib/pytextrank
# To install from PyPI (if the download is slow, add a mirror with -i):
python3 -m pip install pytextrank
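# If the language model downloads slowly, copy the download URL shown in the terminal and fetch the file locally,
# then move the wheel into your working directory and pip install it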
python3 -m spacy download en_core_web_sm

# If you work directly from this Git repo, be sure to install the dependencies as well:
python3 -m pip install -r requirements.txt

# Alternatively, to install dependencies using conda:
conda env create -f environment.yml
conda activate pytextrank

Then use the library in a simple use case:

import spacy
import pytextrank

# example text
text = "Compatibility of systems of linear constraints over the set of natural numbers. Criteria of compatibility of a system of linear Diophantine equations, strict inequations, and nonstrict inequations are considered. Upper bounds for components of a minimal set of solutions and algorithms of construction of minimal generating sets of solutions for all types of systems are given. These criteria and the corresponding algorithms for constructing a minimal supporting set of solutions can be used in solving all the considered types systems and systems of mixed types."

# load a spaCy model, depending on language, scale, etc.
nlp = spacy.load("en_core_web_sm")

# add PyTextRank to the spaCy pipeline
nlp.add_pipe("textrank")
doc = nlp(text)

# examine the top-ranked phrases in the document
for phrase in doc._.phrases:
    # the keyword phrase
    print(phrase.text)
    # rank score (weight) and occurrence count
    print(phrase.rank, phrase.count)
    # list of the chunks where the phrase occurs
    print(phrase.chunks)
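
The loop above prints every phrase, which can be a long list for a real document. As a minimal sketch of trimming the output to the strongest keywords, the helper below keeps only the top few phrases above a rank threshold. The name `top_keywords`, its `limit` and `min_rank` parameters, and the `FakePhrase` stand-in with its made-up values are all assumptions for illustration and are not part of PyTextRank; the helper relies only on the `text`, `rank`, and `count` attributes used above.

```python
from collections import namedtuple


def top_keywords(phrases, limit=5, min_rank=0.0):
    """Return (text, rank, count) tuples for the highest-ranked phrases.

    Hypothetical helper, not part of PyTextRank.
    """
    # Sort by rank, highest first, so the result does not depend on input order
    ranked = sorted(phrases, key=lambda p: p.rank, reverse=True)
    # Keep only phrases at or above the threshold, then cut to `limit`
    kept = [p for p in ranked if p.rank >= min_rank]
    return [(p.text, p.rank, p.count) for p in kept[:limit]]


# Stand-in objects with made-up values so the sketch runs without spaCy
FakePhrase = namedtuple("FakePhrase", ["text", "rank", "count"])
sample = [
    FakePhrase("linear constraints", 0.08, 1),
    FakePhrase("minimal generating sets", 0.12, 1),
    FakePhrase("systems", 0.05, 4),
]

for text, rank, count in top_keywords(sample, limit=2):
    print(f"{rank:.2f}  {count}  {text}")
```

With the pipeline from the example above, the same helper could be called as `top_keywords(doc._.phrases, limit=10)`.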

Related reference links:
This one makes good use of print:
https://blog.csdn.net/Cocktail_py/article/details/113339568
This one has a fairly complete explanation:
https://blog.csdn.net/make_progress/article/details/116943867
