TextRank4ZH Tutorial

邴富畅Pledge

Published 2024-08-09 07:12:42

Article link: https://blog.csdn.net/gitblog_00102/article/details/141042748

TextRank4ZH: automatically extracts keywords and summaries from Chinese text. Project repository: https://gitcode.com/gh_mirrors/te/TextRank4ZH

Project Overview

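TextRank4ZH is a keyword-extraction and summarization tool for Chinese text. It is based on the classic TextRank algorithm, which ranks words or sentences by running a PageRank-style computation over a graph built from the text, and it is aimed at data-mining and natural language processing tasks. The library is written in Python, is easy to install, and is compatible with multiple Python versions.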
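To make the idea behind TextRank concrete, here is a minimal pure-Python sketch of keyword ranking. It is not the library's code: the function names `build_cooccurrence_graph` and `textrank`, the damping factor of 0.85, the fixed iteration count, and the pre-segmented word list are all assumptions chosen for illustration. Words that appear close together are linked, and each word's score is repeatedly recomputed from the scores of its neighbors.

```python
from collections import defaultdict


def build_cooccurrence_graph(words, window=2):
    """Link each word to the words that follow it within `window` positions.

    With window=2 only directly adjacent words are linked.
    """
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def textrank(graph, damping=0.85, iterations=50):
    """Apply the PageRank update to an undirected, unweighted graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node, neighbors in graph.items():
            # Each neighbor shares its score equally among its own links.
            rank = sum(scores[nb] / len(graph[nb]) for nb in neighbors)
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


# Hypothetical, already-segmented input: "文本" is adjacent to every other word.
words = ["文本", "关键词", "文本", "摘要", "文本", "算法"]
scores = textrank(build_cooccurrence_graph(words))
for word, weight in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(word, round(weight, 4))
```

In this toy input the word that co-occurs with the most distinct neighbors ends up with the highest score, which is the intuition behind treating highly connected words as keywords.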

Quick Start

Installation

First, install the TextRank4ZH package with pip:

pip install textrank4zh
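After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The snippet below is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is an assumption introduced here, not part of TextRank4ZH.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("textrank4zh"))
```

If this prints `None`, pip most likely installed the package into a different Python environment than the one running the script.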

Basic Usage

The following simple example shows how to use TextRank4ZH for keyword extraction and summary generation:

from textrank4zh import TextRank4Keyword, TextRank4Sentence

text = """
TextRank算法可以用来从文本中提取关键词和摘要（重要的句子）。TextRank4ZH是针对中文文本的TextRank算法的python算法实现。
"""

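# Keyword extraction: build the word graph with a co-occurrence window of 2
tr4w = TextRank4Keyword()
tr4w.analyze(text=text, lower=True, window=2)
print('Keywords:')
# Take the top 5 keywords that are at least 2 characters long
for item in tr4w.get_keywords(5, word_min_len=2):
    print(item.word, item.weight)

# Summary generation: rank the sentences and keep the top 2
tr4s = TextRank4Sentence()
tr4s.analyze(text=text, lower=True)
print('Summary:')
for item in tr4s.get_key_sentences(num=2):
    print(item.index, item.weight, item.sentence)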
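Sentence ranking needs a measure of how related two sentences are. The sketch below shows the word-overlap similarity described in the original TextRank paper: the number of shared words divided by the sum of the logarithms of the two sentence lengths. Treat it as an illustration under that assumption; the function name `sentence_similarity` is hypothetical, and the library's own similarity function may differ in detail.

```python
import math


def sentence_similarity(words1, words2):
    """Shared-word count normalized by the log lengths of both sentences."""
    shared = len(set(words1) & set(words2))
    if shared == 0:
        return 0.0
    denominator = math.log(len(words1)) + math.log(len(words2))
    if denominator <= 0:
        # Both sentences contain a single word, so log(1) + log(1) == 0.
        return 0.0
    return shared / denominator


# Hypothetical, already-segmented sentences sharing two words.
s1 = ["算法", "提取", "关键词"]
s2 = ["提取", "关键词", "摘要"]
print(sentence_similarity(s1, s2))
```

These pairwise similarities become edge weights in a sentence graph, and the sentences with the highest graph scores are returned as the summary.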

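Use Cases and Best Practices

E-commerce Review Summarization

Suppose we have a set of e-commerce product reviews and want to extract a short summary from them to get a quick picture of how users rate the product. Here is an example: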

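from textrank4zh import TextRank4Sentence

# Create the TextRank4Sentence object
tr4s = TextRank4Sentence()

# Sample e-commerce product reviews (kept in Chinese, the language the library targets)
reviews = [
    "这个产品很好用，速度很快，效果很好。",
    "非常失望，根本没达到预期效果。",
    "这个产品价格很便宜，性价比很高。",
    "质量不错，值得购买。",
    "不推荐购买，质量很差。",
    "功能很强大，很好用。"
]

# analyze() replaces the previous result on every call, so calling it once per
# review would keep only the last review. Join the reviews and analyze them together.
text = '\n'.join(reviews)
tr4s.analyze(text=text, lower=True, source='no_stop_words')

# Extract the two highest-ranked sentences as the summary
summary_list = tr4s.get_key_sentences(num=2)

# Print the summary
for summary in summary_list:
    print(summary.sentence)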
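Because the sentences are ranked against each other, all reviews need to be in a single text before analysis. The following sketch wraps that preparation step in a reusable helper. The names `join_reviews` and `summarize_reviews` are hypothetical, and the choice to append "。" to reviews that lack closing punctuation is an assumption made so that each review is treated as at least one complete sentence.

```python
def join_reviews(reviews, terminators="。！？!?"):
    """Merge reviews into one text, ending each with sentence punctuation."""
    parts = []
    for review in reviews:
        review = review.strip()
        if not review:
            continue  # skip empty reviews
        if review[-1] not in terminators:
            review += "。"
        parts.append(review)
    return "\n".join(parts)


def summarize_reviews(reviews, num=2):
    """Rank all reviews together and return the top `num` sentences."""
    # Imported lazily so join_reviews stays usable without the package installed.
    from textrank4zh import TextRank4Sentence

    tr4s = TextRank4Sentence()
    tr4s.analyze(text=join_reviews(reviews), lower=True, source="no_stop_words")
    return [item.sentence for item in tr4s.get_key_sentences(num=num)]
```

With the package installed, `summarize_reviews(reviews, num=2)` returns the two highest-ranked sentences across the whole review set as a list of strings.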