awesome-text-summarization
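GitHub repository: https://github.com/luopeixiang/awesome-text-summarization
This repository gives a brief introduction to text summarization, covering the task definition, summary types, evaluation methods, and more.
It also collects commonly used text summarization datasets and recent related papers,
making it a good starting point for anyone who wants to get up to speed on the field quickly.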
Table of Contents
- awesome-text-summarization
Basic Concept
Definition
Summarization is the task of producing a shorter version of one or several documents that preserves most of the input’s meaning.
Types of summarization
Extractive summaries (extracts) are produced by concatenating
several sentences taken exactly as they appear in the materials being
summarized.
Abstractive summaries (abstracts) are written to convey
the main information in the input and may reuse phrases or clauses
from it, but the summaries are overall expressed in the words of the
summary author.
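To make the extractive case concrete, here is a minimal sketch of a frequency-based extractive summarizer. It is not a method taken from this repository: the function name `extractive_summary`, the regex-based sentence splitting, and the average-word-frequency scoring are all assumptions chosen for illustration. The point it shows is that the output consists only of sentences copied verbatim from the input, kept in their original order.

```python
import re
from collections import Counter


def extractive_summary(text, k=2):
    """Return the k highest-scoring sentences of `text`, in original order.

    Hypothetical helper for illustration only: sentences are scored by the
    average corpus frequency of their words.
    """
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole input, lowercased.
    freq = Counter(re.findall(r"[a-z0-9']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z0-9']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Pick the indices of the top-k sentences, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:k]
    return " ".join(sentences[i] for i in sorted(ranked))


doc = "Cats sleep a lot. Cats eat fish. Dogs bark."
print(extractive_summary(doc, k=2))
```

An abstractive system, by contrast, would be free to produce a sentence such as "Cats mostly sleep and eat fish", which appears nowhere in the input.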
Summary Informativeness evaluation
- ROUGE-N: measures the N-gram units common between a particular summary and a collection of reference summaries, where N determines the N-gram’s length, e.g., ROUGE-1 for unigrams and ROUGE-2 for bigrams.
- ROUGE-L: computes a metric based on the Longest Common Subsequence (LCS).
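The two ROUGE variants above can be sketched in a few lines. This is a simplified illustration, not a reference implementation: it assumes a single reference summary, whitespace tokenization, and recall only, and the function names (`rouge_n_recall`, `rouge_l_recall`, `lcs_length`) are hypothetical. Published scores are normally computed with an established ROUGE package, which also handles stemming, multiple references, and precision/F-measure.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap / n-grams in the reference."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (clipping).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def rouge_l_recall(candidate, reference):
    """ROUGE-L recall: LCS length / number of reference tokens."""
    cand, ref = candidate.split(), reference.split()
    if not ref:
        return 0.0
    return lcs_length(cand, ref) / len(ref)


candidate = "the cat sat on the mat"
reference = "the cat is on the mat"
print(rouge_n_recall(candidate, reference, n=1))
print(rouge_n_recall(candidate, reference, n=2))
print(rouge_l_recall(candidate, reference))
```

For this pair, 5 of the 6 reference unigrams and 3 of the 5 reference bigrams are matched, and the LCS ("the cat on the mat") has length 5, so the three recalls are 5/6, 3/5, and 5/6.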
- BLEU: BLEU is basically calculated on the n-gram co-occurrence between the generated summary and the gold (You don’t need to specify the “n”