[Summary Generation] Boosting Factual Correctness of Abstractive Summarization with Knowledge Graph

joshuwang0810

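Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.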

Article link: https://blog.csdn.net/sarach_wong/article/details/112598320

2020
paper: https://arxiv.org/pdf/2003.08612.pdf

Focus: factual correctness

The paper proposes two models:

Fact-Aware Summarization model (FASUM): extracts factual relations from the article to build a knowledge graph and integrates it into the neural decoding process.
Factual Corrector model (FC): can modify abstractive summaries generated by any summarization model to improve factual correctness.
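
To make the knowledge-graph idea a little more concrete, here is a minimal sketch of holding extracted (subject, relation, object) triples as an adjacency map. The triples are made-up examples, `build_graph` is a hypothetical helper, and the relation extraction step is assumed to have been done already by some information-extraction tool. This is only one simple way to store such triples, not the paper's graph construction or its encoder.

```python
from collections import defaultdict


def build_graph(triples):
    """Map each subject to a list of (relation, object) edges."""
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        edge = (rel, obj)
        if edge not in graph[subj]:  # skip duplicate edges
            graph[subj].append(edge)
    return dict(graph)


# Made-up triples, for illustration only.
triples = [
    ("Alice", "works at", "Acme"),
    ("Acme", "is based in", "Paris"),
    ("Alice", "works at", "Acme"),
]
graph = build_graph(triples)
```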

Conclusions:

FASUM can generate summaries with higher factual correctness compared with state-of-the-art abstractive summarization systems.
FC improves the factual correctness of summaries generated by various models via only modifying several entity tokens.

For details, see the write-up by 香侬科技 (Shannon.AI); the notes below only record the points this post cares about.

Evaluation metrics & results:

Evaluation datasets: CNN/DailyMail and XSum

FC metric: measures factual correctness. The FactCC model is fine-tuned on xxx and then used as the evaluator. The overall results are as follows:
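Novel n-grams: Diab's paper notes that "less abstractive summaries are more factual consistent with the article", so the authors wanted to check whether their model would "boost factual correctness simply by copying more portions of the article". To this end, they compute the proportion of n-grams in the summary that do not appear in the article; a higher proportion means a more abstractive summary.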
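
The novel n-gram ratio described above can be sketched as follows. This is a minimal illustration under stated assumptions: tokens come from lowercased whitespace splitting, repeated n-grams in the summary are counted once per occurrence, and `novel_ngram_ratio` is a hypothetical helper name. The paper's exact tokenization and counting may differ.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_ratio(article, summary, n):
    """Percentage of summary n-grams that never occur in the article.

    Assumption: lowercased whitespace tokenization.
    """
    article_ngrams = set(ngrams(article.lower().split(), n))
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens
    novel = sum(1 for gram in summary_ngrams if gram not in article_ngrams)
    return 100.0 * novel / len(summary_ngrams)


# Toy sentences, for illustration only.
article = "the cat sat on the mat"
summary = "the cat slept on the mat"
unigram_ratio = novel_ngram_ratio(article, summary, 1)
bigram_ratio = novel_ngram_ratio(article, summary, 2)
```

In the toy example, 1 of 6 summary unigrams and 2 of 5 summary bigrams are novel, so the ratio rises with n, as expected for a summary that paraphrases rather than copies.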
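Relation Matching Rate (RMR): measures factual correctness. It turns the factual evaluation into the precision of the triples extracted from the summary. Concretely, a set of triples $R_s = \{(s_i, r_i, o_i)\}$ is extracted from the generated summary, and a set of triples $R_a$ is extracted from the original article in the same way. Comparing each $(s_i, r_i, o_i)$ against $R_a$ gives three possible outcomes: Correct hit ($C$), Wrong hit ($W$), and Miss ($M$), which covers all remaining cases. Based on these, RMR is defined as: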
$RMR_1=100 \times \frac{C}{C+W}$
$RMR_2=100 \times \frac{C}{C+W+M}$
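To assess the quality of the RMR metric, the paper computes the correlation coefficient $\gamma$ between human evaluation and RMR and obtains $\gamma=0.43$, which indicates an observable relationship between RMR and the human evaluation results.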
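
The two RMR formulas can be sketched in a few lines. The notes above only name the three outcomes, so the matching rule used here is an assumption: a triple is a Correct hit if it appears in $R_a$, a Wrong hit if it is absent but shares its (subject, relation) or (relation, object) pair with some article triple, and a Miss otherwise. The `rmr` helper and the example triples are hypothetical, and returning 0.0 for an empty denominator is also a choice made for this sketch.

```python
def rmr(summary_triples, article_triples):
    """Return (RMR_1, RMR_2) for summary triples checked against article triples."""
    article = set(article_triples)
    subj_rel = {(s, r) for s, r, _ in article}
    rel_obj = {(r, o) for _, r, o in article}

    correct = wrong = miss = 0
    for s, r, o in summary_triples:
        if (s, r, o) in article:
            correct += 1  # Correct hit (C)
        elif (s, r) in subj_rel or (r, o) in rel_obj:
            wrong += 1    # Wrong hit (W): assumed partial-match rule
        else:
            miss += 1     # Miss (M): everything else

    hits = correct + wrong
    total = hits + miss
    rmr1 = 100.0 * correct / hits if hits else 0.0
    rmr2 = 100.0 * correct / total if total else 0.0
    return rmr1, rmr2


# Made-up triples, for illustration only.
article_triples = [("Alice", "born in", "Paris"), ("Bob", "works at", "Acme")]
summary_triples = [
    ("Alice", "born in", "Paris"),   # C
    ("Alice", "born in", "London"),  # W
    ("Carol", "likes", "tea"),       # M
]
rmr1, rmr2 = rmr(summary_triples, article_triples)
```

With one outcome of each kind, $RMR_1 = 100 \times 1/2$ and $RMR_2 = 100 \times 1/3$, which shows how misses lower $RMR_2$ but leave $RMR_1$ untouched.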
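Natural Language Inference (NLI) models: measure factual correctness. A BERT-large model is fine-tuned on the MNLI dataset and outputs one of three labels: entailment, neutral, or contradiction. For this task, the article and the summary serve as the NLI model's premise and hypothesis, respectively, and the proportion of contradiction outputs is used as a proxy for factual correctness: the smaller the proportion, the less the generated summary conflicts with the article.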
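
The contradiction ratio above can be sketched as follows. This is only an illustration: `contradiction_ratio` is a hypothetical helper, and `toy_nli` is a crude placeholder rule standing in for the fine-tuned BERT-large classifier so that the example runs without any model.

```python
def contradiction_ratio(pairs, nli_predict):
    """Percentage of (article, summary) pairs labeled 'contradiction'.

    nli_predict(premise, hypothesis) must return one of
    'entailment', 'neutral', or 'contradiction'.
    """
    if not pairs:
        return 0.0
    labels = [nli_predict(article, summary) for article, summary in pairs]
    return 100.0 * labels.count("contradiction") / len(labels)


def toy_nli(premise, hypothesis):
    # Placeholder only: a real setup would call an NLI model here.
    if hypothesis in premise:
        return "entailment"
    if "not" in hypothesis.split():
        return "contradiction"
    return "neutral"


# Made-up pairs, for illustration only.
pairs = [
    ("the cat sat on the mat", "the cat sat"),
    ("the cat sat on the mat", "the cat did not sit"),
    ("the cat sat on the mat", "the dog barked"),
    ("it rained all day", "it rained"),
]
ratio = contradiction_ratio(pairs, toy_nli)
```

Passing the classifier in as a function keeps the metric itself independent of whichever NLI model is plugged in.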
Human Evaluation: three annotators give scores from 1 to 3 on two dimensions, factual correctness and informativeness. The results are as follows:

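To evaluate the FC component, the authors randomly sampled 100 summaries generated by BottomUp and UniLM, corrected them with FC, and compared the summaries before and after correction, much like the GSB (Good/Same/Bad) evaluation used in industry. The results are shown in the figure below and suggest that FC can boost factual correctness.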
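
A GSB-style tally of such a before/after comparison can be sketched like this. It assumes each sampled summary has a numeric factual-correctness judgment before and after correction; the `gsb` helper and the scores are made up for illustration and are not results from the paper.

```python
from collections import Counter


def gsb(before_scores, after_scores):
    """Count (good, same, bad) outcomes of the corrected summaries."""
    counts = Counter()
    for before, after in zip(before_scores, after_scores):
        if after > before:
            counts["good"] += 1   # correction improved the summary
        elif after < before:
            counts["bad"] += 1    # correction made it worse
        else:
            counts["same"] += 1   # no change in judgment
    return counts["good"], counts["same"], counts["bad"]


# Made-up scores, for illustration only.
before = [1, 2, 3, 2]
after = [2, 2, 2, 3]
good, same, bad = gsb(before, after)
```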
