Re35: Paper Reading: ArgLegalSumm: Improving Abstractive Summarization of Legal Documents with Argument Mining

诸神缄默不语

Last modified: 2022-10-27 16:17:09


Category: Artificial Intelligence Study Notes
Tags: legalAI, text summarization, abstractive summarization, natural language processing, NLP

First published: 2022-10-27 16:16:12

Article link: https://blog.csdn.net/PolarisRisingWar/article/details/127552064


Paper title: ArgLegalSumm: Improving Abstractive Summarization of Legal Documents with Argument Mining
Paper download: https://aclanthology.org/2022.coling-1.540/
Official GitHub project: GitHub - EngSalem/arglegalsumm

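This is a COLING 2022 paper; the authors are from the University of Pittsburgh.
The paper targets abstractive summarization of legal documents. Its approach is to apply role labeling to sentences to identify the arguments, and then to generate the summary with a pretrained seq2seq model.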

(The appendix has not been written up yet.)

1. Motivation

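The stated motivation is to "address their argumentative nature" (I am not sure exactly what that means; the point is that the paper considers it important).
The paper therefore uses argument role labeling to extract argument roles from legal text.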

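Related topics:
argument mining: represents the argumentative structure of a text as a graph (containing argument roles and the relations between them)
extract argument units → classify the units' argument roles → detect the relations between them
Common categories in the general domain: claims, major claims, and premises
The IRC taxonomy for legal documents: Issues, Reasons, and Conclusions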
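
The graph view described above can be sketched as a small data structure. This is only an illustration, not code from the paper: the class names `IRCRole`, `ArgumentUnit`, and `ArgumentGraph`, and the relation name `"answers"`, are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class IRCRole(Enum):
    """The three roles of the IRC taxonomy."""
    ISSUE = "issue"
    REASON = "reason"
    CONCLUSION = "conclusion"


@dataclass
class ArgumentUnit:
    text: str
    role: IRCRole


@dataclass
class ArgumentGraph:
    """Argument units are nodes; relations are directed, labeled edges."""
    units: List[ArgumentUnit] = field(default_factory=list)
    # Each edge is (source index, target index, relation name).
    # The relation names are illustrative, not a fixed inventory.
    relations: List[Tuple[int, int, str]] = field(default_factory=list)

    def add_unit(self, text: str, role: IRCRole) -> int:
        """Add a node and return its index."""
        self.units.append(ArgumentUnit(text, role))
        return len(self.units) - 1

    def relate(self, src: int, dst: int, relation: str) -> None:
        """Add a directed edge between two existing nodes."""
        for idx in (src, dst):
            if not 0 <= idx < len(self.units):
                raise IndexError(f"no argument unit with index {idx}")
        self.relations.append((src, dst, relation))
```

The three pipeline steps map onto this structure: unit extraction produces the `text` fields, role classification fills in `role`, and relation detection adds the edges.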

Typical earlier ways of combining argument mining with summarization: extraction, or linearizing the argument graph into a text format

2. The ArgLegalSumm Method

[Figure]
(The two parts are decoupled.)

Special marker tokens are used at the sentence level.
The paper tests two marker granularities (2 markers & 6 markers):
[Figure]
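
A minimal sketch of sentence-level marking, assuming one open/close pair for the 2-marker setting and one pair per IRC role for the 6-marker setting. The marker strings and the function `mark_sentences` are hypothetical; the exact tokens used in the paper may differ.

```python
from typing import List, Optional

# Hypothetical marker strings (assumption, not taken from the paper's code).
BINARY_MARKERS = ("<IRC>", "</IRC>")  # 2-marker setting
FINE_MARKERS = {                      # 6-marker setting
    "issue": ("<Issue>", "</Issue>"),
    "reason": ("<Reason>", "</Reason>"),
    "conclusion": ("<Conclusion>", "</Conclusion>"),
}


def mark_sentences(
    sentences: List[str],
    roles: List[Optional[str]],
    fine_grained: bool = False,
) -> str:
    """Wrap argumentative sentences in marker tokens and join them.

    roles[i] is "issue", "reason", "conclusion", or None when the
    sentence is not argumentative.
    """
    if len(sentences) != len(roles):
        raise ValueError("sentences and roles must have the same length")
    marked = []
    for sentence, role in zip(sentences, roles):
        if role is None:
            marked.append(sentence)  # non-argumentative: left unchanged
            continue
        open_tok, close_tok = FINE_MARKERS[role] if fine_grained else BINARY_MARKERS
        marked.append(f"{open_tok} {sentence} {close_tok}")
    return " ".join(marked)
```

The returned string would then be the input to the seq2seq summarizer. Because the marking step only needs a role per sentence, the roles can come from any source, which is what keeps the two parts decoupled.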

Sentence-level classification uses contextualized embedding-based techniques: BERT, RoBERTa, and legalBERT (legalBERT was chosen in the end because it performed best).

3. Experiments

3.1 Dataset

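The data comes from the Canadian Legal Information Institute (CanLII).

Text: 1,262 legal case-summary pairs, split 8-1-1
The longest document has 26k words, so models that can encode long documents, such as Longformer, are used.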
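
For concreteness, an 8-1-1 split of 1,262 items can be computed as below. This is a sketch under assumptions: the shuffle, the seed, and the rounding are my own choices, and the paper's actual split procedure may differ.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_8_1_1(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle, then cut into three parts of roughly 80%, 10%, and 10%."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_first = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    n_second = len(shuffled) // 10
    first = shuffled[:n_first]
    second = shuffled[n_first:n_first + n_second]
    third = shuffled[n_first + n_second:]  # the remainder goes to the last part
    return first, second, third


first, second, third = split_8_1_1(range(1262))
print(len(first), len(second), len(third))  # prints: 1009 126 127
```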

Issues (legal questions which a court addressed in the document)
Reasons (pieces of text which indicate why the court reached the specific conclusions)
Conclusions (court’s decisions for the corresponding issues)

[Figure]
(These proportions show that arguments carry more weight in the summaries, which supports the paper's motivation; details omitted.)

3.2 Main Results

Argument role detection:
[Figure]

Summary generation:
[Figure]

3.3 Baselines

Extractive summarization model: unsupervised, BERT + K-Means (selecting the sentences closest to the centroids)¹
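
The centroid-based selection can be sketched in plain Python as follows. This is not the baseline's actual implementation: the sentence embeddings are assumed to be computed elsewhere (for example by BERT), and the helper names `kmeans` and `extract_summary` are hypothetical.

```python
import math
import random
from typing import List, Sequence

Vector = Sequence[float]


def _dist(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _mean(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans(vectors: List[Vector], k: int, iters: int = 20, seed: int = 0) -> List[List[float]]:
    """Plain Lloyd's k-means with random initial centroids; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        clusters: List[List[Vector]] = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: _dist(v, centroids[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [_mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


def extract_summary(sentences: List[str], embeddings: List[Vector], k: int) -> List[str]:
    """For each centroid pick the closest sentence; return picks in document order.

    Two centroids may select the same sentence, so fewer than k can be returned.
    """
    chosen = set()
    for centroid in kmeans(embeddings, k):
        chosen.add(min(range(len(sentences)), key=lambda i: _dist(embeddings[i], centroid)))
    return [sentences[i] for i in sorted(chosen)]
```

Here `k` controls the summary length, since at most one sentence is taken per cluster.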

Abstractive summarization models

Vanilla BART-Large
Vanilla LED-base

3.4 Experimental Setup

The paper considers two settings: manually annotated argument roles (oracle) and predicted argument role labels (predicted).

3.5 Model Analysis

[Figure]

Using the manually extracted argument sentences as the gold summary labels:
[Figure]

¹ When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset
