Notes on "Generating Summaries with Topic Templates and Structured Convolutional Decoders"

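These notes cover a summarization method built on a structured decoder: a CNN encoder, an LSTM that generates sentence vectors, and a topic model that makes generation more topic-sensitive. Experiments show a clear improvement on ROUGE-1.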


The paper uses a convolutional decoder, unlike the familiar RNN-based seq2seq models, and achieves better content coverage.

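1. Definitions

Introduction

Documents are organized into topically coherent text segments, and the content they discuss follows a specific structure.

Some topics tend to be discussed in a particular order.

(For example, a description of a species usually goes: type, region where it was found, habitat.)

Task

The same task as WikiSum (2018).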

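2. Model

The model consists of a structured decoder that is trained to predict a sequence of sentence topics that should be discussed in the summary, and to generate sentences based on those topics.

  • encoder: CNN
  • decoder:
    • document-level decoder first generates sentence vectors (LSTM)
      • At each time step $t$, an LSTM computes the hidden state $h_t$ from $h_{t-1}$ and $s_{t-1}$; an attention layer then outputs the vector $s_t$ that represents the sentence
    • sentence-level decoder is then applied to generate an actual sentence token-by-token (CNN)
      • This CNN takes combined word and position embeddings as input
      • The word representation $w_{ti}$ of each target word $y_{ti}$ is combined with a vector $e_i$ that encodes the word's position in the sentence: $w_{ti} = emb(y_{ti}) + e_i$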
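
The input representation $w_{ti} = emb(y_{ti}) + e_i$ can be illustrated with a minimal sketch in plain Python. The table sizes, the random initialization, and the names `make_table` and `token_inputs` are assumptions made for illustration only; in the actual model both embedding tables would be learned parameters.

```python
import random


def make_table(rows, dim, seed):
    """Build a small random embedding table (a stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def token_inputs(token_ids, word_emb, pos_emb):
    """Return w_ti = emb(y_ti) + e_i for every position i of one sentence."""
    return [
        [w + p for w, p in zip(word_emb[tok], pos_emb[i])]
        for i, tok in enumerate(token_ids)
    ]


# Hypothetical sizes, chosen only to keep the example small.
VOCAB_SIZE, MAX_LEN, DIM = 10, 6, 4
word_emb = make_table(VOCAB_SIZE, DIM, seed=0)
pos_emb = make_table(MAX_LEN, DIM, seed=1)

sentence = [3, 7, 3]  # token ids; token 3 appears at positions 0 and 2
inputs = token_inputs(sentence, word_emb, pos_emb)
```

Because the position vector is added to the word vector, the two occurrences of token 3 get different input representations, which is how a convolutional decoder (which has no recurrence) can tell positions apart.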
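Topic model

To make the document-level decoder more topic-aware,

each sentence is treated as a document, an LDA model is used to uncover a list of K latent topics, and a classifier is trained to tag each sentence with its most likely topic label.

(The assigned labels seem to come from keywords in the sentence.)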
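
As a rough sketch of the labeling step, the snippet below assigns each sentence the topic under which its words are most probable. It is not the authors' pipeline: the `TOPIC_WORD` table is hand-written and hypothetical, standing in for topic-word distributions that a trained LDA model would provide, and the scoring rule (sum of smoothed log-probabilities) is an assumption that replaces the trained classifier.

```python
import math

# Hypothetical topic-word probabilities; a real system would learn these with LDA.
TOPIC_WORD = {
    "taxonomy": {"species": 0.4, "genus": 0.3, "family": 0.3},
    "distribution": {"found": 0.4, "region": 0.3, "island": 0.3},
    "habitat": {"forest": 0.4, "lives": 0.3, "wetland": 0.3},
}
SMOOTHING = 1e-3  # probability given to words a topic has never seen


def most_likely_topic(sentence):
    """Return the topic with the highest log-likelihood for the sentence."""
    tokens = sentence.lower().split()
    scores = {
        topic: sum(math.log(probs.get(tok, SMOOTHING)) for tok in tokens)
        for topic, probs in TOPIC_WORD.items()
    }
    return max(scores, key=scores.get)


sentences = [
    "the species belongs to the genus",
    "it is found in the region",
    "it lives in forest and wetland",
]
labels = [most_likely_topic(s) for s in sentences]
```

The resulting label sequence plays the role of a topic template: an ordered list of sentence topics that the document-level decoder can be trained to predict.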

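3. Experiments

The authors run their experiments on WIKICATSUM, a dataset they constructed themselves. The results are as follows.

Automatic evaluation
  • The structured decoder brings a large improvement in ROUGE-1 (R1)
  • The variants that use topic labels (+T) gain about 2 points on average
  • In some domains the authors' model outperforms Google's Transformer sequence-to-sequence model; in others it does not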
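
For reference, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified version that assumes lowercased whitespace tokenization with no stemming or stopword handling, so its numbers will not match the official ROUGE toolkit exactly; the function name `rouge_1` and the example sentences are made up for illustration.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Simplified ROUGE-1: clipped unigram overlap as precision, recall and F1."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared unigram.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_1(
    "the bird lives in forests",
    "the bird is found in tropical forests",
)
```

Here four of the five candidate unigrams appear in the seven-word reference, so precision is 4/5 and recall is 4/7.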
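Human evaluation
  • Several questions are written for each article, and readers are checked on whether they can answer them after reading the summary
    • (This evaluates whether the summary retains the important information from the input paragraphs)
  • Three questions are asked to assess the overall content and linguistic quality of the summary
    • (Content, Fluency, Succinctness)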

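4. Open questions

What is the Wikipedia lead section? Which part of a Wikipedia article does it correspond to?

Answer: after looking it up, it appears to be the introductory section that comes before the table of contents.

How does the sentence-level decoder bring in the attention mechanism with a CNN?

For this I need to read the CNN-att paper the authors cite, orz.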
