Original Text 7
Self-attention, sometimes called intra-attention, is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence. Self-attention has been used successfully in a variety of tasks including reading comprehension, abstractive summarization, textual entailment and learning task-independent sentence representations.
Translation
Self-attention (sometimes called intra-attention) is a mechanism that computes a representation of a sequence by relating different positions within that single sequence. It has been applied successfully in a variety of tasks, including reading comprehension, abstractive summarization, textual entailment, and task-independent sentence representation learning.
Key Sentence Analysis
- Self-attention, sometimes called intra-attention, is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence.
[Analysis]
The core of this sentence is: Self-attention is an attention mechanism. The phrase sometimes called intra-attention, set off by the two commas, is a parenthetical that adds information about Self-attention; it is equivalent to a passive relative clause: which is sometimes called intra-attention. The present-participle phrase relating different positions of a single sequence is a postmodifier of attention mechanism; within it, the prepositional phrase of a single sequence postmodifies positions. The infinitive phrase in order to compute a representation of the sequence is an adverbial of purpose; within it, the prepositional phrase of the sequence postmodifies a representation.
[Reference Translation]
Self-attention (sometimes called intra-attention) is a mechanism that computes a representation of a sequence by relating different positions within that single sequence,
- Self-attention has been used successfully in a variety of tasks including reading comprehension, abstractive summarization, textual entailment and learning task-independent sentence representations.
[Analysis]
The core of the sentence is: Self-attention has been used. The adverb successfully that follows is an adverbial of degree; the prepositional phrase in a variety of tasks, indicating the range of application, is also an adverbial; including… introduces examples and postmodifies a variety of tasks. Within it, and links four parallel noun or gerund phrases.
[Reference Translation]
Self-attention has been applied successfully in a variety of tasks, including reading comprehension, abstractive summarization, textual entailment, and task-independent sentence representation learning.
Original Text 8
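The idea in Original Text 7, relating every position of a sequence to every other position in order to compute a new representation for each position, can be sketched in a few lines. The following is a minimal toy example in plain NumPy, not the paper's full architecture: it assumes, for brevity, that queries, keys, and values are all the raw input itself (identity projections), whereas the actual Transformer uses learned projection matrices and multiple heads.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a single sequence.

    x: (seq_len, d) array. Queries, keys, and values are all taken
    to be x itself (a simplifying assumption; the Transformer learns
    separate projections for each).
    """
    d = x.shape[-1]
    # Relate every position to every other position.
    scores = x @ x.T / np.sqrt(d)
    # Softmax over positions, computed stably.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of all positions:
    # the new representation of that position.
    return weights @ x

seq = np.random.rand(5, 8)   # 5 positions, 8-dimensional embeddings
out = self_attention(seq)
print(out.shape)             # one representation per position
```

Note how no recurrence is involved: every position attends to every other in a single matrix operation, which is exactly the contrast with sequence-aligned RNNs drawn in Original Text 9.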
End-to-end memory networks are based on a recurrent attention mechanism instead of sequence-aligned recurrence and have been shown to perform well on simple-language question answering and language modeling tasks.
Translation
End-to-end memory networks are based on a recurrent attention mechanism rather than sequence-aligned recurrence. They have been shown to perform well on simple-language question answering and language modeling tasks.
Key Sentence Analysis
[Analysis]
This sentence stands on its own as a paragraph. It contains two coordinated predicates, one in the simple-present passive (are based on) and one in the present-perfect passive (have been shown). Its core can be summarized as: subject (End-to-end memory networks) + predicate 1 (are based on) + object 1 (a recurrent attention mechanism) + coordinated predicate 2 (have been shown to perform well). The phrase instead of sequence-aligned recurrence expresses a contrast, meaning "rather than…". The prepositional phrase on simple-language question answering and language modeling tasks is an adverbial indicating the range of tasks; within it, and links two parallel gerund phrases, question answering and language modeling, which jointly modify tasks.
Original Text 9
To the best of our knowledge, however, the Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence-aligned RNNs or convolution. In the following sections, we will describe the Transformer, motivate self-attention and discuss its advantages over models such as [17, 18] and [9].
Translation
To the best of our knowledge, however, the Transformer is the first transduction model that relies entirely on self-attention to compute representations of its input and output, without using sequence-aligned RNNs or convolution. In the following sections, we will describe the Transformer architecture, motivate self-attention, and discuss its advantages over models such as [17, 18] and [9].
Key Sentence Analysis
- To the best of our knowledge, however, the Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence-aligned RNNs or convolution.
[Analysis]
The core of the sentence is: the Transformer is the first transduction model. The sentence-initial To the best of our knowledge is an adverbial meaning "as far as we know". The parenthetical however, set off by the two commas, is also an adverbial and expresses contrast; when translating, it should be moved to the front of the sentence. relying entirely on self-attention is a present-participle phrase postmodifying transduction model; the infinitive to compute representations… is an adverbial of purpose; the prepositional phrase of its input and output postmodifies representations; and the prepositional phrase without using sequence-aligned RNNs or convolution is a conditional adverbial modifying compute.
[Reference Translation]
To the best of our knowledge, however, the Transformer is the first transduction model that relies entirely on self-attention to compute representations of its input and output, without using sequence-aligned RNNs or convolution.
- In the following sections, we will describe the Transformer, motivate self-attention and discuss its advantages over models such as [17, 18] and [9].
[Analysis]
The sentence-initial prepositional phrase In the following sections is an adverbial indicating when and where the action takes place. The main clause that follows has a subject-verb-object structure, except that the subject we is followed by three coordinated verb-object phrases. In the third of these, the prepositional phrase over… postmodifies advantages; "advantages over…" means "advantages relative to…". such as… introduces examples of the preceding models.
[Reference Translation]
In the following sections, we will describe the Transformer architecture, motivate self-attention, and discuss its advantages over models such as [17, 18] and [9].