DL: Transformer: Translation and Commentary on "The Illustrated Transformer"
Contents
Paper: Translation and Commentary on "The Illustrated Transformer"
3. Bringing The Tensors Into The Picture
4.1. Self-Attention at a High Level
What are the “query”, “key”, and “value” vectors?
Step 6: Sum the weighted value vectors to produce the self-attention output
4.3. Matrix Calculation of Self-Attention
4.4. The Beast With Many Heads
4.6. What might this pattern look like?
6. The Final Linear and Softmax Layer: turning the output vector of floats into a word
How is the final output chosen? Two methods: greedy decoding and beam search
Paper: Translation and Commentary on "The Illustrated Transformer"
Author |