Transformer: Attention Is All You Need, NIPS 2017

=============================================================

NLP Model Evolution:

Transformer: 6 stacked encoder layers, 6 stacked decoder layers

BERT: stacked encoder layers, Base 12 layers, Large 24 layers

GPT-1: 12 stacked decoder layers

GPT-2: 24 / 36 / 48 stacked decoder layers

GPT-3: 96 stacked decoder layers

=============================================================

Transformer Encode/Decoder Layer Blocks:

Feed Forward is just an FFNN/MLP, going back to the NNLM (Bengio, 2003);

Add & Norm is Layer Normalization (Jimmy Lei Ba, 2016);

the residual connection comes from ResNet (Kaiming He, 2015);

Self-Attention is a single network layer; Multi-Head just runs 8 such layers in parallel, which in code amounts to one numpy.reshape;

The most novel and slightly puzzling part is Multi-Head Attention: read the code first (a minimal sketch follows below), then go back to the paper.
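To make the "Multi-Head is essentially one reshape" point concrete, here is a minimal NumPy sketch of a single encoder layer: scaled dot-product attention split into 8 heads via reshape, then residual + Add & Norm, then the position-wise Feed Forward sub-layer. The weights are random, the parameter names are mine, and masking, dropout, and the learned LayerNorm gain/bias are omitted; it only illustrates the shapes, not the official TF/PyTorch implementations linked below.

import numpy as np

def layer_norm(x, eps=1e-6):
    # Layer Normalization (Ba et al., 2016): normalize over the feature dimension
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(x, Wq, Wk, Wv, Wo, h=8):
    # x: (seq_len, d_model). Project once; "multi-head" is just a reshape
    # that splits d_model into h heads of size d_k = d_model // h.
    seq_len, d_model = x.shape
    d_k = d_model // h
    Q = (x @ Wq).reshape(seq_len, h, d_k).transpose(1, 0, 2)   # (h, seq_len, d_k)
    K = (x @ Wk).reshape(seq_len, h, d_k).transpose(1, 0, 2)
    V = (x @ Wv).reshape(seq_len, h, d_k).transpose(1, 0, 2)
    # Scaled dot-product attention, applied to all heads at once
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)            # (h, seq_len, seq_len)
    out = softmax(scores) @ V                                   # (h, seq_len, d_k)
    # Concatenate the heads: another reshape, then the output projection
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

def feed_forward(x, W1, b1, W2, b2):
    # Position-wise FFN: two linear layers with ReLU, i.e. an ordinary MLP
    return np.maximum(0, x @ W1 + b1) @ W2 + b2

def encoder_layer(x, params):
    # Sub-layer 1: multi-head self-attention + residual + Add & Norm
    x = layer_norm(x + multi_head_self_attention(x, *params["attn"]))
    # Sub-layer 2: feed-forward + residual + Add & Norm
    x = layer_norm(x + feed_forward(x, *params["ffn"]))
    return x

# Toy example with the paper's sizes: d_model=512, h=8, d_ff=2048
d_model, d_ff, seq_len = 512, 2048, 10
rng = np.random.default_rng(0)
params = {
    "attn": [rng.normal(scale=0.02, size=(d_model, d_model)) for _ in range(4)],
    "ffn": [rng.normal(scale=0.02, size=(d_model, d_ff)), np.zeros(d_ff),
            rng.normal(scale=0.02, size=(d_ff, d_model)), np.zeros(d_model)],
}
x = rng.normal(size=(seq_len, d_model))    # token embeddings + positional encoding
print(encoder_layer(x, params).shape)      # (10, 512)

The reshape/transpose pair around the attention call is the whole multi-head trick; everything else in the layer is an MLP plus residual connections and LayerNorm.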

=============================================================
NIPS 2017  https://papers.nips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Attention Is All You Need   https://arxiv.org/abs/1706.03762
Transformer (Google AI blog), https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
-

Transformers from scratch, peterbloem.nl

-

The Illustrated Transformer, Jay Alammar (Visualizing machine learning one concept at a time)
[Chinese] https://blog.csdn.net/yujianmin1990/article/details/85221271, "BERT is all the rage but you still don't get the Transformer? This one article is enough" (Zhihu)

Layer Normalization  https://arxiv.org/pdf/1607.06450.pdf
Image Transformer    https://arxiv.org/pdf/1802.05751.pdf
Music Transformer    https://arxiv.org/pdf/1809.04281.pdf
--

Understanding Transformers: A Step-by-Step Math Example — Part 1,

https://blog.gopenai.com/understanding-transformers-a-step-by-step-math-example-part-1-a7809015150a

--

TensorFlow official implementation of Transformer:
        The implementation leverages tf.keras and makes sure it is compatible with TF 2.x.
          https://github.com/tensorflow/models/tree/master/official/nlp/transformer
1. Google's annotated Transformer, TF 2.x Keras: "Transformer model for language understanding" | TensorFlow Core
2. harvardnlp: The Annotated Transformer (PyTorch)
Version 2022: http://nlp.seas.harvard.edu/annotated-transformer/
 Chinese: https://daiwk.github.io/posts/platform-tensor-to-tensor.html
Lilian Weng: Attention? Attention! / The Transformer Family Version 2.0 | Lil'Log


-

chao-ji/tf-transformer (good documentation), https://github.com/chao-ji/tf-transformer
Create The Transformer With Tensorflow 2.0, https://trungtran.io/2019/04/29/create-the-transformer-with-tensorflow-2-0/
Transformer implementation in TensorFlow with notes, https://blog.varunajayasiri.com/ml/transformer.html   OK


Transformer/tensor2tensor Github   https://github.com/tensorflow/tensor2tensor/
Tensor2Tensor Colab  
https://colab.research.google.com/github/tensorflow/tensor2tensor/blob/master/tensor2tensor/notebooks/hello_t2t.ipynb
An analysis of Google's Tensor2Tensor system, Zhang Jinchao: "Why the 'Transformer' is so powerful: a full model-to-code analysis of Google's Tensor2Tensor system" (Tencent Cloud developer community)

Videos by Ashish Vaswani:
Stanford CS224N: NLP with Deep Learning | Winter 2019 | Lecture 14 – Transformers and Self-Attention
Ashish Vaswani & Anna Huang, Google  https://www.youtube.com/watch?v=5vcj8kSwBCY

-

RAAIS 2019 - Ashish Vaswani, Senior Research Scientist at Google AI
https://www.youtube.com/watch?v=bYmeuc5voUQ

Attention is all you need;  Łukasz Kaiser | Masterclass
https://www.youtube.com/watch?v=rBCqOTEfxvg

[Transformer] Attention Is All You Need | AISC Foundational
https://www.youtube.com/watch?v=S0KakHcj_rs

=============================================================



Illustrated Guide to Transformers- Step by Step Explanation
https://towardsdatascience.com/illustrated-guide-to-transformers-step-by-step-explanation-f74876522bc0

The three levels of understanding the Transformer: "Paper-reading notes (2): The three levels of understanding the Transformer -- Attention is all you need" (Jianshu)

Transformer of 2 stacked encoders and decoders (figure)

Leo Dirac is the grandson of Paul Dirac, one of the founders of quantum mechanics.

LSTM is dead. Long Live Transformers! (2019)

https://www.youtube.com/watch?v=S27pHKBEp30
The fine details of the Transformer: "Ramblings: the fine details of the Transformer" (Zhihu)
