Jay Alammar explains Attention, the Transformer, and BERT in an intuitive, straightforward way, with many vivid illustrations.
Attention
Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)
Transformer
BERT
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
Other resources