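Since BERT was released, BERT-based improvements have kept appearing across NLP. This list is a summary taken from the nlp_paper repository on GitHub.
Below is my reading plan for the BERT family of papers. Where I have already written notes on a paper, I link to them directly.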
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - (NAACL 2019)
- ERNIE 2.0: A Continual Pre-training Framework for Language Understanding - (arXiv 2019)
- StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding - (arXiv 2019)
- RoBERTa: A Robustly Optimized BERT Pretraining Approach - (arXiv 2019)
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations - (arXiv 2019)
- Multi-Task Deep Neural Networks for Natural Language Understanding - (arXiv 2019)
- What does BERT learn about the structure of language? (ACL2019)
- Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned (ACL2019) [github]
- Open Sesame: Getting Inside BERT’s Linguistic Knowledge (ACL2019 WS)
- Analyzing the Structure of Attention in a Transformer Language Model (ACL2019 WS)
- What Does BERT Look At? An Analysis of BERT’s Attention (ACL2019 WS)
- Do Attention Heads in BERT Track Syntactic Dependencies?
- Blackbox meets blackbox: Representational Similarity and Stability Analysis of Neural Language Models and Brains (ACL2019 WS)
- Inducing Syntactic Trees from BERT Representations (ACL2019 WS)
- A Multiscale Visualization of Attention in the Transformer Model (ACL2019 Demo)
- Visualizing and Measuring the Geometry of BERT
- How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings (EMNLP2019)
- Are Sixteen Heads Really Better than One? (NeurIPS2019)
- On the Validity of Self-Attention as Explanation in Transformer Models
- Visualizing and Understanding the Effectiveness of BERT (EMNLP2019)
- Attention Interpretability Across NLP Tasks
- Revealing the Dark Secrets of BERT (EMNLP2019)
- Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs (EMNLP2019)
- The Bottom-up Evolution of Representations in the Transformer: A Study with Machine Translation and Language Modeling Objectives (EMNLP2019)
- A Primer in BERTology: What we know about how BERT works
- Do NLP Models Know Numbers? Probing Numeracy in Embeddings (EMNLP2019)
- How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations (CIKM2019)
- Whatcha lookin’ at? DeepLIFTing BERT’s Attention in Question Answering
- What does BERT Learn from Multiple-Choice Reading Comprehension Datasets?
- Calibration of Pre-trained Transformers
- exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models [github]
- MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices [github]