Awesome Knowledge-Distillation
Different forms of knowledge
Knowledge from logits
- Distilling the knowledge in a neural network. Hinton et al. arXiv:1503.02531
- Learning from Noisy Labels with Distillation. Li, Yuncheng et al. ICCV 2017
- Training Deep Neural Networks in Generations: A More Tolerant Teacher Educates Better Students. arXiv:1805.05551
- Knowledge distillation by on-the-fly native ensemble. Lan, Xu et al. NIPS 2018
- Learning Metrics from Teachers: Compact Networks for Image Embedding. Yu, Lu et al. CVPR 2019
- Relational Knowledge Distillation. Park, Wonpyo et al. CVPR 2019
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer. Huang, Zehao and Wang, Naiyan. 2017
- On Knowledge Distillation from Complex Networks for Response Prediction. Arora, Siddhartha et al. NAACL 2019
- On the Efficacy of Knowledge Distillation. Cho, Jang Hyun and Hariharan, Bharath. arXiv:1910.01348. ICCV 2019
- [novel] Revisit Knowledge Distillation: a Teacher-free Framework. Yuan, Li et al. arXiv:1909.11723
- Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher. Mirzadeh et al. arXiv:1902.03393
- Ensemble Distribution Distillation. ICLR 2020
- Noisy Collaboration in Knowledge Distillation. ICLR 2020
- On Compressing U-net Using Knowledge Distillation. arXiv:1812.00249
- Distillation-Based Training for Multi-Exit Architectures. Phuong, Mary and Lampert, Christoph H. ICCV 2019
- Self-training with Noisy Student improves ImageNet classification. Xie, Qizhe et al. (Google) CVPR 2020
- Variational Student: Learning Compact and Sparser Networks in Knowledge Distillation Framework. arXiv:1910.12061
- Preparing Lessons: Improve Knowledge Distillation with Better Supervision. arXiv:1911.07471
- Adaptive Regularization of Labels. arXiv:1908.05474
- Positive-Unlabeled Compression on the Cloud. Xu, Yixing (HUAWEI) et al. NeurIPS 2019
- Snapshot Distillation: Teacher-Student Optimization in One Generation. Yang, Chenglin et al. CVPR 2019
- QUEST: Quantized embedding space for transferring knowledge. Jain, Himalaya et al. CVPR 2020 (preprint)
- Conditional teacher-student learning. Z. Meng et al. ICASSP 2019
- Subclass Distillation. Müller, Rafael et al. arXiv:2002.03936
- MarginDistillation: distillation for margin-based softmax. Svitov, David & Alyamkin, Sergey. arXiv:2003.02586
- An Embarrassingly Simple Approach for Knowledge Distillation. Gao, Mengya et al. MLR 2018
- Sequence-Level Knowledge Distillation. Kim, Yoon & Rush, Alexander M. arXiv:1606.07947
- Boosting Self-Supervised Learning via Knowledge Transfer. Noroozi, Mehdi et al. CVPR 2018
- Meta Pseudo Labels. Pham, Hieu et al. ICML 2020
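The classic logit-based recipe that opens this list (Hinton et al., arXiv:1503.02531) trains the student on a weighted mix of temperature-softened teacher probabilities and the hard labels. A minimal NumPy sketch of that loss follows; the function names and the hyperparameters `T=4.0` and `alpha=0.9` are illustrative choices, not taken from any particular paper's code.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax: higher T yields a softer distribution.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD loss: alpha * T^2 * KL(teacher || student at temperature T)
    plus (1 - alpha) * cross-entropy against the hard labels."""
    p_t = softmax(teacher_logits, T)   # soft targets from the teacher
    p_s = softmax(student_logits, T)   # student's softened predictions
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1)
    hard_p = softmax(student_logits)   # ordinary (T=1) softmax for the hard loss
    ce = -np.log(hard_p[np.arange(len(labels)), labels] + 1e-12)
    # The T^2 factor keeps soft-target gradients on the same scale as the hard ones.
    return float(np.mean(alpha * T**2 * kl + (1 - alpha) * ce))
```

When the student's logits match the teacher's exactly, the KL term vanishes and only the hard-label cross-entropy remains.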
Knowledge from intermediate layers
- Fitnets: Hints for thin deep nets. Romero, Adriana et al. arXiv:1412.6550
- Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. Zagoruyko et al. ICLR 2017
- Knowledge Projection for Effective Design of Thinner and Faster Deep Neural Networks. Zhang, Zhi et al. arXiv:1710.09505
- A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning. Yim, Junho et al. CVPR 2017
- Paraphrasing complex network: Network compression via factor transfer. Kim, Jangho et al. NIPS 2018
- Knowledge transfer with jacobian matching. ICML 2018
- Self-supervised knowledge distillation using singular value decomposition. Lee, Seung Hyun et al. ECCV 2018
- Variational Information Distillation for Knowledge Transfer. Ahn, Sungsoo et al. CVPR 2019
- Knowledge Distillation via Instance Relationship Graph. Liu, Yufan et al. CVPR 2019
- Knowledge Distillation via Route Constrained Optimization. Jin, Xiao et al. ICCV 2019
- Similarity-Preserving Knowledge Distillation. Tung, Frederick & Mori, Greg. ICCV 2019
- MEAL: Multi-Model Ensemble via Adversarial Learning. Shen, Zhiqiang et al. AAAI 2019