A Categorized Summary of Model Acceleration and Compression Methods
• Low-Rank
• Pruning
• Quantization
• Knowledge Distillation
• Compact Network Design
Low-Rank
Previous low-rank based methods:
• SVD (see the rank-truncation sketch after this list)
- Zhang et al., “Accelerating Very Deep Convolutional Networks for Classification and Detection”. IEEE TPAMI 2016.
• CP decomposition
- Lebedev et al., “Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition”. ICLR 2015.
• Tucker decomposition
- Kim et al., “Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications”. ICLR 2016.
• Tensor Train Decomposition
- Novikov et al., “Tensorizing Neural Networks”. NIPS 2015.
• Block Term Decomposition
- Wang et al., “Accelerating Convolutional Neural Networks for Mobile Applications”. ACMMM 2016.
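To make the decompositions listed above concrete, here is a minimal sketch of the simplest case, truncated SVD of a dense layer (NumPy; the layer size and rank are illustrative, not taken from any of the cited papers): the weight matrix is replaced by two thin factors, trading a small approximation error for fewer parameters and multiply-adds.

```python
import numpy as np

def svd_compress(W, rank):
    """Approximate W (out_dim x in_dim) by a rank-`rank` factorization W ≈ U_r @ V_r,
    so one dense layer can be replaced by two thinner ones."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]   # (out_dim, rank), singular values folded in
    V_r = Vt[:rank, :]             # (rank, in_dim)
    return U_r, V_r

# Toy check: a 512x512 layer truncated to rank 64 stores 2*512*64 = 65,536
# parameters instead of 512*512 = 262,144 (a 4x reduction).
W = np.random.randn(512, 512)
U_r, V_r = svd_compress(W, rank=64)
rel_err = np.linalg.norm(W - U_r @ V_r) / np.linalg.norm(W)
print(U_r.shape, V_r.shape, round(rel_err, 3))
```

The CP, Tucker, Tensor Train, and Block Term variants above apply the same idea to the 4-D convolution kernel instead of a 2-D matrix; the choice of decomposition determines how the resulting cheaper layers are shaped.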
Recent low-rank based methods:
• Tensor Ring (TR) factorizations
- Wang et al., “Wide Compression: Tensor Ring Nets”. CVPR 2018.
• Block Term Decomposition For RNN
- Ye et al., “Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition”. CVPR 2018.
Why is low-rank no longer popular?
• Low-rank approximation is not effective for 1x1 convolutions
• 3x3 convolutions in bottleneck structures already have low computational complexity
• Depthwise convolutions and grouped 1x1 convolutions are already quite fast (see the FLOPs comparison below).
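A back-of-the-envelope MAC-count comparison (the layer sizes here are made up for illustration) shows why: in a depthwise-separable block the cost is dominated by the 1x1 pointwise convolution, which is already a plain matrix multiply with no spatial kernel left to factorize, so a low-rank decomposition has little room to exploit.

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulate count of a standard k x k convolution on an h x w output map."""
    return h * w * c_in * c_out * k * k

h = w = 56
c = 256
standard_3x3 = conv_macs(h, w, c, c, 3)      # dense 3x3 convolution
depthwise_3x3 = h * w * c * 3 * 3            # one 3x3 filter per channel
pointwise_1x1 = conv_macs(h, w, c, c, 1)     # 1x1 conv = per-pixel matrix multiply

print(f"standard 3x3:          {standard_3x3 / 1e6:.0f} M MACs")
print(f"depthwise + pointwise: {(depthwise_3x3 + pointwise_1x1) / 1e6:.0f} M MACs")
print(f"  of which pointwise:  {pointwise_1x1 / 1e6:.0f} M MACs")
```

For these sizes the separable block is roughly an order of magnitude cheaper than the dense 3x3 convolution, and well over 90% of its cost sits in the 1x1 layer.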
Pruning
Recent progress in pruning:
• Structured Pruning
– Yoon et al. “Combined Group and Exclusive Sparsity for Deep Neural Networks”. ICML 2017
– Ren et al. “SBNet: Sparse Blocks Network for Fast Inference”. CVPR 2018
• Filter Pruning (see the L1-norm sketch after this list)
– Luo et al. “ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression”. ICCV 2017
– Liu et al. “Learning Efficient Convolutional Networks through Network Slimming”. ICCV 2017
– He et al. “Channel Pruning for Accelerating Very Deep Neural Networks”. ICCV 2017
• Gradient Pruning
– Sun et al. “meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting”. ICML 2017
• Fine-grained Pruning in a Bayesian View
– Molchanov et al. “Variational Dropout Sparsifies Deep Neural Networks”. ICML 2017
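As a concrete illustration of filter pruning, below is a minimal PyTorch sketch that ranks the filters of a convolution by the L1 norm of their weights and keeps only the strongest ones. It is a generic sketch of the idea, not an implementation of any of the cited methods; the keep ratio is arbitrary and the adjustment of the following layer is omitted.

```python
import torch
import torch.nn as nn

def prune_filters_l1(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a narrower Conv2d that keeps only the filters with the largest L1 norm.
    The next layer's input channels must be shrunk to match, which is omitted here."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # Importance of each filter: L1 norm over its (in_channels, kH, kW) weights.
    importance = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep_idx = torch.argsort(importance, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
print(prune_filters_l1(conv, keep_ratio=0.25))   # Conv2d(64, 32, kernel_size=(3, 3), ...)
```

In practice the pruned network is fine-tuned to recover accuracy, and the importance criterion is what distinguishes the cited methods: ThiNet selects channels by next-layer reconstruction error, network slimming uses BN scaling factors, and He et al. use LASSO-based channel selection.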
Structured Pruning
Previous group pruning methods mainly use group sparsity alone, whereas Yoon et al. use both group sparsity and exclusive sparsity.
• Group Sparsity: Impose sparsity regularization on grouped features to prune columns of the weight matrix (see the sketch below).
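A minimal sketch of the group-sparsity term, assuming the groups are the columns of a fully connected weight matrix (the coefficient is illustrative):

```python
import torch

def group_sparsity_penalty(W: torch.Tensor, lam: float = 1e-4) -> torch.Tensor:
    """Group-lasso / (2,1)-norm penalty over the columns of W: lam * sum_j ||W[:, j]||_2.
    Because the L2 norm is taken per column, whole columns are driven to zero,
    so the corresponding input features can be pruned after training."""
    return lam * W.norm(p=2, dim=0).sum()

W = torch.randn(256, 512, requires_grad=True)
# total_loss = task_loss + group_sparsity_penalty(W)   # added to the training objective
print(group_sparsity_penalty(W).item())
```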