Soft weight-sharing for neural network compression
Abstract
- soft weight-sharing
- achieves both quantization and pruning in one simple (re-)training procedure
- the relation between compression and the minimum description length (MDL) principle
Intro
- directly related to compression: (variational) Bayesian inference and the minimum description length (MDL) principle
- soft weight sharing (Hinton, 1992) was originally proposed as a regularization method: it reduces network complexity to prevent overfitting
- the core idea is still quantization. The paper contrasts itself with Song Han's method, which prunes and quantizes stage by stage and then retrains; by introducing soft weight sharing, this paper achieves pruning and quantization jointly within a single retraining procedure
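The joint prune-and-quantize idea rests on retraining under a Gaussian mixture prior over the weights: weights are pulled toward the nearest cluster center (quantization), and a high-mass component pinned at zero absorbs prunable weights. A minimal NumPy sketch of that prior's negative log-likelihood (function and parameter names are mine, not from the paper):

```python
import numpy as np

def gmm_neg_log_prior(w, pi, mu, sigma):
    """Negative log-likelihood of weights under a Gaussian mixture prior.

    Each weight w_i is treated as i.i.d. under sum_j pi_j * N(w_i | mu_j, sigma_j^2).
    One component is typically fixed at mu = 0 with a large mixing proportion,
    so weights drawn into it can be pruned after retraining.
    """
    w = np.asarray(w, dtype=float)[:, None]        # (n_weights, 1)
    pi = np.asarray(pi, dtype=float)[None, :]      # (1, n_components)
    mu = np.asarray(mu, dtype=float)[None, :]
    sigma = np.asarray(sigma, dtype=float)[None, :]
    # component densities N(w_i | mu_j, sigma_j^2), broadcast over weights x components
    dens = np.exp(-0.5 * ((w - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    mix = (pi * dens).sum(axis=1)                  # mixture density per weight
    return -np.log(mix + 1e-12).sum()              # sum of negative log priors

# toy check: weights near the zero spike are "cheaper" than weights far from all components
pi = [0.9, 0.05, 0.05]          # zero component carries most of the mass
mu = [0.0, -1.0, 1.0]
sigma = [0.05, 0.05, 0.05]
cost_near_zero = gmm_neg_log_prior([0.01, -0.02], pi, mu, sigma)
cost_outliers = gmm_neg_log_prior([0.5, -0.6], pi, mu, sigma)
```

During retraining this term is added (scaled by a trade-off coefficient) to the task loss, and its gradient nudges each weight toward a mixture component, which is what makes pruning and quantization happen in one pass.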
Method
- the objective function to be optimized:
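As I understand the paper, the objective combines the usual error term with a complexity term given by a Gaussian mixture prior over the weights; $\tau$ trades off the two, and the $j=0$ component is fixed at $\mu_0 = 0$ with a large mixing proportion $\pi_0$ so that weights it captures can be pruned:

```latex
\mathcal{L}\big(\mathbf{w},\{\mu_j,\sigma_j,\pi_j\}\big)
  = \underbrace{-\log p(\mathcal{T}\mid \mathbf{w})}_{\mathcal{L}^{E}\ \text{(error)}}
    \;\underbrace{-\,\tau \log p\big(\mathbf{w},\{\mu_j,\sigma_j,\pi_j\}\big)}_{\tau\,\mathcal{L}^{C}\ \text{(complexity)}},
\qquad
p(\mathbf{w}) = \prod_i \sum_{j=0}^{J} \pi_j\,\mathcal{N}\!\big(w_i \mid \mu_j,\sigma_j^2\big)
```

Minimizing $\mathcal{L}^{C}$ jointly over $\mathbf{w}$ and the mixture parameters is what clusters the weights (quantization) and concentrates them at zero (pruning) during retraining.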