Paper: https://arxiv.org/abs/1902.08153
GitHub (PyTorch): zhutmost/lsq-net, an unofficial implementation of LSQ-Net, a neural network quantization framework
Basic quantization settings
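- Compute nodes use fake quantization (quantize, then dequantize):
- Both weights and activations use per-tensor quantization;
- The scaling factor (called the step size in the paper) is the quantization parameter and is a trainable variable;
- In addition, for inference engines such as TensorRT and MNN, weights are usually quantized per-channel and activations per-tensor. To speed up convergence of quantization-aware training, the (trainable) activation quantization parameters can be initialized with KL-divergence calibration or a PyTorch observer, while the weight quantization parameters are updated online with the absmax method.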
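
The settings above can be illustrated with a minimal sketch in plain Python (no PyTorch, so it only shows the arithmetic, not a trainable module). It is not the code from the repository linked above. All function names here (`qrange`, `fake_quantize`, `lsq_init_step`, `lsq_step_grad`, `absmax_per_channel_steps`) are hypothetical helpers introduced for illustration. The step-size initialization and the step-size gradient follow the formulas given in the LSQ paper; the symmetric signed integer range is an assumption.

```python
import math


def qrange(bits, signed=True):
    """Return (Qn, Qp) so that integer codes are clipped to [-Qn, Qp]."""
    if signed:
        return 2 ** (bits - 1), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def fake_quantize(values, step, bits=8, signed=True):
    """Per-tensor fake quantization: scale, clip, round, then rescale.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    qn, qp = qrange(bits, signed)
    out = []
    for v in values:
        code = round(min(max(v / step, -qn), qp))
        out.append(code * step)
    return out


def lsq_init_step(values, bits=8, signed=True):
    """Step-size initialization from the LSQ paper: 2 * mean(|v|) / sqrt(Qp)."""
    _, qp = qrange(bits, signed)
    mean_abs = sum(abs(v) for v in values) / len(values)
    return 2.0 * mean_abs / math.sqrt(qp)


def lsq_step_grad(values, step, bits=8, signed=True):
    """Per-element gradient of the fake-quantized value w.r.t. the step size.

    Uses the straight-through estimator for round(), as in the LSQ paper:
    round(v/s) - v/s inside the clipping range, -Qn or Qp outside it.
    """
    qn, qp = qrange(bits, signed)
    grads = []
    for v in values:
        r = v / step
        if r <= -qn:
            grads.append(float(-qn))
        elif r >= qp:
            grads.append(float(qp))
        else:
            grads.append(round(r) - r)
    return grads


def absmax_per_channel_steps(weight_rows, bits=8):
    """Per-channel absmax: one step size per output channel, max(|w|) / Qp.

    Each row is assumed to hold the weights of one output channel.
    """
    _, qp = qrange(bits, signed=True)
    # The small floor avoids a zero step size for an all-zero channel.
    return [max(max(abs(w) for w in row) / qp, 1e-12) for row in weight_rows]


# Usage: per-channel weight quantization with absmax step sizes.
weights = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
steps = absmax_per_channel_steps(weights, bits=8)
quantized = [fake_quantize(row, s) for row, s in zip(weights, steps)]
```

In a real training setup the step size would be a learnable parameter updated by the optimizer, and the paper additionally scales its gradient by 1/sqrt(N * Qp), where N is the number of elements being quantized; that scaling is omitted here for brevity.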