- Title: Deep Neural Network Approximation for Custom Hardware: Where We've Been, Where We're Going
- Year: 2019
- Venue: ACM Computing Surveys
- Institution: Imperial College London
1 abbreviations & references
2 abstract & introduction
This survey mainly covers:
- approximation methods for running high-performance neural networks on hardware
- roofline-model comparisons of different hardware platforms
- performance evaluation of each method
- directions for future development
DNN approximation algorithms fall into two main classes: quantisation and weight pruning
Performance evaluation metrics:
- accuracy
- compression ratio
- throughput: classifications processed per second
- latency
- energy efficiency: throughput per unit of power consumed
3 quantisation
3.1 fixed-point
block floating point (BFP) = dynamic fixed point: a block of values shares a single exponent, with per-value fixed-point mantissas
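As a minimal sketch of BFP (function name and block layout are assumptions, not from the survey): each block of values shares one exponent, derived from the block's largest magnitude, and mantissas are rounded to a fixed width.

```python
import numpy as np

def bfp_quantise(x, mantissa_bits=8, block_size=4):
    """Block-floating-point (dynamic fixed point) quantisation of a 1-D array:
    each block shares one exponent, chosen so the block's largest magnitude
    fits, and mantissas are rounded to `mantissa_bits` (sign included)."""
    out = np.empty_like(x, dtype=np.float64)
    half = 2 ** (mantissa_bits - 1)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        max_abs = np.max(np.abs(block))
        # Shared exponent e: smallest value with max_abs < 2**e (all-zero block -> 0)
        e = int(np.floor(np.log2(max_abs))) + 1 if max_abs > 0 else 0
        scale = 2.0 ** (e - (mantissa_bits - 1))
        # Round mantissas and clip to the signed mantissa range
        out[start:start + block_size] = (
            np.clip(np.round(block / scale), -half, half - 1) * scale
        )
    return out
```

Within a block, the rounding error is bounded by half the shared step size, so blocks of similar-magnitude values (common for DNN weights within a filter) quantise well.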
[111] Going deeper with embedded FPGA platform for convolutional neural network 2016 FPGA
formulated an optimisation problem minimising quantisation error with respect to changes in precision and binary-point location
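The spirit of that search can be shown with a small sketch (helper name and mean-squared-error objective are assumptions; [111]'s exact formulation may differ): for a fixed bit width, sweep the binary-point position and keep the one with the lowest quantisation error.

```python
import numpy as np

def best_binary_point(weights, bit_width=8):
    """Exhaustively search the binary-point (fractional-bit count) that
    minimises mean-squared quantisation error at a fixed bit width."""
    best_fl, best_err = None, np.inf
    for fl in range(bit_width):          # fl = number of fractional bits
        scale = 2.0 ** -fl
        qmin = -(2 ** (bit_width - 1))
        qmax = 2 ** (bit_width - 1) - 1
        # Round to the grid, clip to the representable signed range
        q = np.clip(np.round(weights / scale), qmin, qmax) * scale
        err = np.mean((weights - q) ** 2)
        if err < best_err:
            best_fl, best_err = fl, err
    return best_fl, best_err
```

More fractional bits shrink the rounding step but narrow the representable range, so the optimum balances rounding error against clipping error, per layer.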
[128] Fixed-point optimization of deep neural networks with adaptive step size retraining
treats the quantisation step size (resolution) as a trainable parameter adapted during retraining
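A minimal sketch of the idea, assuming a uniform quantiser and a straight-through-style gradient for the step size (function names and the update rule are illustrative, not the paper's exact method):

```python
import numpy as np

def quantise_step(x, delta, n_bits=4):
    """Uniform quantiser whose step size `delta` is the trainable parameter."""
    qmax = 2 ** (n_bits - 1) - 1
    return np.clip(np.round(x / delta), -qmax - 1, qmax) * delta

def step_grad(x, delta, n_bits=4):
    """Gradient of the mean squared quantisation error w.r.t. delta,
    treating the rounded integer levels k as locally constant (so that
    d q / d delta = k, a straight-through-style approximation)."""
    qmax = 2 ** (n_bits - 1) - 1
    k = np.clip(np.round(x / delta), -qmax - 1, qmax)
    q = k * delta
    return np.mean((q - x) * k)
```

A gradient step `delta -= lr * step_grad(x, delta)` then grows or shrinks the resolution to reduce quantisation error, alongside the usual weight updates.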
[69] Adaptive quantization of neural networks 2018 ICLR
investigated quantisation at a finer granularity than the aforementioned (down-to-layer-wise) methods. During retraining, networks adapt, with each filter allowed to assume an independent precision. Experiments with small-scale datasets and models showed that Adaptive Quantisation, when combined with pruning, achieves accuracies and compression ratios superior to those of binarised neural networks.
[161] DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients 2016 arXiv
DoReFa-Net supports arbitrary precisions for weights, activations, and gradients, from 32-bit fixed point down to binary
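DoReFa-Net's k-bit quantiser for values in [0, 1] is round(x · (2^k − 1)) / (2^k − 1), and weights are first squashed into [0, 1] via tanh. A sketch of the weight path (NumPy forward pass only; the paper pairs this with a straight-through gradient for training):

```python
import numpy as np

def quantize_k(x, k):
    """k-bit uniform quantiser for inputs in [0, 1]."""
    n = 2 ** k - 1
    return np.round(x * n) / n

def dorefa_weights(w, k):
    """DoReFa-style weight quantisation: squash weights into [0, 1]
    with tanh (normalised by the largest magnitude), quantise to k
    bits, then map back to [-1, 1]."""
    t = np.tanh(w)
    x = t / (2 * np.max(np.abs(t))) + 0.5
    return 2 * quantize_k(x, k) - 1
```

With k = 1 this collapses to binary weights in {−1, +1}; larger k recovers progressively finer fixed-point grids.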