- Title: Deep Neural Network Approximation for Custom Hardware: Where We've Been, Where We're Going
- Year: 2019
- Venue: ACM Computing Surveys
- Institution: Imperial College London
1 abbreviations & references
2 abstract & introduction
This survey mainly covers:
- approximation methods for running high-performance neural networks on hardware
- roofline-model comparisons of different hardware platforms
- performance evaluation of each method
- directions for future development
DNN approximation algorithms fall into two main classes: quantisation and weight pruning
Performance evaluation metrics:
- accuracy
- compression ratio
- throughput: classifications processed per second
- latency
- energy efficiency: throughput per unit of power consumed
3 quantisation
3.1 fixed-point
block floating point (BFP) = dynamic fixed point: a block of values shares a single exponent, with per-value fixed-point mantissas
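As a minimal sketch of BFP (function name and block layout are assumptions, not from the survey): each block of values shares one exponent, derived from the block's largest magnitude, and mantissas are rounded to a fixed width.

```python
import numpy as np

def bfp_quantise(x, mantissa_bits=8, block_size=4):
    """Block-floating-point (dynamic fixed point) quantisation of a 1-D array:
    each block shares one exponent, chosen so the block's largest magnitude
    fits, and mantissas are rounded to `mantissa_bits` (sign included)."""
    out = np.empty_like(x, dtype=np.float64)
    half = 2 ** (mantissa_bits - 1)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        max_abs = np.max(np.abs(block))
        # Shared exponent e: smallest value with max_abs < 2**e (all-zero block -> 0)
        e = int(np.floor(np.log2(max_abs))) + 1 if max_abs > 0 else 0
        scale = 2.0 ** (e - (mantissa_bits - 1))
        # Round mantissas and clip to the signed mantissa range
        out[start:start + block_size] = (
            np.clip(np.round(block / scale), -half, half - 1) * scale
        )
    return out
```

Within a block, the rounding error is bounded by half the shared step size, so blocks of similar-magnitude values (common for DNN weights within a filter) quantise well.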
[111] Going deeper with embedded FPGA platform for convolutional neural network 2016 FPGA
formulated an optimisation problem minimising quantisation error with respect to changes in precision and binary-point location
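The spirit of that search can be shown with a small sketch (helper name and mean-squared-error objective are assumptions; [111]'s exact formulation may differ): for a fixed bit width, sweep the binary-point position and keep the one with the lowest quantisation error.

```python
import numpy as np

def best_binary_point(weights, bit_width=8):
    """Exhaustively search the binary-point (fractional-bit count) that
    minimises mean-squared quantisation error at a fixed bit width."""
    best_fl, best_err = None, np.inf
    for fl in range(bit_width):          # fl = number of fractional bits
        scale = 2.0 ** -fl
        qmin = -(2 ** (bit_width - 1))
        qmax = 2 ** (bit_width - 1) - 1
        # Round to the grid, clip to the representable signed range
        q = np.clip(np.round(weights / scale), qmin, qmax) * scale
        err = np.mean((weights - q) ** 2)
        if err < best_err:
            best_fl, best_err = fl, err
    return best_fl, best_err
```

More fractional bits shrink the rounding step but narrow the representable range, so the optimum balances rounding error against clipping error, per layer.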
[128] Fixed-point optimization of deep neural networks with adaptive step size retraining
treats the quantisation step size (resolution) as a trainable parameter adapted during retraining
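A minimal sketch of the idea, assuming a uniform quantiser and a straight-through-style gradient for the step size (function names and the update rule are illustrative, not the paper's exact method):

```python
import numpy as np

def quantise_step(x, delta, n_bits=4):
    """Uniform quantiser whose step size `delta` is the trainable parameter."""
    qmax = 2 ** (n_bits - 1) - 1
    return np.clip(np.round(x / delta), -qmax - 1, qmax) * delta

def step_grad(x, delta, n_bits=4):
    """Gradient of the mean squared quantisation error w.r.t. delta,
    treating the rounded integer levels k as locally constant (so that
    d q / d delta = k, a straight-through-style approximation)."""
    qmax = 2 ** (n_bits - 1) - 1
    k = np.clip(np.round(x / delta), -qmax - 1, qmax)
    q = k * delta
    return np.mean((q - x) * k)
```

A gradient step `delta -= lr * step_grad(x, delta)` then grows or shrinks the resolution to reduce quantisation error, alongside the usual weight updates.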
[69] Adaptive quantization of neural networks 2018 ICLR
investigated quantisation at a finer granularity than the aforementioned (down-to-layer-wise) methods. During retraining, networks adapt, with each filter allowed to assume an independent precision. Experiments with small-scale datasets and models showed that Adaptive Quantisation, when combined with pruning, achieves accuracies and compression ratios superior to those of binarised neural networks.
[161] DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients 2016 arXiv
DoReFa-Net supports arbitrary precisions for weights, activations, and gradients, from 32-bit fixed point down to binary
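DoReFa-Net's k-bit quantiser for values in [0, 1] is round(x · (2^k − 1)) / (2^k − 1), and weights are first squashed into [0, 1] via tanh. A sketch of the weight path (NumPy forward pass only; the paper pairs this with a straight-through gradient for training):

```python
import numpy as np

def quantize_k(x, k):
    """k-bit uniform quantiser for inputs in [0, 1]."""
    n = 2 ** k - 1
    return np.round(x * n) / n

def dorefa_weights(w, k):
    """DoReFa-style weight quantisation: squash weights into [0, 1]
    with tanh (normalised by the largest magnitude), quantise to k
    bits, then map back to [-1, 1]."""
    t = np.tanh(w)
    x = t / (2 * np.max(np.abs(t))) + 0.5
    return 2 * quantize_k(x, k) - 1
```

With k = 1 this collapses to binary weights in {−1, +1}; larger k recovers progressively finer fixed-point grids.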