TensorFlow Lite Quantization Principles

1 : Principles

The quantization formula:

r = S (q − z)

Here:

  • r is the real value (usually float32)
  • q is its quantized representation as a B-bit integer (uint8, uint32, etc.)
  • S (float32) and z (uint) are the factors by which we scale and shift the number line. z is the quantized ‘zero-point’ which will always map back exactly to 0.f.

Deriving the parameters:

Consider a floating point variable with range (Xmin, Xmax) that needs to be quantized to the range (0, N_levels − 1), where N_levels = 256 for 8 bits of precision. We derive two parameters: the scale (∆) and the zero-point (z), which map the floating point values to integers. The scale specifies the step size of the quantizer, and floating point zero maps to the zero-point. The zero-point is an integer, ensuring that zero is quantized with no error. This is important so that common operations like zero padding do not introduce quantization error.
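The derivation above can be sketched in a few lines of Python (the function names are illustrative, not from any library): choose the scale and zero-point from the float range, then quantize and dequantize a value.

```python
# Sketch of the scale/zero-point derivation above (names are illustrative).
# Maps floats in (x_min, x_max) to integers in [0, n_levels - 1].

def choose_qparams(x_min, x_max, n_levels=256):
    # Widen the range to contain 0 so that 0.0 maps exactly to an integer.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (n_levels - 1)   # step size (the Delta above)
    zero_point = int(round(-x_min / scale))    # integer, so q(0.0) is exact
    return scale, zero_point

def quantize(r, scale, zero_point, n_levels=256):
    q = int(round(r / scale)) + zero_point
    return max(0, min(n_levels - 1, q))        # clamp to [0, 255]

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)            # r = S * (q - z)

scale, zp = choose_qparams(-1.0, 1.0)
q = quantize(0.5, scale, zp)
print(q, dequantize(q, scale, zp))
```

Note that `quantize(0.0, ...)` round-trips to exactly 0.0, which is the zero-padding property the paragraph above calls out.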

Quantized convolution:
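Following the second paper listed under "References", the core of a quantized convolution (or matmul) can be sketched by substituting r = S (q − z) into the multiply-accumulate:

```latex
% Each operand in the accumulation is expressed as r = S (q - z):
r_3 = \sum_i r_1^{(i)} r_2^{(i)}
    = \sum_i S_1 \left(q_1^{(i)} - z_1\right) S_2 \left(q_2^{(i)} - z_2\right)

% Writing the output in its own quantized form r_3 = S_3 (q_3 - z_3)
% and solving for the integer output q_3:
q_3 = z_3 + \frac{S_1 S_2}{S_3} \sum_i \left(q_1^{(i)} - z_1\right)\left(q_2^{(i)} - z_2\right)
```

The only non-integer quantity left is the multiplier S1 S2 / S3, which in practice is applied as a fixed-point multiply and shift, so the accumulation itself runs entirely in integer arithmetic.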

2 : Quantization Approaches

1 . Post Training Quantization

(1) . Weight only quantization

(2) . Quantizing weights and activations

2 . Quantization Aware Training

1 . Post Training Quantization

In many cases, it is desirable to reduce the model size by compressing weights and/or quantizing both weights and activations for faster inference, without having to re-train the model. Post-training quantization techniques are simpler to use and allow for quantization with limited data.

Strictly speaking, with Post Training Quantization the computation is still carried out in float rather than int, so it can only reduce the model size; it does not improve speed.

(1) . Weight only quantization

A simple approach is to only reduce the precision of the weights of the network to 8 bits from float. Since only the weights are quantized, this can be done without requiring any validation data.

In this mode, the model's weights are quantized (compressed) to uint8, but during computation the weights are dequantized back to float.
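A minimal sketch of this mode (illustrative NumPy, not the TFLite kernels themselves): the weights are stored as uint8, roughly a 4x size reduction versus float32, but are dequantized back to float before the actual multiply, so the arithmetic stays in float.

```python
import numpy as np

def quantize_weights(w):
    # Per-tensor affine quantization of the weights to uint8 for storage.
    w_min, w_max = min(w.min(), 0.0), max(w.max(), 0.0)
    scale = (w_max - w_min) / 255.0
    zero_point = int(round(-w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def matmul_with_quantized_weights(x, q, scale, zero_point):
    # Dequantize first: the compute itself is still float.
    w_approx = scale * (q.astype(np.float32) - zero_point)
    return x @ w_approx

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 3)).astype(np.float32)
x = rng.standard_normal((2, 4)).astype(np.float32)
q, s, z = quantize_weights(w)
print(np.abs(x @ w - matmul_with_quantized_weights(x, q, s, z)).max())
```

The printed value is the (small) accuracy cost of storing the weights in 8 bits; no calibration data is needed because only the weight range is used.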

(2) .Quantizing weights and activations

One can quantize a floating point model to 8-bit precision by calculating the quantizer parameters for all the quantities to be quantized. Since activations need to be quantized, calibration data is required in order to estimate the dynamic ranges of the activations.

In this mode, on top of weight quantization, kernels that support quantized execution quantize their inputs first, perform the activation computation in the quantized domain, and then dequantize back to float32; unsupported kernels simply compute in float32. This is somewhat faster than doing everything in float32.
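The calibration step can be sketched as follows (illustrative NumPy, with hypothetical helper names): the activation range comes from sample data, after which a supported kernel can accumulate in integers and dequantize only once at the end.

```python
import numpy as np

def qparams(x_min, x_max, n_levels=256):
    # Affine quantizer parameters, with 0.0 forced into the range.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (n_levels - 1)
    return scale, int(round(-x_min / scale))

def quant(x, scale, zp):
    return np.clip(np.round(x / scale) + zp, 0, 255).astype(np.int32)

rng = np.random.default_rng(1)
w = rng.standard_normal((4, 3)).astype(np.float32)
calibration = rng.standard_normal((100, 4)).astype(np.float32)  # sample inputs

# Calibration: the dynamic range of the activations comes from the data.
s_x, z_x = qparams(calibration.min(), calibration.max())
s_w, z_w = qparams(w.min(), w.max())

x = calibration[:2]
q_x, q_w = quant(x, s_x, z_x), quant(w, s_w, z_w)

# Integer-only accumulation, then a single float rescale by S1 * S2:
acc = (q_x - z_x) @ (q_w - z_w)
y = s_x * s_w * acc
print(np.abs(x @ w - y).max())
```

Inputs outside the calibrated range would be clipped, which is why the quality of the calibration data matters in this mode.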

2 . Quantization Aware Training

Quantization-aware training models quantization during training and can provide higher accuracies than post-training quantization schemes.

We model the effect of quantization using simulated quantization operations on both weights and activations. For the backward pass, we use the straight-through estimator to model quantization. Note that we use simulated quantized weights and activations for both forward and backward pass calculations.

In this mode, besides quantizing the weights, simulated quantization is performed during training to determine the max and min output of each op, so that the entire computation runs in uint8 not only during training but also at inference time. This not only compresses the model but also speeds up computation.
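The simulated ("fake") quantization mentioned above can be sketched like this (illustrative NumPy, not the TensorFlow ops): the forward pass rounds values onto the uint8 grid defined by the recorded (min, max), while the backward pass uses the straight-through estimator, i.e. it passes gradients through unchanged inside the range and zeroes them for clipped values, since round() itself has zero gradient almost everywhere.

```python
import numpy as np

def fake_quant(x, x_min, x_max, n_levels=256):
    # Forward pass: quantize then immediately dequantize, so the output is
    # float but constrained to the uint8 grid.
    scale = (max(x_max, 0.0) - min(x_min, 0.0)) / (n_levels - 1)
    zp = round(-min(x_min, 0.0) / scale)
    q = np.clip(np.round(x / scale) + zp, 0, n_levels - 1)
    return scale * (q - zp)

def fake_quant_grad(upstream, x, x_min, x_max):
    # Straight-through estimator: gradient ~= 1 inside [x_min, x_max],
    # 0 where the forward pass clipped.
    inside = (x >= x_min) & (x <= x_max)
    return upstream * inside

x = np.array([-1.2, -0.5, 0.0, 0.3, 0.9], dtype=np.float32)
xq = fake_quant(x, -1.0, 1.0)
print(xq)
print(fake_quant_grad(np.ones_like(x), x, -1.0, 1.0))
```

Because the forward pass already sees quantized values, the trained weights adapt to the precision loss, which is where the accuracy advantage over post-training schemes comes from.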

 

3 : Generating the .tflite File

https://blog.csdn.net/qq_16564093/article/details/78996563
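The linked post walks through the full workflow. As a rough sketch, the TF 1.x-era `tflite_convert` CLI could produce a fully quantized model from a frozen graph trained with fake-quant ops; the file names and the `input`/`output` tensor names below are placeholders for your own model.

```shell
# Convert a frozen GraphDef (trained with fake-quant ops) into a fully
# quantized .tflite file. Tensor names below are placeholders.
tflite_convert \
  --graph_def_file=frozen_model.pb \
  --output_file=model.tflite \
  --input_arrays=input \
  --output_arrays=output \
  --inference_type=QUANTIZED_UINT8 \
  --mean_values=128 \
  --std_dev_values=127
```

`--mean_values`/`--std_dev_values` describe how the uint8 input maps back to float; they must match the preprocessing used during training.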

4 : Caveats

(1) . Currently supported aware-quant ops: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/toco/graph_transformations/quantize.cc

(2) . Keras does not currently support aware-quant; that will have to wait until TensorFlow 2.0.

 

5 : References

(1) https://arxiv.org/pdf/1806.08342.pdf

(2) https://arxiv.org/pdf/1712.05877.pdf
