Post-training quantization methods
There are two post-training quantization approaches for TensorFlow Lite models:
"hybrid" post-training quantization and post-training integer quantization.
The "hybrid" post-training quantization approach reduces model size and latency in many cases, but it has the limitation of requiring floating-point computation, which may not be available on all hardware accelerators (e.g. Edge TPUs).
post-training integer quantization enables users to take an already-trained floating-point model and fully quantize it to only use 8-bit signed integers (i.e. `int8`). By leveraging this quantization scheme, we can get reasonable quantized model accuracy across many models without resorting to retraining a model with quantization-aware training. With this new tool, models will continue to be 4x smaller, but will see even greater CPU speed-ups. Fixed point hardware accelerators, such as Edge TPUs, will also be able to run these models.
How quantization works
1] Weights are quantized per-axis (i.e. per-channel) or per-tensor to int8 fixed point with the representable range [-127, 127], and the zero-point is fixed at quantized value 0 (symmetric quantization).
2] Activations and inputs are quantized per-tensor to int8 fixed point with the representable range [-128, 127], and the zero-point is some value in [-128, 127] computed from the formula below (asymmetric quantization).
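The symmetric per-channel weight scheme above can be sketched in a few lines of numpy. This is an illustrative helper (the function name is mine, not a TensorFlow Lite API): each output channel gets its own scale, the largest absolute weight maps to 127, and real zero is represented exactly by quantized value 0.

```python
import numpy as np

def quantize_weights_per_channel(w):
    """Symmetric per-channel int8 quantization to [-127, 127], zero-point = 0.

    w: float array of shape (out_channels, ...); one scale per output channel.
    """
    # Per-channel scale: the max absolute value of each channel maps to 127.
    max_abs = np.max(np.abs(w), axis=tuple(range(1, w.ndim)), keepdims=True)
    scale = max_abs / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

# Toy weight matrix with two output channels (rows).
w = np.array([[0.5, -1.0], [2.0, 0.25]], dtype=np.float32)
q, scale = quantize_weights_per_channel(w)
w_hat = q.astype(np.float32) * scale  # dequantize: real zero stays exactly 0.0
```

Because the zero-point is 0, dequantization is a single multiply per channel, which is what makes the symmetric scheme cheap for weights.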
Quantization parameters:
S (scale) = (Rmax - Rmin) / (Qmax - Qmin): how much real value each quantized unit represents.
Z (zero-point) = Qmax - Rmax / S: which quantized value represents real zero.
Note that weights and activations are quantized differently:
1. For weights, real zero is represented by quantized value 0; for activations, the zero-point is computed as Z = Qmax - Rmax / S.
2. The fixed-point ranges also differ: weights use [-127, 127], activations use [-128, 127].
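A worked example of the formulas above, in pure Python (the helper names are illustrative, not a TensorFlow Lite API): quantize an activation range R = [-0.5, 1.5] to int8 in [-128, 127].

```python
def quant_params(r_min, r_max, q_min=-128, q_max=127):
    # S: how much real value one quantized step represents.
    scale = (r_max - r_min) / (q_max - q_min)
    # Z: the quantized value that represents real zero.
    zero_point = int(round(q_max - r_max / scale))
    return scale, zero_point

def quantize(r, scale, zero_point, q_min=-128, q_max=127):
    return max(q_min, min(q_max, int(round(r / scale + zero_point))))

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

# Activation range [-0.5, 1.5]:
# S = 2.0 / 255, Z = round(127 - 1.5 / S) = round(-64.25) = -64
scale, zp = quant_params(-0.5, 1.5)
q_zero = quantize(0.0, scale, zp)  # real zero maps exactly to the zero-point
```

Mapping real zero exactly onto an integer value matters: zero-padding and ReLU clamping then introduce no quantization error.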
For choosing the quantization thresholds, this approach uses a simple method:
For weights, the actual min and max values determine the quantization parameters.
For activations, a moving average of the min and max values across batches determines the quantization parameters.
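The moving-average range tracking for activations might be sketched as follows (a minimal illustration in pure Python; the class name and the momentum value 0.9 are my own choices, not taken from TensorFlow Lite):

```python
class RangeTracker:
    """Tracks an exponential moving average of per-batch min/max,
    used to pick activation quantization thresholds."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.r_min = None
        self.r_max = None

    def update(self, batch):
        b_min, b_max = min(batch), max(batch)
        if self.r_min is None:  # first batch initializes the range
            self.r_min, self.r_max = b_min, b_max
        else:
            m = self.momentum
            self.r_min = m * self.r_min + (1 - m) * b_min
            self.r_max = m * self.r_max + (1 - m) * b_max

tracker = RangeTracker()
for batch in [[0.0, 1.0], [-0.2, 0.8], [0.1, 1.2]]:
    tracker.update(batch)
# tracker.r_min / tracker.r_max now hold the smoothed range
```

Smoothing across batches keeps a single outlier activation from blowing up the range and wasting quantization resolution on values that rarely occur.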
Generating a float tflite model / hybrid quantization and integer quantization
The following simple example shows how to generate a float tflite model (no quantization), a hybrid post-training quantized model, and a post-training integer quantized model.
Build a simple MNIST model
The TensorFlow version used is 2.0.0.
import tensorflow as tf
import numpy as np
print (tf.__version__)
(x_train, y_train),(x_test, y_test) = tf.keras.datasets.mnist.load_data()
#x_train, x_test = (x_train / 255.0), (x_test / 255.0)
x_train, x_test = (x_train / 255.0).astype(np.float32), (x_test / 255.0).astype(np.float32)
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1)
model.evaluate(x_test, y_test)
Save in SavedModel and h5 formats
model.save('saved_model')
model.save('keras_model.h5')
Interestingly, the same save function handles both; it presumably distinguishes the two formats by the path argument: a directory path produces the SavedModel format, while a filename ending in .h5 produces the HDF5 format.
How to create a tf.lite.TFLiteConverter
Before showing how to generate the quantized models, here is how to create a TFLiteConverter.
The Python API for converting TensorFlow models to TensorFlow Lite is tf.lite.TFLiteConverter. TFLiteConverter provides the following classmethods to convert a model based on the original model format:
TFLiteConverter.from_saved_model(): Converts SavedModel directories.
TFLiteConverter.from_keras_model(): Converts tf.keras models.
TFLiteConverter.from_concrete_functions(): Converts concrete functions.
The official documentation also covers example usages of the API and the API changes between TensorFlow 1 and TensorFlow 2.
A difference from 1.x is that from_keras_model_file("xxx.h5") is gone. Since 2.0 no longer has that function, to convert from an h5 file, first load the model and then create the converter:
keras_model = tf.keras.models.load_model('keras_model.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
Generate the float model
converter = tf.lite.TFLiteConverter.from_keras_model(model)
float_model = converter.convert()
open("float_model.tflite", "wb").write(float_model)
Generate the hybrid quantized model
converter_hybrid = tf.lite.TFLiteConverter.from_saved_model('saved_model')
converter_hybrid.optimizations = [tf.lite.Optimize.DEFAULT]