TensorFlow Model Quantization (2): Full Integer Quantization, Float16 Quantization, and Quantization-Aware Training

little student

Last modified 2022-03-28 19:37:09

Categories: Notes, Model Compression and Optimization. Tags: tensorflow, deep learning, neural networks, python

First published 2021-03-18 21:24:42

Article link: https://blog.csdn.net/weixin_43490422/article/details/114988717

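Contents

1 Full integer quantization
2 Quantization-aware training
Chapter navigation
- Previous: [TensorFlow Model Quantization (1): Quantization Methods and Dynamic Range Quantization](https://blog.csdn.net/weixin_43490422/article/details/114961890)
- Next: to be continued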

1 Full integer quantization

During model conversion, the weight tensors and the activation tensors are quantized from 32-bit floating point to 8-bit integers.
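
To make the idea concrete, here is a minimal sketch of the affine mapping that 8-bit quantization is generally based on: real ≈ scale × (q − zero_point). The helper names (`choose_params`, `quantize`, `dequantize`) and the sample values are illustrative assumptions, not TensorFlow Lite functions; the converter derives its own parameters internally.

```python
def choose_params(r_min, r_max, q_min=-128, q_max=127):
    """Pick a scale and zero point so that [r_min, r_max] maps onto [q_min, q_max]."""
    # The real range is widened to include 0.0 so that zero is encoded exactly.
    r_min, r_max = min(r_min, 0.0), max(r_max, 0.0)
    scale = (r_max - r_min) / (q_max - q_min)
    if scale == 0.0:          # degenerate range: every value is 0.0
        scale = 1.0
    zero_point = int(round(q_min - r_min / scale))
    return scale, zero_point


def quantize(r, scale, zero_point, q_min=-128, q_max=127):
    """Map a real value to the nearest representable integer, with clamping."""
    q = int(round(r / scale)) + zero_point
    return max(q_min, min(q_max, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate real value from its integer code."""
    return scale * (q - zero_point)


# Hypothetical example: values in [0, 1], like the normalized pixels used below.
scale, zero_point = choose_params(0.0, 1.0)
for r in (0.0, 0.25, 1.0):
    q = quantize(r, scale, zero_point)
    print(r, "->", q, "->", round(dequantize(q, scale, zero_point), 4))
```

The round trip loses at most half a quantization step per value, which is why a good estimate of each tensor's minimum and maximum matters.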

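1.1 Train a Keras model and convert it to TFLite format

import tensorflow as tf
from tensorflow import keras

# Load the data. Assumption: the original omits this step; the 28x28 input
# shape and the 1688 steps per epoch are consistent with MNIST.
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()

# Preprocess: scale pixel values to [0, 1]
train_images = train_images / 255.0
test_images = test_images / 255.0

# Build the model
model = keras.Sequential([
  keras.layers.InputLayer(input_shape=(28, 28)),
  keras.layers.Reshape(target_shape=(28, 28, 1)),
  keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
  keras.layers.MaxPooling2D(pool_size=(2, 2)),
  keras.layers.Flatten(),
  keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Compile and train
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(
  train_images, train_labels,
  epochs=5, validation_split=0.1,
)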

Epoch 1/5
1688/1688 [==============================] - 8s 2ms/step - loss: 0.5397 - accuracy: 0.8512 - val_loss: 0.1348 - val_accuracy: 0.9643
Epoch 2/5
1688/1688 [==============================] - 4s 2ms/step - loss: 0.1416 - accuracy: 0.9593 - val_loss: 0.0937 - val_accuracy: 0.9738
Epoch 3/5
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0920 - accuracy: 0.9720 - val_loss: 0.0759 - val_accuracy: 0.9797
Epoch 4/5
1688/1688 [==============================] - 4s 2ms/step - loss: 0.0780 - accuracy: 0.9774 - val_loss: 0.0735 - val_accuracy: 0.9805
Epoch 5/5
1688/1688 [==============================] - 4s 3ms/step - loss: 0.0620 - accuracy: 0.9820 - val_loss: 0.0651 - val_accuracy: 0.9828
<tensorflow.python.keras.callbacks.History at 0x7fced5573490>

# Convert the trained model to TFLite without quantization (float32 baseline)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
tflite_name = "tflite_model"
with open(tflite_name, "wb") as f:
    f.write(tflite_model)

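1.2 Float fallback quantization

To quantize variable data (the inputs, outputs, and intermediate activations), the converter needs a RepresentativeDataset that captures the distribution of that data, such as its minimum and maximum values.
About 100-500 samples taken from the training or validation set are enough.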

def representative_data_gen():
    for image in train_images[0:100,:,:]:
        yield[image.reshape(-1,train_images.shape[1],train_images.shape[2]).astype("float32")]
 
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen

tflite_model_quant = converter.convert()
# Save the converted model
FullInt_name = "quantify_Full.tflite"
with open(FullInt_name, "wb") as f:
    f.write(tflite_model_quant)

Run inference with the converted TFLite models to check the result:

import numpy as np

def evaluate(interpreter_path):
    # Load the model and allocate tensors
    interpreter = tf.lite.Interpreter(model_path=interpreter_path)
    interpreter.allocate_tensors()

    # Get the input and output tensor details
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    index = input_details[0]['index']
    shape = input_details[0]['shape']
    acc_count = 0
    image_count = test_images.shape[0]
    for i in range(image_count):
        # Feed one image at a time as float32 (the input tensor is still float32 here)
        interpreter.set_tensor(index, test_images[i].reshape(shape).astype("float32"))
        interpreter.invoke()
        output_data = interpreter.get_tensor(output_details[0]['index'])
        label = np.argmax(output_data)
        if label == test_labels[i]:
            acc_count += 1
    print("test_images accuracy is {:.2%}".format(acc_count / image_count))

evaluate(tflite_name)
evaluate(FullInt_name)

test_images accuracy is 98.02%
test_images accuracy is 97.94%

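The model size drops from 83640 bytes to 23840 bytes, a reduction of roughly 3.5x (close to the 4x expected when 32-bit weights become 8-bit), while accuracy falls by 0.08 percentage points (98.02% to 97.94%).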
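
The post does not show how the sizes were obtained; one simple way, sketched here, is to read the file sizes from disk. The helper name `size_report` is hypothetical, and the two file names are the ones written by the code above.

```python
import os


def size_report(baseline_path, quantized_path):
    """Return (baseline_bytes, quantized_bytes, ratio) for two model files."""
    baseline = os.path.getsize(baseline_path)
    quantized = os.path.getsize(quantized_path)
    return baseline, quantized, baseline / quantized


# Uses the files saved earlier; skipped if they have not been generated yet.
if os.path.exists("tflite_model") and os.path.exists("quantify_Full.tflite"):
    base, quant, ratio = size_report("tflite_model", "quantify_Full.tflite")
    print("{} bytes -> {} bytes ({:.2f}x smaller)".format(base, quant, ratio))
```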

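At this point the weights and activations in the model have been quantized to 8 bits, but to stay compatible with existing callers the input and output tensors remain float32.
If TensorFlow Lite has no quantized implementation for some operation, the conversion leaves that operation in floating point, which is where the name "float fallback quantization" comes from.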
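
To see which tensors actually ended up as integers and which stayed in floating point, one option is to count the dtypes reported by `tf.lite.Interpreter.get_tensor_details()`. This is a sketch only: `dtype_histogram` and `summarize_model` are hypothetical helper names, and TensorFlow is imported lazily so the counting logic can be tried on its own.

```python
from collections import Counter


def dtype_histogram(tensor_details):
    """Count tensors per dtype from a list of tensor-detail dictionaries."""
    return Counter(
        getattr(d["dtype"], "__name__", str(d["dtype"])) for d in tensor_details
    )


def summarize_model(model_path):
    """Load a .tflite file and report how many tensors use each dtype."""
    import tensorflow as tf  # imported here so dtype_histogram works without it

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return dtype_histogram(interpreter.get_tensor_details())


# Example call (requires the model file saved in 1.2):
# print(summarize_model("quantify_Full.tflite"))
```

A model converted this way would be expected to show mostly int8 tensors, with float32 remaining for the input, the output, and any operation that fell back to floating point.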

1.3 Integer-only quantization

This method quantizes every tensor to 8 bits; if that cannot be done, the conversion raises an exception.
It takes only a few extra lines on top of the code in 1.2.

def representative_data_gen():
    for image in train_images[0:100,:,:]:
        yield[image.reshape(-1,train_images.shape[1],train_images.shape[2]).astype("float32")]
 
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen

#--------Newly added code----------------------------------------------------
# Raise an error if an operation has no int8 implementation
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Make the input and output tensors uint8
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
#----------------------------------------------------------------------------

tflite_model_quant = converter.convert()
# Save the converted model
FullInt_name = "quantify_Full.tflite"
with open(FullInt_name, "wb") as f:
    f.write(tflite_model_quant)

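You can test this model yourself by following the steps in 1.2. Note that the input tensor is now uint8, so the test images must be quantized with the input's scale and zero point before they are passed to set_tensor; feeding float32 data as in 1.2 is rejected by the interpreter.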
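
A minimal sketch of that input and output handling is shown here. It assumes the dictionaries come from `interpreter.get_input_details()` and `interpreter.get_output_details()`, whose `'quantization'` entry holds a `(scale, zero_point)` pair; `quantize_input` and `dequantize_output` are hypothetical helper names.

```python
import numpy as np


def quantize_input(image, input_detail):
    """Convert a float image to the dtype and shape the model input expects."""
    scale, zero_point = input_detail["quantization"]
    dtype = input_detail["dtype"]
    shape = input_detail["shape"]
    if scale == 0:
        # A scale of 0 means the input is not quantized: only cast and reshape.
        return np.asarray(image).astype(dtype).reshape(shape)
    q = np.round(np.asarray(image) / scale) + zero_point
    info = np.iinfo(dtype)
    return np.clip(q, info.min, info.max).astype(dtype).reshape(shape)


def dequantize_output(output, output_detail):
    """Map integer model outputs back to approximate real values."""
    scale, zero_point = output_detail["quantization"]
    return scale * (output.astype(np.float32) - zero_point)


# Inside the loop of evaluate(), the set_tensor call would then become:
# interpreter.set_tensor(index, quantize_input(test_images[i], input_details[0]))
```

Because the output scale is positive, `np.argmax` gives the same label on the raw uint8 output as on the dequantized one, so `dequantize_output` is only needed when the probabilities themselves matter.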

1.4 Float16 quantization

Switching to float16 quantization is simple: add one line to the code from 1.2.

def representative_data_gen():
    for image in train_images[0:100,:,:]:
        yield[image.reshape(-1,train_images.shape[1],train_images.shape[2]).astype("float32")]
 
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen

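#--------Added code----------------------------------------------------------
# Store weights as float16. Float16 quantization needs no calibration, so the
# representative dataset carried over from 1.2 is not required here.
converter.target_spec.supported_types = [tf.float16]
#----------------------------------------------------------------------------

tflite_model_quant = converter.convert()
# Save the converted model
FullInt_name = "quantify_Full.tflite"
with open(FullInt_name, "wb") as f:
    f.write(tflite_model_quant)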

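The results compare as follows (file sizes in bytes, then accuracy, for the float32 model and the float16 model):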

83640
43488
test_images accuracy is 98.02%
test_images accuracy is 98.02%

The model shrinks to roughly half its original size (83640 to 43488 bytes), and accuracy does not drop.
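
One plausible reason the accuracy is unchanged is that float16 still keeps about three significant decimal digits of each weight. The sketch below rounds a few made-up weight values to half precision with the standard `struct` module (format code `e`) and measures the relative error; the values and the helper name `to_float16` are illustrative assumptions, not taken from the model.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


weights = [0.1, -0.5, 0.0123, 1.7]   # made-up sample weights
errors = [abs(to_float16(w) - w) / abs(w) for w in weights]
print("max relative error: {:.2e}".format(max(errors)))
```

For values in the normal float16 range the relative rounding error is bounded by 2^-11 (about 0.05%), which is small compared with the margins a trained classifier usually has between classes.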

1.5 8-bit weights with 16-bit activations (integer quantization with int16 activations)

def representative_data_gen():
    for image in train_images[0:100,:,:]:
        yield[image.reshape(-1,train_images.shape[1],train_images.shape[2]).astype("float32")]
 
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen

#--------Added code----------------------------------------------------------
# Quantize activations to int16 and weights to int8
converter.target_spec.supported_ops = [tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8]
#----------------------------------------------------------------------------

tflite_model_quant = converter.convert()
# Save the converted model
FullInt_name = "quantify_Full.tflite"
with open(FullInt_name, "wb") as f:
    f.write(tflite_model_quant)

84684
25008
test_images accuracy is 98.02%
test_images accuracy is 98.02%

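Note: this method is still experimental. If you get an error saying that the attribute EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8 does not exist, upgrade TensorFlow; the environment used here was tensorflow == 2.4.1.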

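2 Quantization-aware training

The code flow differs little from the steps above; for details, see the quantization-aware training documentation.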
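
As a rough sketch of what that flow can look like, the function below wraps a trained Keras model with fake-quantization nodes, fine-tunes it briefly, and converts it. It assumes the separate `tensorflow_model_optimization` package is installed; the function name `quantization_aware_finetune` and the single fine-tuning epoch are illustrative choices, not something the post specifies.

```python
def quantization_aware_finetune(model, train_images, train_labels, epochs=1):
    """Fine-tune `model` with quantization-aware training and return TFLite bytes."""
    # Imported lazily: both packages are assumed to be installed when this is called.
    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Insert fake-quantization nodes so training sees the effect of 8-bit rounding.
    q_aware_model = tfmot.quantization.keras.quantize_model(model)
    q_aware_model.compile(optimizer="adam",
                          loss="sparse_categorical_crossentropy",
                          metrics=["accuracy"])
    q_aware_model.fit(train_images, train_labels,
                      epochs=epochs, validation_split=0.1)

    # Convert as in section 1; the quantization ranges learned during training
    # take the place of a representative dataset.
    converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()


# Example call with the model and data from 1.1:
# tflite_qat_model = quantization_aware_finetune(model, train_images, train_labels)
```

The returned bytes can be written to disk and evaluated with the same `evaluate` function used in 1.2.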
