Using TFLite (Part 1): Introduction

Author: 一笑

Categories: tensorflow, notes. Tags: tensorflow, deep learning

Article link: https://blog.csdn.net/li_xiaolaji/article/details/122342331

Overview

Quantization

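TFLite quantization does not run the whole computation in uint8. Instead, it stores the minimum and maximum value of each layer and divides that range linearly into 256 discrete values, so every floating-point number in the range can be represented by an 8-bit integer, approximated by the nearest discrete value. For example, with a minimum of -3 and a maximum of 6, byte 0 represents -3, 255 represents 6, and 128 represents about 1.5.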
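The mapping described above can be sketched in a few lines of plain Python. This is only an illustration of the min/max scheme using the -3 to 6 example, not TFLite's actual implementation; the function names `quantize` and `dequantize` are made up for this sketch.

```python
def quantize(x, x_min, x_max, levels=256):
    """Map a float in [x_min, x_max] to the nearest of `levels` integer codes."""
    scale = (x_max - x_min) / (levels - 1)  # width of one step
    code = round((x - x_min) / scale)
    return max(0, min(levels - 1, code))    # clamp values outside the range


def dequantize(code, x_min, x_max, levels=256):
    """Recover the float value that an integer code stands for."""
    scale = (x_max - x_min) / (levels - 1)
    return x_min + code * scale


x_min, x_max = -3.0, 6.0
low = quantize(-3.0, x_min, x_max)    # code for the minimum
high = quantize(6.0, x_min, x_max)    # code for the maximum
mid = dequantize(128, x_min, x_max)   # value represented by code 128
```

Here `mid` comes out slightly above 1.5 (roughly 1.52), because 256 codes split the range into 255 steps rather than 256.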
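Model quantization has two main parts: quantizing the weights and quantizing the activations. Some papers report that quantizing both weights and activations to 8 bits can reach accuracy comparable to 32-bit models. The basic operation in a neural network is the convolution, or multiply-accumulate, of weights and activations, W∗A.
If one of the two terms is quantized to {-1, 1}, the multiply-accumulate reduces to additions and subtractions; if both are quantized to {-1, 1}, it reduces to bitwise operations.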
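To illustrate the last point, here is a small sketch of a dot product between two {-1, 1} vectors computed with bitwise operations only (XNOR followed by a bit count). The bit encoding (1 for +1, 0 for -1) and the helper names are assumptions for this example; the text above does not prescribe a specific scheme.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int, bit i set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, 1} vectors given as packed bits."""
    mask = (1 << n) - 1
    same = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs agree
    matches = bin(same).count("1")     # popcount
    return 2 * matches - n             # agreements minus disagreements


w = [1, -1, 1, 1, -1, -1, 1, -1]
a = [1, 1, -1, 1, -1, 1, 1, -1]
reference = sum(wi * ai for wi, ai in zip(w, a))  # ordinary multiply-accumulate
bitwise = binary_dot(pack_bits(w), pack_bits(a), len(w))
```

Both `reference` and `bitwise` hold the same value, but the second is computed without any multiplication.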
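Model quantization means converting trained floating-point weights, which take continuous values, into integers for storage (usually int8), in order to reduce model size, lower memory usage during inference, and speed up inference.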

Why quantization is needed:

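As deep learning has developed, models have grown larger and larger, which makes them hard to deploy on low-cost embedded systems. Model quantization emerged to address this problem. Its goal is to compress a model at the cost of a small loss in accuracy, so that the model can run on embedded devices such as phones, cameras, and robots.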

Optimizations TFLite makes (for Android and mobile devices)

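Serializes the model file with FlatBuffers; this format takes little disk space and loads quickly.
Quantization. This feature can be switched on or off; it quantizes float parameters to uint8, making the model file smaller and computation faster.
Pruning, structure fusion, and distillation. TFLite does adjust the graph structure during conversion, but this feature is rarely discussed, so TFLite may not do much in this area.
NNAPI support. The three features above apply when converting the model file; this one applies at runtime. It calls the low-level Android interfaces to make use of heterogeneous compute hardware.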
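As a small illustration of the FlatBuffers point: a FlatBuffer file can carry a 4-byte file identifier right after its 4-byte root offset, and the TFLite schema declares the identifier `TFL3`. The helper below, whose name is made up for this sketch, checks for that marker. It is only a quick sanity check, not a full validation of a model file.

```python
import struct


def looks_like_tflite(data: bytes) -> bool:
    """Return True if `data` starts like a TFLite FlatBuffer."""
    if len(data) < 8:
        return False
    # Bytes 0-3: little-endian offset to the root table.
    (root_offset,) = struct.unpack_from("<I", data, 0)
    # Bytes 4-7: the file identifier declared in the schema.
    return data[4:8] == b"TFL3" and root_offset < len(data)


# Synthetic header for demonstration only, not a real model.
fake_header = struct.pack("<I", 8) + b"TFL3" + bytes(8)
is_model = looks_like_tflite(fake_header)
```

With a real file, one would pass the bytes read from `model.tflite` in binary mode instead of `fake_header`.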

Conversion examples

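TensorFlow 2.x models are stored in the SavedModel format and are generated either with the high-level tf.keras.* APIs (a Keras model) or with the low-level tf.* APIs (from which you generate concrete functions).

tf.lite.TFLiteConverter.from_saved_model(): converts a SavedModel.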

import tensorflow as tf

# Example path to the SavedModel directory; replace it with your own
saved_model_dir = '/tmp/mobilenet_saved_model'

# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()

# Save the model
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)

tf.lite.TFLiteConverter.from_keras_model(): converts a Keras model.

import tensorflow as tf


# Create a model using high-level tf.keras.* APIs
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1]),
    tf.keras.layers.Dense(units=16, activation='relu'),
    tf.keras.layers.Dense(units=1)
])
model.compile(optimizer='sgd', loss='mean_squared_error') # compile the model
model.fit(x=[-1, 0, 1], y=[-3, -1, 1], epochs=5) # train the model
# (to generate a SavedModel) tf.saved_model.save(model, "saved_model_keras_dir")


# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# To enable post-training quantization in TF 2.x, uncomment the next line:
# converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()


# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)

tf.lite.TFLiteConverter.from_concrete_functions(): converts concrete functions.

import tensorflow as tf


# Create a model using low-level tf.* APIs
class Squared(tf.Module):
  @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
  def __call__(self, x):
    return tf.square(x)
model = Squared()
# (to run your model) result = model(tf.constant([5.0])) # result holds [25.0]
# (to generate a SavedModel) tf.saved_model.save(model, "saved_model_tf_dir")
concrete_func = model.__call__.get_concrete_function()


# Convert the model.
# Note that for versions earlier than TensorFlow 2.7, the
# from_concrete_functions API is able to work when there is only the first
# argument given:
# > converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func],
                                                            model)
tflite_model = converter.convert()


# Save the model.
with open('model.tflite', 'wb') as f:
  f.write(tflite_model)

Command-line tool

Converting a SavedModel

tflite_convert \
  --saved_model_dir=/tmp/mobilenet_saved_model \
  --output_file=/tmp/mobilenet.tflite

Converting a Keras H5 model

tflite_convert \
  --keras_model_file=/tmp/mobilenet_keras_model.h5 \
  --output_file=/tmp/mobilenet.tflite

Deploying on Linux

On Linux platforms (including Raspberry Pi), you can run inference with the TensorFlow Lite APIs available in C++ and Python.
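For the Python side, a minimal sketch of running inference with `tf.lite.Interpreter` might look like the following. It reuses the `Squared` module idea from the conversion examples, but with a fixed input shape of `[3]` chosen here for simplicity, and it converts the model in memory instead of loading a file. The two-argument form of `from_concrete_functions` assumes TensorFlow 2.7 or later.

```python
import numpy as np
import tensorflow as tf


# A tiny model with a fixed-size input (the shape [3] is an assumption of this sketch).
class Squared(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec(shape=[3], dtype=tf.float32)])
    def __call__(self, x):
        return tf.square(x)


model = Squared()
concrete_func = model.__call__.get_concrete_function()
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func], model)
tflite_model = converter.convert()

# Load the converted model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed one input, run the model, and read the output back.
x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], x)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])
```

To run a model saved on disk instead, the interpreter can be created with `tf.lite.Interpreter(model_path='model.tflite')`; the remaining steps stay the same.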

For details, see the official TensorFlow documentation:

> https://tensorflow.google.cn/lite/guide/inference?hl=en

For detailed TFLite usage, see the following post (I used TF 2.5):

> https://blog.csdn.net/li_xiaolaji/article/details/122342788
