OpenVINO Series 10. Model Optimizer: Converting a TensorFlow pb Model to an IR Model

Latest recommended article published on 2024-07-08 08:22:55

破浪会有时

Article link: https://blog.csdn.net/zyctimes/article/details/124512813

This chapter introduces the OpenVINO Model Optimizer module and shows how to convert a TensorFlow pb model to an IR model (mo --input_model <INPUT_MODEL>.pb).

Environment:

Runtime environment for this case: Win10, laptop with a 10th-generation i5
IDE: VSCode
OpenVINO version: 2022.1
Code link: 4-model-optimizer-convert2IR

1 Introduction to Model Optimizer

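The Model Optimizer in OpenVINO converts models from frameworks such as TensorFlow and Caffe into OpenVINO's intermediate representation (IR). The conversion compresses and optimizes the model so that the inference engine can load and execute it faster.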

The figure below shows the typical workflow for deploying a trained deep learning model:

[Figure: typical workflow for deploying a trained deep learning model with OpenVINO]

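About the IR model: the IR model is OpenVINO's intermediate representation. It consists of an .xml file (containing information about the network topology) and a .bin file (containing the binary weight and bias data). The read_model() function reads an IR model. We usually keep the two files (.xml and .bin) in the same directory and give them the same file name.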
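As a small illustration of the naming convention described above, the sketch below derives the .bin path from the .xml path and checks that both files exist before the model would be handed to read_model(). The helper name `find_ir_pair` and the file name `model.xml` are hypothetical and used only for illustration.

```python
from pathlib import Path
import tempfile


def find_ir_pair(xml_path):
    """Return (xml, bin) paths for an IR model, assuming both files
    share the same directory and base name (hypothetical helper)."""
    xml_path = Path(xml_path)
    bin_path = xml_path.with_suffix(".bin")
    missing = [p.name for p in (xml_path, bin_path) if not p.exists()]
    if missing:
        raise FileNotFoundError("Missing IR file(s): " + ", ".join(missing))
    return xml_path, bin_path


# Demo with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    xml_file = Path(tmp) / "model.xml"
    xml_file.touch()
    xml_file.with_suffix(".bin").touch()
    xml, weights = find_ir_pair(xml_file)
    print(xml.name, weights.name)
```

Checking for the pair up front gives a clearer error than a failed model load when one of the two files was moved or renamed.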

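The figure above makes OpenVINO's overall workflow clear: we can train a model with TensorFlow/PyTorch/PaddlePaddle, but that model cannot be used directly for OpenVINO inference. It first has to be converted to the IR format with Model Optimizer. After inference, the results are consumed by User Applications. Note: the generated IR can be further optimized for inference by applying post-training optimization (POT); see cases 5-pot-int8-simplifiedmode, 6_pot_objectdetection, and related ones.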

2 Converting a TensorFlow `pb` Model to an IR Model

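This case shows how to convert the TensorFlow MobileNetV3 image classification model to an OpenVINO IR model. After creating the IR, we load the model into OpenVINO's inference engine and run inference on a sample image.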

2.1 About the imported model

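mobilenet-v3-small-1.0-224-tf is one of the MobileNets V3 models, which are based on a combination of complementary search techniques and a novel architecture design (see the paper). The model is aimed mainly at low-resource use cases. The official description of this model is available at the linked page.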

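The main characteristics of mobilenet-v3-small-1.0-224-tf are:

Image classification model;
Model input: [1,224,224,3], i.e. [B,H,W,C] = [batch size, image height, image width, channels];
Model output: [1,1001], i.e. [B,C] = [batch size, predicted probability for each class in the [0, 1] range], with class 0 being background (this matches the output shape reported in section 2.3);
Color order: RGB.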

2.2 Converting the TensorFlow model

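We call the OpenVINO Model Optimizer tool to convert the TensorFlow model to an OpenVINO IR with FP16 precision. The model is saved in the Model folder. We add the mean values to the model and use --scale_values to scale the input data by the standard deviation. With these options, the input data no longer needs to be normalized before it is propagated through the network. The original model expects input images in RGB format, and so does the converted model. If we want the converted model to work with BGR images, we can use the --reverse_input_channels option.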

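We first build the Model Optimizer command and then run it in the notebook by prefixing it with "!". The output may contain some errors or warnings. If the last lines of the output include "[SUCCESS] Generated IR version 11 model.", the conversion succeeded.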
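The two steps described above (building the command, then checking the output for the success message) could be sketched as follows. This is a minimal sketch, not the author's notebook code: the helper names, the model path `model/example.pb`, and the mean/scale values of 127.5 are placeholders chosen for illustration, while the flags themselves (--input_model, --input_shape, --mean_values, --scale_values, --data_type, --output_dir) are regular mo options.

```python
def build_mo_command(model_path, output_dir, mean_values, scale_values,
                     input_shape="[1,224,224,3]", data_type="FP16"):
    """Assemble the mo command as an argument list (hypothetical helper)."""
    def fmt(values):
        # mo expects bracketed, comma-separated values without spaces
        return "[" + ",".join(str(v) for v in values) + "]"

    return [
        "mo",
        "--input_model", str(model_path),
        "--input_shape", input_shape,
        "--mean_values", fmt(mean_values),
        "--scale_values", fmt(scale_values),
        "--data_type", data_type,
        "--output_dir", str(output_dir),
    ]


def conversion_succeeded(mo_output):
    """Look for the success message in the text printed by mo."""
    return "SUCCESS" in mo_output and "Generated IR version" in mo_output


# Placeholder path and values; replace them with those of your own model.
command = build_mo_command("model/example.pb", "model",
                           mean_values=[127.5, 127.5, 127.5],
                           scale_values=[127.5, 127.5, 127.5])
print(" ".join(command))
```

With mo installed, the list can be passed to subprocess.run(command, capture_output=True, text=True) and the captured stdout given to conversion_succeeded(), which avoids relying on the notebook-only "!" syntax.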

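Half-precision model (FP16): during conversion we can turn the TensorFlow model into an IR model with FP16 precision. This corresponds to the --data_type option, for example: mo --input_model INPUT_MODEL --data_type FP16. A half-precision model should be about half the size of the full-precision model. It may lose some accuracy, although for most models the loss is negligible.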
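The "about half the size" claim follows from the storage width of each weight: 4 bytes in FP32 versus 2 bytes in FP16. A rough back-of-the-envelope sketch, assuming a hypothetical parameter count and ignoring the small .xml file and any non-weight data:

```python
BYTES_PER_VALUE = {"FP32": 4, "FP16": 2}


def weights_size_mb(num_params, precision):
    """Approximate size of the weight data in MiB for a given precision."""
    return num_params * BYTES_PER_VALUE[precision] / (1024 ** 2)


num_params = 2_500_000  # hypothetical parameter count, for illustration only
fp32 = weights_size_mb(num_params, "FP32")
fp16 = weights_size_mb(num_params, "FP16")
print(f"FP32: {fp32:.2f} MiB, FP16: {fp16:.2f} MiB, ratio: {fp16 / fp32}")
```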

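Setting the layout: a layout describes what each dimension of a shape means. We can specify both the layout of the input model and the layout of the converted IR model, for example: mo --input_model tf_nasnet_large.onnx --layout "nhwc->nchw". Alternatively, we can specify only one layout: mo --input_model tf_nasnet_large.onnx --layout nhwc.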
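To make the "nhwc->nchw" notation concrete, the sketch below computes the axis permutation between two layout strings and applies it to a shape. It only illustrates what the layout change means for the dimensions; it is not how Model Optimizer implements the option, and the function names are hypothetical.

```python
def layout_permutation(src, dst):
    """Return the axis order that turns layout `src` into layout `dst`."""
    src, dst = src.lower(), dst.lower()
    if sorted(src) != sorted(dst) or len(set(src)) != len(src):
        raise ValueError(f"Incompatible layouts: {src} -> {dst}")
    return [src.index(axis) for axis in dst]


def convert_shape(shape, src, dst):
    """Reorder the dimensions of `shape` from layout `src` to layout `dst`."""
    return [shape[i] for i in layout_permutation(src, dst)]


print(layout_permutation("nhwc", "nchw"))              # [0, 3, 1, 2]
print(convert_shape([1, 224, 224, 3], "nhwc", "nchw"))  # [1, 3, 224, 224]
```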

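Setting Mean and Scale: neural network models are usually trained on normalized input data. This means the input values are mapped to a specific range such as [0, 1] or [-1, 1]. Sometimes the mean value is also subtracted from the input data as part of preprocessing. Input preprocessing can be implemented in two ways:

The preprocessing operations are part of the model. In this case the application does not preprocess the input data as a separate step: everything is embedded in the model itself.
The preprocessing operations are not part of the model, and preprocessing is performed in the application that feeds input data to the model (this is the situation in our case).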

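In the first case, Model Optimizer generates an IR that already contains the required preprocessing operations, and no Mean and Scale parameters are needed. In the second case, the Mean and Scale values should be passed to Model Optimizer so that they are embedded in the generated IR. The following command-line parameters are available: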

--mean_values
--scale_values
--scale

An example: mo --input_model unet.pdmodel --mean_values [123,117,104] --scale 255.
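The arithmetic that these options embed in the IR is a per-channel mean subtraction followed by a division by the scale. The sketch below applies it to single pixels in plain Python, using the values from the example command; the function name is hypothetical.

```python
def normalize_pixel(pixel, mean_values, scale):
    """Apply (value - mean) / scale to each channel of one pixel."""
    return [(v - m) / scale for v, m in zip(pixel, mean_values)]


mean_values = [123, 117, 104]
scale = 255

# A pixel equal to the mean maps to zero in every channel.
print(normalize_pixel([123, 117, 104], mean_values, scale))  # [0.0, 0.0, 0.0]
# A pure white pixel ends up at different values per channel.
print(normalize_pixel([255, 255, 255], mean_values, scale))
```

Once the values are embedded in the IR, the application passes raw pixel values and must not normalize them a second time.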

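Reversing input channels: sometimes the application's input images are in RGB (BGR) format while the model was trained on images in BGR (RGB) format, that is, with the opposite color channel order. In that case the input images must be preprocessed by reversing the color channels before inference. To embed this preprocessing step in the IR, Model Optimizer provides the --reverse_input_channels command-line parameter.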
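What the channel reversal does to the data can be shown on a tiny image stored as nested lists (rows of pixels, each pixel a list of three channel values). This is only an illustration of the operation; the function name is hypothetical.

```python
def reverse_channels(image):
    """Reverse the channel order of every pixel (RGB <-> BGR)."""
    return [[pixel[::-1] for pixel in row] for row in image]


# One row with a pure red and a pure green pixel, in RGB order.
rgb_image = [[[255, 0, 0], [0, 255, 0]]]
bgr_image = reverse_channels(rgb_image)
print(bgr_image)  # [[[0, 0, 255], [0, 255, 0]]]
```

Applying the function twice returns the original image, which is why the same option covers both the RGB to BGR and the BGR to RGB case.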

2.3 Results

First, we load the IR model and run inference to obtain the result:

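import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.runtime import Core

# ir_path is assumed to be a pathlib.Path pointing to the .xml file generated by mo in section 2.2
print("1. Load Model")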
ie = Core()
model = ie.read_model(model=ir_path, weights=ir_path.with_suffix(".bin"))
compiled_model = ie.compile_model(model=model, device_name="CPU")
input_key = compiled_model.input(0)
output_key = compiled_model.output(0)
network_input_shape = input_key.shape 
print("- Model Input: {}".format(input_key))
print("- Model Output: {}".format(output_key))
print("2. Load an image, resize it, and convert it to the input shape of the network.")
# The MobileNet network expects images in RGB format
image = cv2.cvtColor(cv2.imread(filename="data/coco.jpg"), code=cv2.COLOR_BGR2RGB)
print("- Original image shape: {}".format(image.shape))
# Resize image to network input image shape
resized_image = cv2.resize(src=image, dsize=(224, 224))
# Add a batch dimension to match the network input shape [1,224,224,3]
input_image = np.expand_dims(resized_image, 0)
print("- Image is resized into: {}".format(input_image.shape))
print("1. Model Inference")
result = compiled_model([input_image])[output_key]
result_index = np.argmax(result)
# Convert the inference result to a class name.
imagenet_classes = open("utils/imagenet_2012.txt").read().splitlines()
# The model description states that for this model, class 0 is background,
# so we add background at the beginning of imagenet_classes
imagenet_classes = ['background'] + imagenet_classes
print("- Classification result: {}".format(imagenet_classes[result_index]))
plt.imshow(image)

Terminal output:

1. Load Model
- Model Input: <ConstOutput: names[input, input:0] shape{1,224,224,3} type: f32>
- Model Output: <ConstOutput: names[MobilenetV3/Predictions/Softmax:0] shape{1,1001} type: f32>
2. Load an image, resize it, and convert it to the input shape of the network.
- Original image shape: (577, 800, 3)
- Image is resized into: (1, 224, 224, 3)
1. Model Inference
- Classification result: n02099267 flat-coated retriever
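The label lookup above depends on class 0 being background, so the label list must be one entry longer than the ImageNet class list. The sketch below shows the same indexing on a toy example and extends it to the top-k predictions; the class names and probabilities are made up for illustration, and `top_k` is a hypothetical helper.

```python
def top_k(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs."""
    if len(labels) != len(probabilities):
        raise ValueError("labels and probabilities must have the same length")
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(labels[i], probabilities[i]) for i in ranked[:k]]


class_names = ["cat", "dog", "bird"]     # made-up stand-in for the class file
labels = ["background"] + class_names    # class 0 is background
probabilities = [0.01, 0.09, 0.7, 0.2]   # made-up model output

print(top_k(probabilities, labels, k=2))  # [('dog', 0.7), ('bird', 0.2)]
```

Without the extra background entry, every predicted index would be shifted by one and map to the wrong class name.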

Testing IR model performance with benchmark_app

To measure the inference performance of the FP16 IR model, we use OpenVINO's Benchmark Tool. It can be run in the notebook with !benchmark_app or %sx benchmark_app.

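Note: for the most accurate performance estimate, we recommend closing other applications and running benchmark_app in a terminal or command prompt. Run benchmark_app --help to see all command-line options.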

# Benchmark FP16 model
!benchmark_app -m $ir_path -d CPU -api async -t 15 -b 1

Terminal output:

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading OpenVINO
[ WARNING ] PerformanceMode was not explicitly specified in command line. Device CPU performance hint will be set to THROUGHPUT.
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read model took 31.01 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'input' precision u8, dimensions ([N,H,W,C]): 1 224 224 3
[ INFO ] Model output 'MobilenetV3/Predictions/Softmax:0' precision f32, dimensions ([...]): 1 1001
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 199.00 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: CPU
[ INFO ]   AVAILABLE_DEVICES  , ['']
...
    AVG:        5.12 ms
    MIN:        3.12 ms
    MAX:        30.85 ms
Throughput: 768.31 FPS
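The throughput is higher than 1000 / 5.12 ≈ 195 FPS because in async mode several inference requests run in parallel. As a rough estimate (Little's law, assuming the AVG line reports the average latency per request), the sketch below parses the two figures from the log and computes how many requests were in flight on average. The parsing helper is hypothetical and only handles the lines shown above.

```python
import re


def parse_benchmark(text):
    """Extract average latency (ms) and throughput (FPS) from the log text."""
    avg = re.search(r"AVG:\s*([\d.]+)\s*ms", text)
    fps = re.search(r"Throughput:\s*([\d.]+)\s*FPS", text)
    if not avg or not fps:
        raise ValueError("AVG latency or Throughput line not found")
    return float(avg.group(1)), float(fps.group(1))


log = """
    AVG:        5.12 ms
    MIN:        3.12 ms
    MAX:        30.85 ms
Throughput: 768.31 FPS
"""

avg_ms, fps = parse_benchmark(log)
# Little's law: requests in flight = throughput * latency
in_flight = fps * avg_ms / 1000
print(round(in_flight, 2))  # 3.93
```

A value close to 4 suggests that roughly four requests were processed concurrently during this run.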