OpenVINO Series 11. Model Optimizer: Converting PyTorch pt and ONNX Models to IR Models (Case Study)

Author: 破浪会有时

Article link: https://blog.csdn.net/zyctimes/article/details/124514756

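This article describes in detail how to export a PyTorch model to ONNX and then convert it to an IR model with OpenVINO's Model Optimizer. It walks through the conversion of the cifar10_resnet20 model and discusses key conversion options such as precision, layout, and preprocessing. Finally, it runs inference with the IR model and benchmarks its performance.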

This chapter introduces the OpenVINO Model Optimizer module and shows how to convert PyTorch pt models and ONNX models to IR models.

Environment:

Runtime environment for this case: Windows 10, laptop with a 10th-generation Intel Core i5
IDE: VSCode
OpenVINO version: 2022.1
Code: 4-model-optimizer-convert2IR

1 Introduction to Model Optimizer

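The Model Optimizer in OpenVINO converts models from frameworks and formats such as TensorFlow, Caffe, and ONNX into OpenVINO's intermediate representation (IR). The conversion also optimizes the model, so the inference engine can load and execute it faster.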

The figure below shows the typical workflow for deploying a trained deep learning model:

[Figure: typical workflow for deploying a trained deep learning model with OpenVINO]

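About IR models: an IR model is OpenVINO's intermediate representation. It consists of an .xml file (describing the network topology) and a .bin file (holding the weights and biases as binary data). The read_model() function reads an IR model. The two files (.xml and .bin) are normally kept in the same directory under the same file name, so that read_model() can find the weights from the .xml path alone.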
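The naming convention for the two IR files can be sketched in a few lines of Python. This is only an illustration: the helper names `ir_file_pair` and `ir_is_complete` are hypothetical and not part of OpenVINO, and the path is the one this case produces later on.

```python
from pathlib import Path


def ir_file_pair(xml_path):
    """Return the (.xml, .bin) paths that together make up one IR model."""
    xml_path = Path(xml_path)
    # The weights file shares the stem of the topology file; only the suffix differs.
    return xml_path, xml_path.with_suffix(".bin")


def ir_is_complete(xml_path):
    """True only if both the topology file and the weights file exist on disk."""
    xml_file, bin_file = ir_file_pair(xml_path)
    return xml_file.is_file() and bin_file.is_file()


xml_file, bin_file = ir_file_pair("model/pytorch/resnet20.xml")
print(xml_file.name, bin_file.name)
```

A check like `ir_is_complete(...)` before calling read_model() gives a clearer error than a failed load when the .bin file was not copied along with the .xml file.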

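The figure makes the overall OpenVINO workflow clear. A model is first trained with TensorFlow, PyTorch, PaddlePaddle, or a similar framework. In this workflow the trained model is not fed to OpenVINO inference directly: the Model Optimizer first converts it to IR, the IR is used for inference, and the results are consumed by user applications. Note: the generated IR can be optimized further for inference with post-training quantization (POT); see cases 5-pot-int8-simplifiedmode and 6_pot_objectdetection and the related cases.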

2 Converting PyTorch `pt` and ONNX models to IR

This case uses the OpenVINO inference engine to run inference on a PyTorch image classification model.

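A PyTorch model cannot be converted to the OpenVINO intermediate representation (IR) directly, so we first export it to ONNX and then convert the ONNX model to IR. Finally, we load the IR model into the OpenVINO inference engine and display the model's prediction.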

2.1 About the model

Here we use a classification model: cifar10_resnet20. See the linked page for a detailed description of the model.

Some of its relevant parameters are:

ConfigTree([('type_', 'cifar10'),
    ('image_size', 32),
    ('num_classes', 10),
    ('root', 'data/cifar10'),
    ('mean', [0.4914, 0.4822, 0.4465]),
    ('std', [0.2023, 0.1994, 0.201]),
    ('batch_size', 256),
    ('num_workers', 4)])),

2.2 Model conversion

First we load the cifar10_resnet20 model: torch.hub.load("chenyaofo/pytorch-cifar-models", "cifar10_resnet20", pretrained=True). Then we export it to ONNX: torch.onnx.export(model, dummy_input, onnx_model_path). Finally we convert the ONNX model to an IR model with the mo command.

The mo command accepts many more options. Most of them are not used in this case, but they are worth a short introduction:

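Half-precision model (FP16): during conversion the model can be compressed into an FP16 IR with the --data_type option, for example: mo --input_model INPUT_MODEL --data_type FP16. An FP16 model should be about half the size of the full-precision (FP32) model. It may lose some accuracy, although for most models the drop is negligible.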
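The size and precision trade-off of FP16 can be illustrated with the standard library alone. The sketch below round-trips a single value through IEEE 754 half precision; the weight value is made up for illustration, and the snippet says nothing about how mo itself performs the compression.

```python
import struct


def to_fp16_and_back(value):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


fp32_bytes = struct.calcsize("<f")  # bytes used by one FP32 value
fp16_bytes = struct.calcsize("<e")  # bytes used by one FP16 value

weight = 0.4914  # hypothetical weight value, chosen only for illustration
rounded = to_fp16_and_back(weight)

print("bytes per weight:", fp32_bytes, "->", fp16_bytes)
print("rounding error:", abs(rounded - weight))
```

Each stored weight takes half the bytes, which is where the roughly halved file size comes from, while the rounding error per weight stays small.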

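Setting the layout: a layout describes what each dimension of a tensor shape means (for example N, C, H, W). You can specify both the layout of the input model and the layout of the converted IR, for example: mo --input_model tf_nasnet_large.onnx --layout "nhwc->nchw", or specify only a single layout: mo --input_model tf_nasnet_large.onnx --layout nhwc.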
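What a layout change such as "nhwc->nchw" means for the dimensions can be sketched as a simple axis permutation. The helper names and the example shape below are hypothetical; the snippet only illustrates the reordering of dimensions, not the internals of mo.

```python
def layout_permutation(source, target):
    """Return the axis order that rearranges the `source` layout into `target`."""
    source, target = source.lower(), target.lower()
    if sorted(source) != sorted(target):
        raise ValueError("both layouts must name the same dimensions")
    # For every dimension of the target layout, look up where it sits in the source.
    return tuple(source.index(dim) for dim in target)


def permute_shape(shape, source, target):
    """Apply the layout permutation to a concrete shape."""
    return tuple(shape[axis] for axis in layout_permutation(source, target))


# Hypothetical NHWC input: batch 1, 224 x 224 pixels, 3 channels
print(layout_permutation("nhwc", "nchw"))
print(permute_shape((1, 224, 224, 3), "nhwc", "nchw"))
```

The same permutation tuple is what one would pass to a transpose call on the actual tensor data.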

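Setting mean and scale: neural network models are usually trained on normalized input data, meaning the input values are mapped into a specific range such as [0, 1] or [-1, 1]. Sometimes a mean value is also subtracted from the input as part of preprocessing. Input preprocessing can be implemented in two ways:

The preprocessing operations are part of the model. In this case the application does not preprocess the input as a separate step: everything is embedded in the model itself.
The preprocessing operations are not part of the model, and preprocessing is done in the application that feeds input data to the model (this is the situation in our case).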

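In the first case, the Model Optimizer generates an IR that already contains the required preprocessing operations, and no mean or scale parameters are needed. In the second case, the mean and scale values should be passed to the Model Optimizer so that it embeds them into the generated IR. The relevant command-line parameters are: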

--mean_values
--scale_values
--scale

An example: mo --input_model unet.pdmodel --mean_values [123,117,104] --scale 255.
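The arithmetic behind these parameters is a per-channel (value - mean) / scale. As a sketch, the snippet below also shows how the CIFAR-10 mean and std from the config above, which are defined on inputs in [0, 1], could be rescaled for raw 0-255 pixels. The helper name and the sample pixel are hypothetical.

```python
def normalize_pixel(pixel, mean_values, scale_values):
    """Per-channel (value - mean) / scale."""
    return [(v - m) / s for v, m, s in zip(pixel, mean_values, scale_values)]


# CIFAR-10 statistics from the model config, defined on inputs in [0, 1]
mean = [0.4914, 0.4822, 0.4465]
std = [0.2023, 0.1994, 0.2010]

# Equivalent values for raw 0-255 pixels:
# (x / 255 - m) / s == (x - 255 * m) / (255 * s)
mean_255 = [m * 255 for m in mean]
scale_255 = [s * 255 for s in std]

pixel = [128, 64, 255]  # hypothetical RGB pixel with 0-255 values
via_unit_range = normalize_pixel([v / 255 for v in pixel], mean, std)
via_raw_range = normalize_pixel(pixel, mean_255, scale_255)

print(via_unit_range)
print(via_raw_range)
```

Both routes give the same normalized values up to floating-point rounding, so values like `mean_255` and `scale_255` are what one would pass as mean and scale if the application supplies raw 0-255 images.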

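Reversing input channels: sometimes the application supplies images in RGB (BGR) format while the model was trained on BGR (RGB) images, that is, with the opposite color channel order. In that case the input images must be preprocessed by reversing the color channels before inference. To embed this preprocessing step into the IR, the Model Optimizer provides the --reverse_input_channels command-line parameter.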
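When the reversal is done in the application instead, it amounts to flipping the last axis of every pixel. A minimal sketch on a nested-list image (the helper name and the tiny test image are hypothetical):

```python
def reverse_channels(image_hwc):
    """Reverse the channel order of every pixel in an H x W x C nested list."""
    return [[pixel[::-1] for pixel in row] for row in image_hwc]


# 1 x 2 image in BGR order: one pure-blue pixel and one pure-green pixel
bgr_image = [[[255, 0, 0], [0, 255, 0]]]
rgb_image = reverse_channels(bgr_image)
print(rgb_image)
```

Applying the function twice returns the original image, which is why the same operation covers both BGR to RGB and RGB to BGR.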

import os
from pathlib import Path

import torch
from IPython.display import Markdown, display

MODEL_DIR = 'model/pytorch'
MODEL_NAME = 'resnet20'

os.makedirs(MODEL_DIR, exist_ok=True)

model = torch.hub.load("chenyaofo/pytorch-cifar-models", "cifar10_resnet20", pretrained=True)
# Switch to inference mode so layers such as BatchNorm use their trained statistics
model.eval()
dummy_input = torch.randn(1, 3, 32, 32)

onnx_model_path = Path(MODEL_DIR) / '{}.onnx'.format(MODEL_NAME)
ir_path = Path(onnx_model_path).with_suffix(".xml")
torch.onnx.export(model, dummy_input, onnx_model_path)

# Convert this model into the OpenVINO IR using the Model Optimizer:
mo_command = f"""mo
                 --framework=onnx
                 --input_model "{onnx_model_path}" 
                 --input_shape "[1,3,32,32]"
                 --data_type FP32 
                 --output_dir "{MODEL_DIR}"
                 """
mo_command = " ".join(mo_command.split())
print("Model Optimizer command to convert ONNX to OpenVINO:")
display(Markdown(f"`{mo_command}`"))

# Run Model Optimizer if the IR model file does not exist
if not ir_path.exists():
    print("Exporting ONNX model to IR... This may take a few minutes.")
    ! $mo_command
else:
    print(f"IR model {ir_path} already exists.")
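The `! $mo_command` line above only works inside an IPython/Jupyter session. Outside a notebook, the same conversion could be started with `subprocess`. The sketch below assumes the `mo` executable is on the PATH; `build_mo_args` and `run_mo` are hypothetical helper names, and the flags are the ones used in the command above.

```python
import shlex
import subprocess


def build_mo_args(onnx_path, output_dir, input_shape="[1,3,32,32]", data_type="FP32"):
    """Assemble the Model Optimizer command as an argument list."""
    return [
        "mo",
        "--framework", "onnx",
        "--input_model", str(onnx_path),
        "--input_shape", input_shape,
        "--data_type", data_type,
        "--output_dir", str(output_dir),
    ]


def run_mo(args):
    """Run Model Optimizer; raises CalledProcessError if the conversion fails."""
    return subprocess.run(args, check=True)


args = build_mo_args("model/pytorch/resnet20.onnx", "model/pytorch")
print(shlex.join(args))
# Call run_mo(args) to start the conversion once `mo` is installed.
```

Passing the arguments as a list avoids shell quoting problems with the brackets in the input shape.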

2.3 Results

First we load the IR model and run inference to obtain the result:

import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.runtime import Core

print("1 Load the model.")
ie = Core()
model = ie.read_model(model=ir_path)
compiled_model = ie.compile_model(model=model, device_name="CPU")
input_layer_ir = compiled_model.input(0)
output_layer_ir = compiled_model.output(0)
print("- Input layer info: {}".format(input_layer_ir))
print("- Output layer info: {}".format(output_layer_ir))
print("2 Load the image, and reshape to the same size as model input.")
# cv2.imread returns the image in BGR channel order with shape (H, W, C)
image = cv2.imread("data/coco.jpg")
print("- Image original shape: {0}".format(image.shape))
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = input_layer_ir.shape
# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (W, H))
# Reshape to network input shape
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)
#plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
print("- Image size reshape into: {0}".format(input_image.shape))
print("3 Inference.")
# Run inference; for this classifier the result is a (1, 10) array of class scores
scores = compiled_model([input_image])[output_layer_ir]
# The index of the highest score is the predicted class
result_index = np.argmax(scores)
print("- Shape of inference result: {0}".format(scores.shape))
labels_names = ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]
print("- Final classification result: {0}".format(labels_names[result_index]))

Terminal output:

1 Load the model.
- Input layer info: <ConstOutput: names[input.1] shape{1,3,32,32} type: f32>
- Output layer info: <ConstOutput: names[208] shape{1,10} type: f32>
2 Load the image, and reshape to the same size as model input.
- Image original shape: (577, 800, 3)
- Image size reshape into: (1, 3, 32, 32)
3 Inference.
- Shape of inference result: (1, 10)
- Final classification result: dog
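The (1, 10) output holds one score per CIFAR-10 class. Assuming these are unnormalized scores (logits), a softmax turns them into probabilities, which makes the confidence of the top class easier to read. The scores below are invented for illustration and are not the output of the run above; `softmax` and `top_prediction` are hypothetical helpers.

```python
import math

LABELS = ["airplane", "automobile", "bird", "cat", "deer",
          "dog", "frog", "horse", "ship", "truck"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(scores, labels=LABELS):
    """Return the label and probability of the highest-scoring class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical scores for one image, one value per class
example_scores = [0.1, -1.2, 0.3, 1.0, 0.2, 4.5, 0.0, 0.7, -0.5, -0.9]
label, prob = top_prediction(example_scores)
print(label, round(prob, 3))
```

With the real model, the input would be `scores[0].tolist()` taken from the inference result.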

Benchmarking the IR model with benchmark_app

To measure the inference performance of the FP32 IR model, we use OpenVINO's Benchmark Tool. It can be run from a notebook with !benchmark_app or %sx benchmark_app.

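Note: for the most accurate performance estimate, we recommend running benchmark_app in a terminal/command prompt after closing other applications. Run benchmark_app --help to see all command-line options.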

# Benchmark FP32 model
!benchmark_app -m $ir_path -d CPU -api async -t 15 -b 1

Terminal output:

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading OpenVINO
[ WARNING ] PerformanceMode was not explicitly specified in command line. Device CPU performance hint will be set to THROUGHPUT.
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read model took 11.96 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'input.1' precision u8, dimensions ([N,C,H,W]): 1 3 32 32
[ INFO ] Model output '208' precision f32, dimensions ([...]): 1 10
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 59.00 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: CPU
[ INFO ]   AVAILABLE_DEVICES  , ['']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 1, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 8)
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'FP16', 'INT8', 'BIN', 'EXPORT_IMPORT']
[ INFO ]   CACHE_DIR  , 
[ INFO ]   NUM_STREAMS  , 4
[ INFO ]   INFERENCE_NUM_THREADS  , 0
[ INFO ]   PERF_COUNT  , False
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 4 infer requests took 1.01 ms
[ WARNING ] No input files were given for input 'input.1'!. This input will be filled with random values!
[ INFO ] Fill input 'input.1' with random values 
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests using 4 streams for CPU, inference only: True, limits: 15000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 5.25 ms
[Step 11/11] Dumping statistics report
Count:          36872 iterations
Duration:       15001.09 ms
Latency:
    Median:     1.45 ms
    AVG:        1.59 ms
    MIN:        1.05 ms
    MAX:        52.39 ms
Throughput: 2457.96 FPS
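The throughput figure can be cross-checked from the iteration count and the duration in the report. With batch size 1, throughput is simply iterations divided by elapsed seconds. The helper name is hypothetical; the numbers are taken from the log above.

```python
def throughput_fps(iterations, duration_ms, batch_size=1):
    """Frames per second from an iteration count and a duration in milliseconds."""
    return iterations * batch_size / (duration_ms / 1000.0)


fps = throughput_fps(iterations=36872, duration_ms=15001.09)
print(f"{fps:.2f} FPS")
```

The result agrees with the reported 2457.96 FPS to within about 0.01 FPS; the small remaining difference presumably comes from the duration being printed with only two decimals.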