OpenVINO Series 1. Model Loading and Inference

破浪会有时

Column: OpenVINO case studies. Tags: openvino, machine learning

Article link: https://blog.csdn.net/zyctimes/article/details/124430369

Environment:

Operating system for this example: Win10
IDE: VSCode
OpenVINO version: 2022.1
Code link: the 1-openvino-basicworkflow folder

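1 Initialization

Before starting, we need to initialize the inference engine and look at the host's CPU/GPU information, which we use to choose the device for the inference that follows.

The inference engine can load a network onto a device. Here, a device means a CPU, an Intel GPU, a Neural Compute Stick 2, and so on. The ie.available_devices property lists the devices available on the system, and the FULL_DEVICE_NAME option of ie.get_property() returns a device's full name.

This example uses the CPU. To use the integrated GPU instead, set device_name="GPU". Note that loading a network on the GPU is slower than loading it on the CPU, but inference may be faster.

Running the code below prints all CPU and GPU devices on the host, for example: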

CPU: Intel® Core™ i5-10210U CPU @ 1.60GHz
GPU: Intel® UHD Graphics (iGPU)

from openvino.runtime import Core
ie = Core()
print("Initialize the inference engine with Core(): {}".format(ie))

print("Show the host's CPU/GPU device information.")
devices = ie.available_devices
for device in devices:
    device_name = ie.get_property(device_name=device, name="FULL_DEVICE_NAME")
    print(f"{device}: {device_name}")
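
The choice between CPU and GPU described above can also be made programmatically. The following is a minimal sketch, not part of the original example: pick_device is a hypothetical helper that takes a list of device names (such as the one returned by ie.available_devices) and returns the first match in a preference order. It assumes that multiple devices of one family may be listed with a numeric suffix such as "GPU.0".

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first available device that matches a preferred family.

    `available` is a list of device names, e.g. the value of
    `ie.available_devices`. A name such as "GPU.0" is matched by "GPU".
    """
    for family in preferred:
        for name in available:
            if name == family or name.startswith(family + "."):
                return name
    raise RuntimeError(f"none of {preferred} found in {available}")


# Hard-coded device lists for illustration; pass ie.available_devices in practice.
print(pick_device(["CPU", "GPU"]))  # GPU
print(pick_device(["CPU"]))         # CPU
```

The returned name can then be passed as device_name to compile_model().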

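2 Model loading

This section covers loading IR models, which OpenVINO supports natively, and converting an ONNX model to IR and then loading it.

After initializing the inference engine, first read the model file with read_model(), then compile it for the target device with compile_model().

2.1 IR model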

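An IR (Intermediate Representation) model consists of an .xml file, which describes the network topology, and a .bin file, which holds the weights and biases as binary data. The read_model() function reads IR models. The two files are normally kept in the same directory under the same file name.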
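
Because the code in this article passes only the .xml path to read_model(), it can help to confirm that the matching .bin file is in place before loading. The sketch below uses only the standard library; ir_weights_path and check_ir_pair are hypothetical helper names introduced here for illustration.

```python
from pathlib import Path


def ir_weights_path(xml_path):
    """Return the .bin path expected next to an IR .xml file."""
    xml_path = Path(xml_path)
    if xml_path.suffix != ".xml":
        raise ValueError(f"expected an .xml file, got {xml_path}")
    return xml_path.with_suffix(".bin")


def check_ir_pair(xml_path):
    """Return True when both the .xml file and its matching .bin file exist."""
    xml_path = Path(xml_path)
    return xml_path.is_file() and ir_weights_path(xml_path).is_file()


print(ir_weights_path("model/classification.xml").as_posix())  # model/classification.bin
```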

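Later articles in this series explain in detail how to convert TensorFlow, PyTorch, and ONNX models to the IR format. To export an ONNX model to IR with default settings, serialize() can also be used.

In this example, we first import a classification model that has already been converted to IR, then read and compile it.

Printing model gives: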

<Model: 'v3-small_224_1.0_float'
inputs[
<ConstOutput: names[input:0, input] shape{1,3,224,224} type: f32>
]
outputs[
<ConstOutput: names[MobilenetV3/Predictions/Softmax] shape{1,1001} type: f32>
]>

Printing compiled_model gives:

<CompiledModel:
inputs[
<ConstOutput: names[input:0, input] shape{1,3,224,224} type: f32>
]
outputs[
<ConstOutput: names[MobilenetV3/Predictions/Softmax] shape{1,1001} type: f32>
]>

This shows the inputs and outputs of the compiled model. The code is as follows:

from openvino.runtime import Core

ie = Core()
classification_model_xml = "model/classification.xml"

model = ie.read_model(model=classification_model_xml)
compiled_model = ie.compile_model(model=model, device_name="CPU")

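2.2 ONNX model

An ONNX model is a single file. Reading and loading an ONNX model works the same way as for an IR model: the model argument points to the ONNX file. serialize() can then export the ONNX model to IR. The code is as follows: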

from openvino.runtime import Core
from openvino.offline_transformations import serialize

ie = Core()
onnx_model_path = "model/segmentation.onnx"
model_onnx = ie.read_model(model=onnx_model_path)
compiled_model_onnx = ie.compile_model(model=model_onnx, device_name="CPU")
# Export the ONNX model to IR with serialize()
serialize(model=model_onnx, model_path="model/exported_onnx_model.xml", weights_path="model/exported_onnx_model.bin")

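2.3 Inspecting model inputs and outputs

The Model instance returned by read_model() stores information about the model. Information about its inputs and outputs is in model.inputs and model.outputs. These are also attributes of the CompiledModel instance, so although the code below uses model.inputs and model.outputs, compiled_model.inputs and compiled_model.outputs work as well.

Model input: the code below shows information about the input layer of the loaded model. With a different model, the input layer name may differ and there may be more than one input. Here we refer to the first input layer, model.input(0). Running the code shows that the model expects an input of shape [1,3,224,224] in NCHW layout: a batch size (N) of 1, 3 channels (C), and an image height (H) and width (W) of 224. The input data has FP32 (floating point) precision.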

from openvino.runtime import Core

ie = Core()
classification_model_xml = "model/classification.xml"
model = ie.read_model(model=classification_model_xml)
#model.input(0).any_name
input_layer = model.input(0)
print('Model Input Info')
print(input_layer)
print(f"input precision: {input_layer.element_type}")
print(f"input shape: {input_layer.shape}")

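Model output: output information is stored in model.outputs. The model printed earlier returns one output, named MobilenetV3/Predictions/Softmax. With a different model, the output layer names and the number of outputs may differ. The printed output information shows that the model returns shape [1, 1001], where 1 is the batch size (N) and 1001 is the number of classes (C). The output data has FP32 (floating point) precision.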

from openvino.runtime import Core

ie = Core()
classification_model_xml = "model/classification.xml"
model = ie.read_model(model=classification_model_xml)
#print(model.output(0).any_name)
output_layer = model.output(0)
print('Model Output Info')
print(output_layer)
print(f"output precision: {output_layer.element_type}")
print(f"output shape: {output_layer.shape}")
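
To make the meaning of each dimension explicit, a shape can be paired with its layout letters. This is a small sketch in plain Python; describe_shape is a hypothetical helper, and the layout string is supplied by the reader rather than queried from the model. Since the shape can be iterated, list(input_layer.shape) can be passed in place of the literal lists used here.

```python
def describe_shape(shape, layout):
    """Pair each dimension of `shape` with its letter in `layout`."""
    if len(shape) != len(layout):
        raise ValueError("shape and layout must have the same length")
    return dict(zip(layout, shape))


print(describe_shape([1, 3, 224, 224], "NCHW"))  # {'N': 1, 'C': 3, 'H': 224, 'W': 224}
print(describe_shape([1, 1001], "NC"))           # {'N': 1, 'C': 1001}
```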

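3 Model inference

To run inference on a model, first create an inference request by calling create_infer_request(), a method of CompiledModel; compiled_model is the object we obtained from compile_model(). Then call infer(), a method of InferRequest, which takes one argument, inputs: a dictionary that maps input layer names to input data.

Step 1: Load the model. We read the model with ie.read_model and compile it with ie.compile_model.
Step 2: Load the image and convert it to the input shape. To propagate an image through the network, it has to be loaded into an array, resized to the shape the network expects, and converted to the network's input layout. In this example the image has shape (663,994,3): it is 663 pixels high, 994 pixels wide, and has 3 color channels. We read the height and width the network expects from the input layer and resize the image to that size. Finally, we change the image to N, C, H, W format (with N=1) by first calling np.transpose() to get C, H, W, then calling np.expand_dims() to add the N dimension, and we convert the data to FP32 with the array's astype() method.
Step 3: Run inference. The result can be obtained directly with compiled_model([input_data])[output_layer]. Alternatively, we can create an InferRequest and run its infer method. The output result has shape (1,1001), which matches the shape of output_layer and indicates that the network returns probabilities for 1001 classes. Finally, we find the corresponding class with argmax.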
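
The layout conversion in step 2 can be checked in isolation. The sketch below assumes only NumPy and uses a zero-filled array as a stand-in for the resized image, so it runs without OpenCV or a model file.

```python
import numpy as np

# Dummy H x W x C image standing in for the output of cv2.resize (assumption:
# the image has already been resized to the 224 x 224 the network expects).
resized_image = np.zeros((224, 224, 3), dtype=np.uint8)

chw = np.transpose(resized_image, (2, 0, 1))  # H,W,C -> C,H,W
nchw = np.expand_dims(chw, 0)                 # add the batch dimension N=1
input_data = nchw.astype(np.float32)          # uint8 -> FP32

print(chw.shape)         # (3, 224, 224)
print(input_data.shape)  # (1, 3, 224, 224)
print(input_data.dtype)  # float32
```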

from openvino.runtime import Core
import cv2
import numpy as np

print("1 load network")
ie = Core()
classification_model_xml = "model/classification.xml"
model = ie.read_model(model=classification_model_xml)
compiled_model = ie.compile_model(model=model, device_name="CPU")
input_layer = compiled_model.input(0)
output_layer = compiled_model.output(0)
print("- model input: {}".format(input_layer))
print("- model output: {}".format(output_layer))

print("2 load image")
image_filename = "data/coco_hollywood.jpg"
image = cv2.imread(image_filename)
print("- input image shape: {}".format(image.shape))
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = input_layer.shape
# OpenCV resize expects the destination size as (width, height)
resized_image = cv2.resize(src=image, dsize=(W, H))
print("- resize image into shape: {}".format(resized_image.shape))
input_data = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0).astype(np.float32)
print("- align image shape same as network input: {}".format(input_data.shape))

print("3 inference")
result = compiled_model([input_data])[output_layer]
print("- Inference result example, result[0]：{}".format(result[0]))
print("- Inference result shape: {}".format(result.shape))
print("- Inference result[0] shape: {}".format(result[0].shape))
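# We can also create an `InferRequest` and run the `infer` method on the request.
# `.infer()` fills the output tensors, which we can read with `get_output_tensor()`. Since this network
# returns a single output and `output_layer.index` holds that output's index, we can call
# `request.get_output_tensor(output_layer.index)` to get the data. Use `.data` to get a numpy array from the output.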
request = compiled_model.create_infer_request()
request.infer(inputs={input_layer.any_name: input_data})
result = request.get_output_tensor(output_layer.index).data

result_index = np.argmax(result)
print("- Inference result is classified as class index: {}".format(result_index))
# Convert the inference result to a class name
with open("utils/imagenet_2012.txt") as f:
    imagenet_classes = f.read().splitlines()
# The model description states that class 0 (background) must be added.
imagenet_classes = ['background'] + imagenet_classes
print("- Inference result is classified as class: {}".format(imagenet_classes[result_index]))
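
argmax returns only the single most likely class. When the runner-up classes are also of interest, the same result array can be ranked. The following is a minimal sketch assuming NumPy; top_k is a hypothetical helper, and the scores and labels are made up for illustration rather than taken from the model above.

```python
import numpy as np


def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = np.asarray(probabilities).reshape(-1)  # flatten (1, C) to (C,)
    order = np.argsort(probs)[::-1][:k]            # indices by descending score
    return [(labels[i], float(probs[i])) for i in order]


# Made-up scores and labels for illustration only (not output of the model above)
demo_probs = np.array([[0.1, 0.6, 0.05, 0.25]])
demo_labels = ["background", "cat", "dog", "bird"]
for label, prob in top_k(demo_probs, demo_labels, k=2):
    print(f"{label}: {prob:.2f}")
# cat: 0.60
# bird: 0.25
```

With the variables from the code above, the call would be top_k(result, imagenet_classes).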

Output after running:

1 load network
- model input: <ConstOutput: names[input:0, input] shape{1,3,224,224} type: f32>
- model output: <ConstOutput: names[MobilenetV3/Predictions/Softmax] shape{1,1001} type: f32>
2 load image
- input image shape: (663, 994, 3)
- resize image into shape: (224, 224, 3)
- align image shape same as network input: (1, 3, 224, 224)
3 inference
- Inference result example, result[0]：[1.9758532e-04 5.8727856e-05 6.4592139e-05 ... 4.0716583e-05 1.7331535e-04
 1.3031592e-04]
- Inference result shape: (1, 1001)
- Inference result[0] shape: (1001,)
- Inference result is classified as class index: 198
- Inference result is classified as class: n02097130 giant schnauzer