OpenVINO Series 17: The OpenVINO Preprocessing API, with a Comparison Against OpenCV Preprocessing

In this article, we take a detailed look at the OpenVINO Preprocessing API and compare its results against OpenCV-based preprocessing. The example covers:

  • Loading an ONNX model obtained via transfer learning
  • Converting the ONNX model to the OpenVINO IR (Intermediate Representation) format
  • Loading images with OpenCV, preprocessing them, and running model inference; measuring the model's FPS and accuracy
  • Checking the IR model's performance with the benchmark_app command
  • Integrating the preprocessing steps into the IR with the OpenVINO Preprocessing API and saving the result
  • Comparing OpenVINO preprocessing against OpenCV preprocessing in both speed and accuracy
  • Checking the performance of the preprocessing-integrated IR model with benchmark_app

Environment:

  • Test machine: Windows 10, 10th-gen Intel Core i5 laptop
  • IDE: VSCode
  • OpenVINO version: 2022.1
  • Code link: 12-preprocessing


1 Introduction to the OpenVINO Preprocessing API

Before OpenVINO 2022.1, the OpenVINO Runtime provided no native API functions for data preprocessing; developers had to rely on third-party libraries such as OpenCV. The preprocessing API introduced in OpenVINO 2022.1 integrates all preprocessing steps into the execution graph, so preprocessing can run on an iGPU, CPU, VPU, or future Intel discrete GPUs, which greatly improves efficiency. By contrast, OpenCV-based preprocessing can only run on the CPU, as shown below:

(Figure: with the OpenVINO Preprocessing API, preprocessing runs inside the execution graph on the CPU, iGPU, or VPU; OpenCV preprocessing runs only on the CPU.)

2 Loading the ONNX Model and Converting It to an IR Model

The model used here, mobileNetV2-lego-minifigures.onnx, was obtained by transfer learning from MobileNetV2; the pretrained MobileNetV2 weights come from torch hub. For the transfer-training details, see the blog post MobileNetV2 PyTorch Lightning LEGO Minifigures image classification example.

All of these pretrained models expect input images prepared the same way: 3-channel RGB images of shape (3 x H x W), where H and W are 224. Pixel values must first be scaled to the range [0, 1] and then normalized with mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
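
Numerically, this is equivalent to the following NumPy sketch (illustrative only; img stands for an H x W x 3 uint8 RGB array, which is an assumption here):

import numpy as np

mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
x = img.astype(np.float32) / 255.0   # scale pixels to [0, 1]
x = (x - mean) / std                 # per-channel normalization (broadcast over H, W)
x = x.transpose(2, 0, 1)[None]       # HWC -> NCHW, shape (1, 3, H, W)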

First, we need to convert the ONNX file into an IR model with OpenVINO's mo command: mo --framework=onnx --input_model "model\mobileNetV2-lego-minifigures.onnx" --input_shape "[1,3,224,224]" --data_type FP32 --output_dir "model"

The mo command actually accepts quite a few more options. Although we do not use them in this example, they are worth a brief introduction:

Half-precision model (FP16): during conversion we can produce an FP16-precision IR model via the --data_type option, e.g.: mo --input_model INPUT_MODEL --data_type FP16. A half-precision model should be about half the size of the full-precision model; it may lose some accuracy, although for most models the drop is negligible.

Setting the layout: the layout describes the order of the model's shape dimensions. You can specify both the layout of the input model and the layout of the IR produced by conversion, e.g.: mo --input_model tf_nasnet_large.onnx --layout "nhwc->nchw", or define just a single layout: mo --input_model tf_nasnet_large.onnx --layout nhwc

Setting mean and scale: neural network models are usually trained on normalized input data, meaning input values are mapped into a specific range such as [0, 1] or [-1, 1]. Sometimes, as part of preprocessing, the mean is also subtracted from the input values. Input preprocessing can be implemented in two ways:

  • The preprocessing operations are part of the model. In this case the application does not preprocess the input data as a separate step: everything is embedded in the model itself.
  • The preprocessing operations are not part of the model, and preprocessing is performed by the application that feeds input data to the model (the situation in this example).

In the first case, the Model Optimizer generates an IR that already contains the required preprocessing operations, and no mean/scale parameters are needed. In the second case, the mean and scale values should be passed to the Model Optimizer so they get embedded in the generated IR. The following command-line parameters are available:

  • --mean_values
  • --scale_values
  • --scale

An example: mo --input_model unet.pdmodel --mean_values [123,117,104] --scale 255

Reversing input channels: sometimes your application's input images are in RGB (or BGR) format while the model was trained on BGR (or RGB) images, i.e., with the opposite channel order. In that case it is important to preprocess the input images by restoring the expected channel order before inference. To embed this step into the IR, the Model Optimizer provides the --reverse_input_channels command-line parameter, as shown below.
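
For example: mo --input_model INPUT_MODEL --reverse_input_channels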

In this example, we did not embed normalization at model-conversion time; instead, we apply the corresponding preprocessing after loading each image.

The code is as follows:

import os
from pathlib import Path
from IPython.display import Markdown, display

MODEL_DIR_IR = 'model/FP16'
MODEL_DIR_ONNX = 'model'
MODEL_NAME = 'mobileNetV2-lego-minifigures'

print("0 Load the original ONNX model and convert it into IR format.")
os.makedirs(MODEL_DIR_IR, exist_ok=True)
onnx_model_path = Path(MODEL_DIR_ONNX) / '{}.onnx'.format(MODEL_NAME)
ir_model_path = Path(MODEL_DIR_IR) / '{}.xml'.format(MODEL_NAME)
print("- Load ONNX model {}".format(onnx_model_path))
ir_path = Path(ir_model_path).with_suffix(".xml")

# Convert this model into the OpenVINO IR using the Model Optimizer:
mo_command = f"""mo
                 --framework=onnx
                 --input_model "{onnx_model_path}"
                 --input_shape "[1,3,224,224]"
                 --data_type FP16 
                 --output_dir "{MODEL_DIR_IR}"
                 """
mo_command = " ".join(mo_command.split())
print("- Model Optimizer command to convert ONNX to OpenVINO:")
display(Markdown(f"`{mo_command}`"))

# Run Model Optimizer if the IR model file does not exist
if not ir_path.exists():
    print("Exporting ONNX model to IR... This may take a few minutes.")
    ! $mo_command
else:
    print(f"- IR model {ir_path} converted. Already exists.")

We can convert the ONNX model into either an FP16 or an FP32 IR model; the converted models are saved under ./model/FP16/ and ./model/FP32/, respectively.

Terminal output:

0 Load the original ONNX model and convert it into IR format.
- Load ONNX model model\mobileNetV2-lego-minifigures.onnx
- Model Optimizer command to convert ONNX to OpenVINO:
mo --framework=onnx --input_model "model\mobileNetV2-lego-minifigures.onnx" --input_shape "[1,3,224,224]" --data_type FP16 --output_dir "model/FP16"

- IR model model\FP16\mobileNetV2-lego-minifigures.xml converted. Already exists.

3 Loading the IR Model

Next, we load the IR model. The code is as follows:

print("1 Load the IR model.")
ie = Core()
model = ie.read_model(model=ir_path)
compiled_model = ie.compile_model(model=model, device_name="CPU")
input_layer_ir = compiled_model.input(0)
output_layer_ir = compiled_model.output(0)
print("- Input layer info: {}".format(input_layer_ir))
print("- Output layer info: {}".format(output_layer_ir))

Terminal output:

1 Load the IR model.
- Input layer info: <ConstOutput: names[input.1] shape{1,3,224,224} type: f32>
- Output layer info: <ConstOutput: names[466] shape{1,37} type: f32>

Note that this example has 37 labels: ['SPIDER-MAN', 'VENOM', 'AUNT MAY', 'GHOST SPIDER', 'YODA', 'LUKE SKYWALKER', 'R2-D2', 'MACE WINDU', 'GENERAL GRIEVOUS', 'KYLO REN', 'THE MANDALORIAN', 'CARA DUNE', 'KLATOOINIAN RAIDER 1', 'KLATOOINIAN RAIDER 2', 'MYSTERIO', 'FIREFIGHTER', 'SPIDER-MAN', 'HARRY POTTER', 'RON WEASLEY', 'BLACK WIDOW', 'YELENA BELOVA', 'TASKMASTER', 'CAPTAIN AMERICA', 'OUTRIDER 1', 'OUTRIDER 2', 'OWEN GRADY', 'TRACKER TRAQUEUR RASTREADOR', 'IRON MAN MK 1', 'IRON MAN MK 5', 'IRON MAN MK 41', 'IRON MAN MK 50', 'JANNAH', 'HAN SOLO', 'DARTH VADER', 'ANAKIN SKYWALKER', 'EMPEROR PALPATINE', 'OBI-WAN KENOBI'].

4 Running Inference on a Single Image

First, we run IR-model inference on a single image, to explain each step:

  • Load the image (cv2.imread(BASE_DIR + img_name))
  • Preprocess the image with OpenCV:
    • Convert BGR to RGB (cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    • Resize the image to match the IR model's input height and width (cv2.resize(image, (W, H)))
    • Normalize the image with the mean and std (torchvision.transforms.Compose)
    • Transpose the dimensions so the result exactly matches the IR model's input shape
  • Run inference (compiled_model([input_image])[output_layer_ir])
  • Pick the most likely class (np.argmax(boxes))
  • Compare against the ground truth

The code is as follows:

print("First let's test one image with IR model.")
# The base dataset directory
BASE_DIR = './archive/'
img_name = 'test/001.jpg'
df_metadata = pd.read_csv(os.path.join(BASE_DIR, 'metadata.csv'), index_col=0)
df_groundtruth = pd.read_csv(os.path.join(BASE_DIR, 'test.csv'), index_col=0)
df_groundtruth_result = df_groundtruth[df_groundtruth.index == img_name]
df_groundtruth_imageNames = df_groundtruth.index.to_list()
N_CLASSES = df_metadata.shape[0]
print('- Number of classes: ', N_CLASSES)

# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = input_layer_ir.shape
print("2 Load the image, and reshape to the same size as model input.")
# OpenCV loads images in BGR format
image = cv2.imread(BASE_DIR+img_name)
print("- Image original shape: {0}".format(image.shape))
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
print("- First convert from BGR to RGB.")
# Resize image to meet network expected input sizes
image = cv2.resize(image, (W, H))
print("- Second convert image size to {}".format(image.shape))
transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])
# ToTensor() takes a PIL image (or np.uint8 NumPy array) with shape (n_rows, n_cols, n_channels) as input and returns a PyTorch tensor with floats between 0 and 1 and shape (n_channels, n_rows, n_cols)
# Normalize() subtracts the mean and divides by the standard deviation of the floating point values in the range [0, 1].
normalized_img = transform(image)
normalized_img = normalized_img.numpy()
print("- Third, image shape after torchvision normalization is: {}".format(normalized_img.shape))
normalized_img = normalized_img.transpose(1, 2, 0)
print("- image shape is transposed into {}".format(normalized_img.shape))
# Reshape to network input shape
input_image = np.expand_dims(normalized_img.transpose(2, 0, 1), 0)
#plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
print("- image shape finally: {0}".format(input_image.shape))

print("3 Inference.")
# Run inference; the output is a (1, 37) array of class scores
boxes = compiled_model([input_image])[output_layer_ir]
result_index = np.argmax(boxes)
print("- Shape of inference result: {0}".format(boxes.shape))
labels_names = df_metadata['minifigure_name'].tolist()
labels_names_idx = df_metadata.index
label_groundtruth = df_groundtruth_result['class_id'][0]
print("- Final Predict classification result: {0}".format(labels_names_idx[result_index]))
print("- Final Ground Truth classification result: {0}".format(label_groundtruth))

Terminal output:

First let's test one image with IR model.
- Number of classes:  37
2 Load the image, and reshape to the same size as model input.
- Image original shape: (512, 512, 3)
- First convert from BGR to RGB.
- Second convert image size to (224, 224, 3)
- Third, image shape after torchvision normalization is: (3, 224, 224)
- image shape is transposed into (224, 224, 3)
- image shape finally: (1, 3, 224, 224)
3 Inference.
- Shape of inference result: (1, 37)
- Final Predict classification result: 32
- Final Ground Truth classification result: 32

5 Measuring the Model's Accuracy and FPS

Next, we run inference over the entire test set and measure the model's accuracy and FPS. Note that, unlike benchmark_app below, this measurement includes image loading and OpenCV preprocessing and uses synchronous single-image requests, so the two FPS figures are not directly comparable.

The code is as follows:

import time
# The base dataset directory
BASE_DIR = './archive/'
df_metadata = pd.read_csv(os.path.join(BASE_DIR, 'metadata.csv'), index_col=0)
df_groundtruth = pd.read_csv(os.path.join(BASE_DIR, 'test.csv'), index_col=0)
df_groundtruth_imageNames = df_groundtruth.index.to_list()
labels_names = df_metadata['minifigure_name'].tolist()
labels_names_idx = df_metadata.index
num_iter = len(df_groundtruth_imageNames)
N_CLASSES = df_metadata.shape[0]
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = input_layer_ir.shape

results = []

st = time.time()
for img_name in df_groundtruth_imageNames:
    # OpenCV loads images in BGR format
    image = cv2.imread(BASE_DIR+img_name)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize image to meet network expected input sizes
    image = cv2.resize(image, (W, H))
    transform = torchvision.transforms.Compose([
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225],
        ),
    ])
    # ToTensor() takes a PIL image (or np.uint8 NumPy array) with shape (n_rows, n_cols, n_channels) as input and returns a PyTorch tensor with floats between 0 and 1 and shape (n_channels, n_rows, n_cols)
    # Normalize() subtracts the mean and divides by the standard deviation of the floating point values in the range [0, 1].
    normalized_img = transform(image)
    normalized_img = normalized_img.numpy()
    normalized_img = normalized_img.transpose(1, 2, 0)
    # Reshape to network input shape
    input_image = np.expand_dims(normalized_img.transpose(2, 0, 1), 0)
    # Run inference and take the highest-scoring class
    boxes = compiled_model([input_image])[output_layer_ir]
    result_index = np.argmax(boxes)
    label_ = labels_names_idx[result_index]
    df_groundtruth_result = df_groundtruth[df_groundtruth.index == img_name]
    label_groundtruth = df_groundtruth_result['class_id'][0]
    if label_ != label_groundtruth:
        results.append(1)
    else:
        results.append(0)

et = time.time()
elapsed_time = et - st
print('Execution time: {} seconds for {} images'.format(elapsed_time, num_iter))
print('FPS: {}'.format(int(num_iter/elapsed_time)))
print('Model accuracy: {}'.format(1-sum(results)/len(results)))

6 Testing IR Model Performance with benchmark_app

To measure the inference performance of the FP16 IR model, we use OpenVINO's Benchmark Tool, which can be run in a notebook with !benchmark_app or %sx benchmark_app.

Note: for the most accurate performance estimate, we recommend running benchmark_app in a terminal/command prompt after closing other applications. Run benchmark_app --help to see all command-line options.

!benchmark_app -m $ir_path -d CPU -api async -t 15 -b 1
# -t sets the run duration in seconds, -b the batch size

Terminal output:

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading OpenVINO
[ WARNING ] PerformanceMode was not explicitly specified in command line. Device CPU performance hint will be set to THROUGHPUT.
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read model took 56.00 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'input.1' precision u8, dimensions ([N,C,H,W]): 1 3 224 224
[ INFO ] Model output '466' precision f32, dimensions ([...]): 1 37
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 125.00 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: CPU
[ INFO ]   AVAILABLE_DEVICES  , ['']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 1, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 8)
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'FP16', 'INT8', 'BIN', 'EXPORT_IMPORT']
[ INFO ]   CACHE_DIR  , 
[ INFO ]   NUM_STREAMS  , 4
[ INFO ]   INFERENCE_NUM_THREADS  , 0
[ INFO ]   PERF_COUNT  , False
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 4 infer requests took 0.00 ms
[ WARNING ] No input files were given for input 'input.1'!. This input will be filled with random values!
[ INFO ] Fill input 'input.1' with random values 
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests using 4 streams for CPU, inference only: True, limits: 15000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 7.26 ms
[Step 11/11] Dumping statistics report
Count:          4072 iterations
Duration:       15013.77 ms
Latency:
    Median:     14.21 ms
    AVG:        14.65 ms
    MIN:        9.65 ms
    MAX:        52.41 ms
Throughput: 271.22 FPS

7 Integrating and Saving Preprocessing into the IR with the OpenVINO Preprocessing API

For many applications it is also important to minimize the model read/load time, so re-applying the preprocessing integration on every application start, after ov::runtime::Core::read_model, may be inconvenient. In such cases it is useful to store the new execution model, with the pre- and post-processing steps added, as an Intermediate Representation (IR, .xml format).

7.1 Precision adjustment

Two sides need to be declared: the input image is represented as an array of unsigned 8-bit integer values, while the model accepts a floating-point tensor. (The preprocessor dump later in this section also shows an implicit final conversion to f32, because the model itself expects f32 input.)

# The input image is an array of unsigned 8-bit integers
ppp.input().tensor().set_element_type(Type.u8)
# Execution-graph preprocessing: convert to a floating-point tensor
ppp.input().preprocess().convert_element_type(Type.f16)

7.2 Layout adjustment

Two sides: the input image layout is [1,512,512,3], i.e. [NHWC], while the model's input shape is [1,3,224,224], i.e. [NCHW]. So both need to be declared.

# Declare the input image layout as 'NHWC'
ppp.input().tensor().set_layout(Layout('NHWC'))
# The model's input layout is 'NCHW'
ppp.input().model().set_layout(Layout('NCHW'))
# Execution-graph preprocessing: convert 'NHWC' to 'NCHW'
ppp.input().preprocess().convert_layout([0, 3, 1, 2])

7.3 Height and width resizing

Two sides: in this example all input images have the same size, [512,512,3], while the model's input shape is [1,3,224,224]. So the image height and width need to be resized.

# Declare the input image shape, in 'NHWC' order
ppp.input().tensor().set_shape([1,512,512,3])
# If the exact input image size is not known in advance, use set_spatial_dynamic_shape
# ppp.input().tensor().set_spatial_dynamic_shape()
# Execution-graph preprocessing: resize the input image
ppp.input().preprocess().resize(ResizeAlgorithm.RESIZE_LINEAR, 224, 224)

7.4 Color adjustment

Two sides: images loaded with OpenCV are in BGR format, while the model expects RGB images.

# Declare the color format of the input image
ppp.input().tensor().set_color_format(ColorFormat.BGR)
# Execution-graph preprocessing: convert the input image from BGR to RGB
ppp.input().preprocess().convert_color(ColorFormat.RGB)
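
7.5 Mean and scale adjustment

The full code below also applies mean subtraction and scaling. The torchvision pipeline normalizes pixels after scaling them to [0, 1], whereas the PPP pipeline operates on the original [0, 255] values, so the torchvision mean and std must both be multiplied by 255. A minimal arithmetic check (a sketch; the resulting values are exactly those passed to .mean() and .scale() below):

# torchvision constants, defined for pixels scaled to [0, 1]
mean_01 = [0.485, 0.456, 0.406]
std_01 = [0.229, 0.224, 0.225]
# PPP sees raw [0, 255] pixels, so multiply both by 255
print([round(255 * m, 3) for m in mean_01])  # [123.675, 116.28, 103.53]
print([round(255 * s, 3) for s in std_01])   # [58.395, 57.12, 57.375]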

The full code is as follows:

from openvino.preprocess import PrePostProcessor, ColorFormat, ResizeAlgorithm
from openvino.runtime import Core, Layout, Type, set_batch
from openvino.runtime.passes import Manager

# ========  Step 0: read original model =========
core = Core()
model = core.read_model(model=onnx_model_path)

# ======== Step 1: Preprocessing ================
ppp = PrePostProcessor(model)
# Declare section of desired application's input format
# This declares the application's input image format: images loaded with OpenCV have 0-255 pixel values, [height, width, channel] layout, and BGR channel order.
ppp.input().tensor() \
    .set_element_type(Type.u8) \
    .set_shape([1, 512, 512, 3]) \
    .set_layout(Layout('NHWC')) \
    .set_color_format(ColorFormat.BGR)
# use set_spatial_dynamic_shape if we are not sure the exact size of input image.

# Specify actual model layout
ppp.input().model().set_layout(Layout('NCHW'))

# Explicit preprocessing steps. Layout conversion will be done automatically as last step
ppp.input().preprocess() \
    .convert_element_type(Type.f16) \
    .convert_color(ColorFormat.RGB) \
    .convert_layout([0, 3, 1, 2]) \
    .resize(ResizeAlgorithm.RESIZE_LINEAR, 224, 224) \
    .mean([123.675, 116.28, 103.53]) \
    .scale([58.395, 57.12, 57.375])

# Dump preprocessor
print(f'Dump preprocessor: {ppp}')
model = ppp.build()

# ======== Step 2: Change batch size ================
# Optionally change the batch size (kept at 1 in this example)
set_batch(model, 1)

# ======== Step 3: Save the model ================
pass_manager = Manager()
MODEL_DIR_IR_PP = 'model/FP16PP'
MODEL_NAME = 'mobileNetV2-lego-minifigures'
os.makedirs(MODEL_DIR_IR_PP, exist_ok=True)
irpp_model_path_xml = '{}/{}.xml'.format(MODEL_DIR_IR_PP,MODEL_NAME)
irpp_model_path_bin = '{}/{}.bin'.format(MODEL_DIR_IR_PP,MODEL_NAME)
pass_manager.register_pass(pass_name="Serialize",
                           xml_path=irpp_model_path_xml,
                           bin_path=irpp_model_path_bin)
pass_manager.run_passes(model)
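
Note: as an alternative to the Manager/Serialize pass, OpenVINO 2022.1 also exposes a one-line helper that should be equivalent (worth verifying against your installed version):

from openvino.runtime import serialize
serialize(model, irpp_model_path_xml, irpp_model_path_bin)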

Terminal output:

Dump preprocessor: Input "input.1" (color BGR):
User's input tensor: {1,512,512,3}, [N,H,W,C], u8
Model's expected tensor: {1,3,224,224}, [N,C,H,W], f32
Pre-processing steps (6):
    convert type (f16): ({1,512,512,3}, [N,H,W,C], u8, BGR) -> ({1,512,512,3}, [N,H,W,C], f16, BGR)
    convert color (RGB): ({1,512,512,3}, [N,H,W,C], f16, BGR) -> ({1,512,512,3}, [N,H,W,C], f16, RGB)
    convert layout (0,3,1,2): ({1,512,512,3}, [N,H,W,C], f16, RGB) -> ({1,3,512,512}, [N,C,H,W], f16, RGB)
    resize to (224, 224): ({1,3,512,512}, [N,C,H,W], f16, RGB) -> ({1,3,224,224}, [N,C,H,W], f16, RGB)
    mean (123.675,116.28,103.53): ({1,3,224,224}, [N,C,H,W], f16, RGB) -> ({1,3,224,224}, [N,C,H,W], f16, RGB)
    scale (58.395,57.12,57.375): ({1,3,224,224}, [N,C,H,W], f16, RGB) -> ({1,3,224,224}, [N,C,H,W], f16, RGB)
Implicit pre-processing steps (1):
    convert type (f32): ({1,3,224,224}, [N,C,H,W], f16, RGB) -> ({1,3,224,224}, [N,C,H,W], f32, RGB)

8 Loading the IR Model with Integrated OpenVINO Preprocessing

print("Load the IR model after OpenVINO PreProcessing API Integration.")
ie = Core()
modelpp = ie.read_model(model=irpp_model_path_xml)
compiled_modelpp = ie.compile_model(model=modelpp, device_name="CPU")
input_layer_irpp = compiled_modelpp.input(0)
output_layer_irpp = compiled_modelpp.output(0)
print("- Input layer info: {}".format(input_layer_irpp))
print("- Output layer info: {}".format(output_layer_irpp))

Terminal output:

Load the IR model after OpenVINO PreProcessing API Integration.
- Input layer info: <ConstOutput: names[input.1] shape{1,512,512,3} type: u8>
- Output layer info: <ConstOutput: names[466] shape{1,37} type: f32>
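
With preprocessing embedded in the graph, running the model on a raw image no longer requires any OpenCV preprocessing. As a quick sanity check on the test image from section 4 (a minimal sketch; the predicted index should again be 32):

# The PP model consumes a raw BGR uint8 image of shape (1, 512, 512, 3) directly
image = cv2.imread('./archive/test/001.jpg')    # (512, 512, 3), BGR, uint8
input_image = np.expand_dims(image, 0)          # add the batch dimension
scores = compiled_modelpp([input_image])[output_layer_irpp]
print(np.argmax(scores))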

9 Measuring Accuracy and FPS (Execution-Graph Preprocessing)

Next, we again run the entire test set and measure accuracy and FPS, this time comparing OpenCV preprocessing against OpenVINO execution-graph preprocessing (PP).

Overall, benchmark_app reports 271.22 FPS for the IR model without PP and 183.83 FPS with PP. End to end, however, the OpenCV preprocessing path costs more time (in one run: CV execution time 1.0087604522705078 seconds for 76 images, 75 FPS; PP execution time 0.7720448970794678 seconds for 76 images, 98 FPS; the run shown below gives 77 vs. 86 FPS). Roughly, adding PP to the IR model saves about 3 ms per frame. The exact saving depends on the CPU or GPU and on the specific workload: on a CPU the difference is modest, but on a GPU it is pronounced.

import time
# The base dataset directory
BASE_DIR = './archive/'
df_metadata = pd.read_csv(os.path.join(BASE_DIR, 'metadata.csv'), index_col=0)
df_groundtruth = pd.read_csv(os.path.join(BASE_DIR, 'test.csv'), index_col=0)
df_groundtruth_imageNames = df_groundtruth.index.to_list()
labels_names = df_metadata['minifigure_name'].tolist()
labels_names_idx = df_metadata.index
num_iter = len(df_groundtruth_imageNames)
N_CLASSES = df_metadata.shape[0]
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = input_layer_ir.shape

'''
Preprocessing with OpenCV
'''
results = []
st = time.time()
for img_name in df_groundtruth_imageNames:
    # OpenCV loads images in BGR format
    image = cv2.imread(BASE_DIR+img_name)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize image to meet network expected input sizes
    image = cv2.resize(image, (W, H))
    transform = torchvision.transforms.Compose([
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225],
        ),
    ])
    # ToTensor() takes a PIL image (or np.uint8 NumPy array) with shape (n_rows, n_cols, n_channels) as input and returns a PyTorch tensor with floats between 0 and 1 and shape (n_channels, n_rows, n_cols)
    # Normalize() subtracts the mean and divides by the standard deviation of the floating point values in the range [0, 1].
    normalized_img = transform(image)
    normalized_img = normalized_img.numpy()
    normalized_img = normalized_img.transpose(1, 2, 0)
    # Reshape to network input shape
    input_image = np.expand_dims(normalized_img.transpose(2, 0, 1), 0)
    # Run inference and take the highest-scoring class
    boxes = compiled_model([input_image])[output_layer_ir]
    result_index = np.argmax(boxes)
    label_ = labels_names_idx[result_index]
    df_groundtruth_result = df_groundtruth[df_groundtruth.index == img_name]
    label_groundtruth = df_groundtruth_result['class_id'][0]
    if label_ != label_groundtruth:
        results.append(1)
    else:
        results.append(0)

et = time.time()
elapsed_time = et - st
print('CV Execution time: {} seconds for {} images'.format(elapsed_time, num_iter))
print('CV FPS: {}'.format(int(num_iter/elapsed_time)))
print('CV Model accuracy: {}'.format(1-sum(results)/len(results)))


'''
Preprocessing with OpenVINO execution-graph preprocessing
'''

results = []
st = time.time()
for img_name in df_groundtruth_imageNames:
    # The PP model consumes the raw BGR uint8 image directly
    image = cv2.imread(BASE_DIR+img_name)
    image = np.expand_dims(image, 0)
    # Run inference directly on the raw image; preprocessing happens inside the graph
    boxes = compiled_modelpp([image])[output_layer_irpp]
    result_index = np.argmax(boxes)
    label_ = labels_names_idx[result_index]
    df_groundtruth_result = df_groundtruth[df_groundtruth.index == img_name]
    label_groundtruth = df_groundtruth_result['class_id'][0]
    if label_ != label_groundtruth:
        results.append(1)
    else:
        results.append(0)

et = time.time()
elapsed_time = et - st
print('PP Execution time: {} seconds for {} images'.format(elapsed_time, num_iter))
print('PP FPS: {}'.format(int(num_iter/elapsed_time)))
print('PP Model accuracy: {}'.format(1-sum(results)/len(results)))

Terminal output:

CV Execution time: 0.9832742214202881 seconds for 76 images
CV FPS: 77
CV Model accuracy: 0.8289473684210527
PP Execution time: 0.882225751876831 seconds for 76 images
PP FPS: 86
PP Model accuracy: 0.8289473684210527

10 Testing the Performance of the Preprocessing-Integrated IR Model with benchmark_app

!benchmark_app -m $irpp_model_path_xml -d CPU -api async -t 15 -b 1

Terminal output:

[Step 1/11] Parsing and validating input arguments
[ WARNING ]  -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README. 
[Step 2/11] Loading OpenVINO
[ WARNING ] PerformanceMode was not explicitly specified in command line. Device CPU performance hint will be set to THROUGHPUT.
[ INFO ] OpenVINO:
         API version............. 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ INFO ] Device info
         CPU
         openvino_intel_cpu_plugin version 2022.1
         Build................... 2022.1.0-7019-cdb9bec7210-releases/2022/1

[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for CPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Read model took 68.01 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model input 'input.1' precision u8, dimensions ([N,H,W,C]): 1 512 512 3
[ INFO ] Model output '466' precision f32, dimensions ([...]): 1 37
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 168.00 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] DEVICE: CPU
[ INFO ]   AVAILABLE_DEVICES  , ['']
[ INFO ]   RANGE_FOR_ASYNC_INFER_REQUESTS  , (1, 1, 1)
[ INFO ]   RANGE_FOR_STREAMS  , (1, 8)
[ INFO ]   FULL_DEVICE_NAME  , Intel(R) Core(TM) i5-10210U CPU @ 1.60GHz
[ INFO ]   OPTIMIZATION_CAPABILITIES  , ['FP32', 'FP16', 'INT8', 'BIN', 'EXPORT_IMPORT']
[ INFO ]   CACHE_DIR  , 
[ INFO ]   NUM_STREAMS  , 4
[ INFO ]   INFERENCE_NUM_THREADS  , 0
[ INFO ]   PERF_COUNT  , False
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS  , 0
[Step 9/11] Creating infer requests and preparing input data
[ INFO ] Create 4 infer requests took 0.98 ms
[ WARNING ] No input files were given for input 'input.1'!. This input will be filled with random values!
[ INFO ] Fill input 'input.1' with random values 
[Step 10/11] Measuring performance (Start inference asynchronously, 4 inference requests using 4 streams for CPU, inference only: True, limits: 15000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 9.71 ms
[Step 11/11] Dumping statistics report
Count:          2764 iterations
Duration:       15035.84 ms
Latency:
    Median:     21.30 ms
    AVG:        21.64 ms
    MIN:        16.49 ms
    MAX:        35.65 ms
Throughput: 183.83 FPS