OpenVINO Series 5. A Basic Object Detection Example


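This post walks through a basic object detection example. Whether the task is image segmentation or object detection, the workflow follows essentially the same three steps:

  • First, read the model (ie.read_model) and compile it (ie.compile_model);
  • Second, load the image, then resize and reshape it to match the model input;
  • Third, run inference (compiled_model([input_image])[output_layer_ir]). The shape of the result matches the model's output shape.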

Environment:

  • Operating system: Win10
  • IDE: VSCode
  • OpenVINO version: 2022.1
  • Code link: 2-basic-segmentation-detection-example


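1 Object detection

This section shows how to run object detection with OpenVINO.

We use the horizontal-text-detection-0001 model from Open Model Zoo. It detects text in an image and returns an array of shape [100, 5]. Each detected text box is stored as [x_min, y_min, x_max, y_max, conf], where (x_min, y_min) is the top-left corner of the detected text, (x_max, y_max) is the bottom-right corner, and conf is the confidence of the predicted class.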
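
To make the output layout concrete, here is a minimal sketch that turns rows of [x_min, y_min, x_max, y_max, conf] into named records and keeps only confident, non-zero detections. The names Detection and parse_detections and the sample rows are illustrative assumptions and are not part of OpenVINO or the model; the default threshold of 0.3 simply mirrors the one used by the visualization code later in this post.

```python
from typing import Iterable, List, NamedTuple, Sequence


class Detection(NamedTuple):
    """One text box in the [x_min, y_min, x_max, y_max, conf] layout."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    conf: float


def parse_detections(rows: Iterable[Sequence[float]], threshold: float = 0.3) -> List[Detection]:
    """Hypothetical helper: drop all-zero rows and rows at or below the threshold."""
    detections = []
    for row in rows:
        # Skip rows that contain only zeros (no detection stored in them)
        if all(value == 0 for value in row):
            continue
        detection = Detection(*(float(value) for value in row))
        if detection.conf > threshold:
            detections.append(detection)
    return detections


# Made-up sample rows, not real model output
sample_rows = [
    [10.0, 20.0, 110.0, 60.0, 0.9],      # confident box, kept
    [200.0, 300.0, 260.0, 340.0, 0.2],   # below the threshold, dropped
    [0.0, 0.0, 0.0, 0.0, 0.0],           # all zeros, dropped
]
kept = parse_detections(sample_rows)
print(kept)
```

Because the helper only iterates over rows, it should also accept a NumPy array such as the boxes result produced by the inference code, although that is not exercised here.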


The code is as follows:

import cv2
import matplotlib.pyplot as plt
import numpy as np
from openvino.runtime import Core

print("1 Load the model.")
ie = Core()
model = ie.read_model(model="model/horizontal-text-detection-0001.xml")
compiled_model = ie.compile_model(model=model, device_name="CPU")
input_layer_ir = compiled_model.input(0)
output_layer_ir = compiled_model.output("boxes")
print("- Input layer info: {}".format(input_layer_ir))
print("- Output layer info: {}".format(output_layer_ir))
print("2 Load the image, and reshape to the same size as model input.")
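# Text detection models expect the image in BGR format, which is what cv2.imread returns
image = cv2.imread("data/intel_rnb.jpg")
if image is None:
    raise FileNotFoundError("Could not read data/intel_rnb.jpg")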
print("- Image original shape: {0}".format(image.shape))
# N,C,H,W = batch size, number of channels, height, width
N, C, H, W = input_layer_ir.shape
# Resize image to meet network expected input sizes
resized_image = cv2.resize(image, (W, H))
# Reshape to network input shape
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)
#plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
print("- Image size reshape into: {0}".format(input_image.shape))
print("3 Inference.")
# Run inference and select the "boxes" output
boxes = compiled_model([input_image])[output_layer_ir]
# Remove boxes that contain only zeros
boxes = boxes[~np.all(boxes == 0, axis=1)]
print("- Shape of inference result: {0}".format(boxes.shape))

The terminal output is as follows:

1 Load the model.
- Input layer info: <ConstOutput: names[image] shape{1,3,704,704} type: f32>
- Output layer info: <ConstOutput: names[boxes] shape{..100,5} type: f32>
2 Load the image, and reshape to the same size as model input.
- Image original shape: (517, 690, 3)
- Image size reshape into: (1, 3, 704, 704)
3 Inference.
- Shape of inference result: (6, 5)
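
The shape changes in this log come from two NumPy operations in the script: the layout change from H x W x C to N x C x H x W, and the removal of all-zero rows from the result. A minimal sketch that reproduces both without OpenVINO or OpenCV, assuming zero-filled dummy arrays in place of the real image and the real model output:

```python
import numpy as np

# Dummy stand-in for the resized BGR image (height, width, channels)
resized_image = np.zeros((704, 704, 3), dtype=np.uint8)

# Move channels to the front (C, H, W), then add a batch dimension (N, C, H, W)
input_image = np.expand_dims(resized_image.transpose(2, 0, 1), 0)
print(input_image.shape)

# Hypothetical output array: 100 rows of [x_min, y_min, x_max, y_max, conf],
# with only two rows filled in and the rest left as zeros
boxes = np.zeros((100, 5), dtype=np.float32)
boxes[0] = [10, 20, 110, 60, 0.9]
boxes[1] = [200, 300, 260, 340, 0.5]

# Keep only the rows that are not entirely zero
non_zero_boxes = boxes[~np.all(boxes == 0, axis=1)]
print(non_zero_boxes.shape)
```

With the real image, the same filtering step is what reduces the result to the (6, 5) shape shown in the log.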

Finally, we can visualize the detection results. The code is as follows:

def convert_result_to_image(bgr_image, resized_image, boxes, threshold=0.3, conf_labels=True):
    """
    Draw the detected text boxes on the original image.

    Each detection has the format [x_min, y_min, x_max, y_max, conf], with coordinates
    given in the resized image. They are scaled back to the original image size here.

    :param bgr_image: original image loaded by cv2.imread (BGR format).
    :param resized_image: image resized to the model input size; only its shape is used.
    :param boxes: array of detections, one row per box.
    :param threshold: boxes with conf at or below this value are skipped.
    :param conf_labels: if True, write the confidence value above each box.
    :return: the original image converted to RGB, with the boxes drawn on it.
    """
    # Define colors for boxes and descriptions
    colors = {"red": (255, 0, 0), "green": (0, 255, 0)}
    # Fetch image shapes to calculate ratio
    (real_y, real_x), (resized_y, resized_x) = bgr_image.shape[:2], resized_image.shape[:2]
    ratio_x, ratio_y = real_x / resized_x, real_y / resized_y
    # Convert base image from bgr to rgb format
    rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
    # Iterate through non-zero boxes
    for box in boxes:
        # Pick confidence factor from last place in array
        conf = box[-1]
        if conf > threshold:
            # Convert float to int and multiply corner position of each box by x and y ratio
            # In case that bounding box is found at the top of the image, 
            # we position upper box bar little lower to make it visible on image 
            (x_min, y_min, x_max, y_max) = [
                int(max(corner_position * ratio_y, 10)) if idx % 2 
                else int(corner_position * ratio_x)
                for idx, corner_position in enumerate(box[:-1])
            ]
            # Draw box based on position, parameters in rectangle function are: image, start_point, end_point, color, thickness
            rgb_image = cv2.rectangle(rgb_image, (x_min, y_min), (x_max, y_max), colors["green"], 3)

            # Add text to image based on position and confidence
            # Parameters in text function are: image, text, bottom-left_corner_textfield, font, font_scale, color, thickness, line_type
            if conf_labels:
                rgb_image = cv2.putText(
                    rgb_image,
                    f"{conf:.2f}",
                    (x_min, y_min - 10),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.8,
                    colors["red"],
                    1,
                    cv2.LINE_AA,
                )
    return rgb_image

plt.figure(figsize=(10, 6))
plt.axis("off")
plt.imshow(convert_result_to_image(image, resized_image, boxes, conf_labels=False))
# Needed when running as a plain script; notebooks render the figure without it
plt.show()
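
The coordinate rescaling inside convert_result_to_image can be looked at in isolation. The sketch below is a simplified restatement of that step, not a replacement for it: scale_box is a hypothetical helper name, and the shapes in the usage lines are chosen so that the ratios are exactly 2.0 and 0.5, rather than taken from the real image.

```python
def scale_box(box, original_shape, resized_shape, min_y=10):
    """Hypothetical helper: map a box from resized-image to original-image pixels.

    box is [x_min, y_min, x_max, y_max, ...]; shapes are (height, width, ...).
    As in the function above, y values are clamped to at least min_y so that
    a box at the very top of the image stays visible.
    """
    x_min, y_min, x_max, y_max = box[:4]
    real_y, real_x = original_shape[:2]
    resized_y, resized_x = resized_shape[:2]
    ratio_x = real_x / resized_x
    ratio_y = real_y / resized_y
    return (
        int(x_min * ratio_x),
        int(max(y_min * ratio_y, min_y)),
        int(x_max * ratio_x),
        int(max(y_max * ratio_y, min_y)),
    )


# Illustrative shapes: original image 352 x 1408 (H x W), model input 704 x 704
scaled = scale_box([100, 4, 200, 300, 0.9], (352, 1408, 3), (704, 704, 3))
print(scaled)  # (200, 10, 400, 150)
```

In the example, y_min scales to 2 and is then raised to 10 by the clamp, which is the same adjustment the original code applies to every y coordinate.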
