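Getting Started with OpenVINO (Part 2)
1. Introduction to OpenVINO
1.1 What is OpenVINO
Once a model has been trained, deploying it to production raises all sorts of questions: does the model's performance meet production requirements, how is the model embedded into the existing engineering system, and can inference handle the required number of concurrent streams? These questions determine the return on investment. Only a deep and accurate understanding of the deep learning framework lets you handle them well and meet production requirements. In practice, new models and the frameworks behind them keep changing, so you would want engineers who are fluent in every framework; unfortunately, such people are currently rare.
OpenVINO is a pipeline toolkit that is compatible with models trained in various open-source frameworks and provides what is needed to deploy models to production. Once you know the toolkit, you can quickly deploy a pretrained model on Intel CPUs.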
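1.2 How OpenVINO accelerates networks
Why do networks need acceleration/compression?
Well-known networks such as ResNet and DenseNet are giants, and both their latency and their size are unfriendly to users. Imagine downloading a gesture recognition app that ships with a 100 MB ResNet: not a great experience.
To deploy deep learning models, we may target CPU or GPU devices. Fortunately, both NVIDIA and Intel provide official network acceleration tools: NVIDIA has TensorRT (GPU) and Intel has OpenVINO (CPU).
1.2.1 Linear Operations Fusing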
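One common case of linear operations fusing is folding an inference-time batch normalization into the linear layer that precedes it, so that two operations become one. The sketch below only illustrates that arithmetic on a tiny fully connected layer. It is not OpenVINO code, and the names `Linear`, `BatchNorm`, `applyLinear`, `applyBatchNorm` and `fuse` are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A small fully connected layer: y[o] = sum_i weight[o][i] * x[i] + bias[o].
struct Linear {
    std::vector<std::vector<float>> weight;  // [out][in]
    std::vector<float> bias;                 // [out]
};

// Inference-time batch normalization parameters, one value per output channel.
struct BatchNorm {
    std::vector<float> gamma, beta, mean, variance;
    float epsilon = 1e-5f;
};

std::vector<float> applyLinear(const Linear& l, const std::vector<float>& x) {
    std::vector<float> y(l.bias);
    for (std::size_t o = 0; o < l.weight.size(); ++o)
        for (std::size_t i = 0; i < x.size(); ++i)
            y[o] += l.weight[o][i] * x[i];
    return y;
}

std::vector<float> applyBatchNorm(const BatchNorm& bn, std::vector<float> y) {
    for (std::size_t o = 0; o < y.size(); ++o)
        y[o] = bn.gamma[o] * (y[o] - bn.mean[o]) / std::sqrt(bn.variance[o] + bn.epsilon) + bn.beta[o];
    return y;
}

// Fold the batch normalization into the linear layer:
//   s = gamma / sqrt(variance + epsilon)
//   weight' = s * weight,  bias' = s * (bias - mean) + beta
Linear fuse(const Linear& l, const BatchNorm& bn) {
    assert(l.weight.size() == bn.gamma.size());
    Linear fused = l;
    for (std::size_t o = 0; o < l.weight.size(); ++o) {
        const float s = bn.gamma[o] / std::sqrt(bn.variance[o] + bn.epsilon);
        for (float& w : fused.weight[o]) w *= s;
        fused.bias[o] = s * (l.bias[o] - bn.mean[o]) + bn.beta[o];
    }
    return fused;
}
```

Calling `applyLinear(fuse(l, bn), x)` should give the same result as `applyBatchNorm(bn, applyLinear(l, x))` up to floating-point rounding, while doing only one pass over the data at inference time.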

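1.2.2 Precision Calibration
The networks we train are usually FP32. Once training is finished, inference at deployment time needs no backpropagation, so the data precision can safely be lowered, for example to FP16 or INT8. Lower precision means lower memory usage, lower latency and a smaller model.
So what is calibration? For the layers in a model, we can lower the precision one layer at a time while evaluating on a validation set against a chosen baseline; when the network's accuracy drops to the baseline, we stop lowering precision. We could also lower the precision of every layer, but the model's accuracy would drop along with it.
1.2.3 Additional note
Besides shrinking the model, OpenVINO's network acceleration also optimizes for hardware instructions, so the hardware runs more efficiently.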
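To make the idea of lowering precision concrete, here is a minimal sketch of symmetric FP32 to INT8 quantization, where the scale is picked from a small set of calibration values and the round-trip error is what you would compare against your baseline. This is an illustration under simple assumptions (max-abs scaling, one scale per tensor), not the calibration algorithm OpenVINO actually uses; all function names are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Pick a scale so that the largest magnitude in the calibration data maps to 127.
float calibrateScale(const std::vector<float>& calibrationValues) {
    float maxAbs = 0.0f;
    for (float v : calibrationValues) maxAbs = std::max(maxAbs, std::fabs(v));
    return maxAbs > 0.0f ? maxAbs / 127.0f : 1.0f;
}

// FP32 -> INT8: divide by the scale, round, and clamp to [-127, 127].
std::int8_t quantize(float value, float scale) {
    assert(scale > 0.0f);
    const float q = std::round(value / scale);
    return static_cast<std::int8_t>(std::min(127.0f, std::max(-127.0f, q)));
}

// INT8 -> FP32: multiply back by the scale.
float dequantize(std::int8_t q, float scale) {
    return static_cast<float>(q) * scale;
}

// Largest absolute error introduced by quantizing and dequantizing the values.
float maxRoundTripError(const std::vector<float>& values, float scale) {
    float worst = 0.0f;
    for (float v : values)
        worst = std::max(worst, std::fabs(v - dequantize(quantize(v, scale), scale)));
    return worst;
}
```

For values inside the calibrated range the round-trip error stays within half a quantization step, while values outside it are clamped, which is why the choice of calibration data matters.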
1.3 Development workflow
The OpenVINO toolkit consists of two core components:
- Model Optimizer
- Inference Engine

1.3.1 Model Optimizer
The Model Optimizer converts a given model into the standard Intermediate Representation (IR) and optimizes it.
Deep learning frameworks supported by the Model Optimizer:
- ONNX
- TensorFlow
- Caffe
- MXNet
- Kaldi
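1.3.2 Inference Engine
The Inference Engine accelerates deep learning models at the level of hardware instruction sets. It also ships instruction-set optimizations for the traditional OpenCV image processing library, which brings a noticeable gain in performance and speed.
Supported hardware devices: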
- CPU
- GPU
- FPGA
- VPU
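1.3.3 Inference Engine code workflow

- Create an InferenceEngine::Core object (named core here), which manages the device plugin libraries
- Read the model (network structure and weights), which consists of xxx.xml and xxx.bin
- Configure the input and output parameters (this step appears to be optional; if skipped, the model's own configuration is used)
- Load the model onto the hardware with InferenceEngine::Core::LoadNetwork()
- Create an inference request with CreateInferRequest()
- Prepare the input data
- Run inference
- Process the results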
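2. Demonstrating the Inference Engine with the Super Resolution C++ Demo
Demo: [official documentation]
For how to run the demo, see section 2.4 of the previous post.
2.1 Super Resolution C++ source code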
// Copyright (C) 2018-2019 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//
/**
* @brief The entry point for inference engine Super Resolution demo application
* @file super_resolution_demo/main.cpp
* @example super_resolution_demo/main.cpp
*/
#include <algorithm>
#include <vector>
#include <string>
#include <memory>
#include <inference_engine.hpp>
#include <samples/slog.hpp>
#include <samples/args_helper.hpp>
#include <samples/ocv_common.hpp>
#include "super_resolution_demo.h"
using namespace InferenceEngine;
bool ParseAndCheckCommandLine(int argc, char *argv[]) {
    // ---------------------------Parsing and validation of input args--------------------------------------
    slog::info << "Parsing input parameters" << slog::endl;
    gflags::ParseCommandLineNonHelpFlags(&argc, &argv, true);
    if (FLAGS_h) {
        showUsage();
        showAvailableDevices();
        return false;
    }
    if (FLAGS_i.empty()) {
        throw std::logic_error("Parameter -i is not set");
    }
    if (FLAGS_m.empty()) {
        throw std::logic_error("Parameter -m is not set");
    }
    return true;
}
int main(int argc, char *argv[]) {
    try {
        slog::info << "InferenceEngine: " << printable(*GetInferenceEngineVersion()) << slog::endl;
        // ------------------------------ Parsing and validation of input args ---------------------------------
        if (!ParseAndCheckCommandLine(argc, argv)) {
            return 0;
        }
        /** This vector stores paths to the processed images **/
        std::vector<std::string> imageNames;
        parseInputFilesArguments(imageNames);
        if (imageNames.empty()) throw std::logic_error("No suitable images were found");
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 1. Load inference engine -------------------------------------
        slog::info << "Loading Inference Engine" << slog::endl;
        Core ie;
        /** Printing device version **/
        slog::info << "Device info: " << slog::endl;
        slog::info << printable(ie.GetVersions(FLAGS_d)) << slog::endl;
        if (!FLAGS_l.empty()) {
            // CPU(MKLDNN) extensions are loaded as a shared library and passed as a pointer to base extension
            IExtensionPtr extension_ptr = make_so_pointer<IExtension>(FLAGS_l);
            ie.AddExtension(extension_ptr, "CPU");
            slog::info << "CPU Extension loaded: " << FLAGS_l << slog::endl;
        }
        if (!FLAGS_c.empty()) {
            // clDNN Extensions are loaded from an .xml description and OpenCL kernel files
            ie.SetConfig({{PluginConfigParams::KEY_CONFIG_FILE, FLAGS_c}}, "GPU");
            slog::info << "GPU Extension loaded: " << FLAGS_c << slog::endl;
        }
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 2. Read IR Generated by ModelOptimizer (.xml and .bin files) ------------
        slog::info << "Loading network files" << slog::endl;
        /** Read network model **/
        auto network = ie.ReadNetwork(FLAGS_m);
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 3. Configure input & output ---------------------------------------------
        // --------------------------- Prepare input blobs -----------------------------------------------------
        slog::info << "Preparing input blobs" << slog::endl;
        /** Taking information about all topology inputs **/
        ICNNNetwork::InputShapes inputShapes(network.getInputShapes());
        if (inputShapes.size() != 1 && inputShapes.size() != 2)
            throw std::logic_error("The demo supports topologies with 1 or 2 inputs only");
        std::string lrInputBlobName = inputShapes.begin()->first;
        SizeVector lrShape = inputShapes[lrInputBlobName];
        if (lrShape.size() != 4) {
            throw std::logic_error("Number of dimensions for an input must be 4");
        }
        // A model like single-image-super-resolution-???? may take bicubic interpolation of the input image as the
        // second input
        std::string bicInputBlobName;
        if (inputShapes.size() == 2) {
            bicInputBlobName = (++inputShapes.begin())->first;
            SizeVector bicShape = inputShapes[bicInputBlobName];
            if (bicShape.size() != 4) {
                throw std::logic_error("Number of dimensions for both inputs must be 4");
            }
            // Make sure lrShape refers to the smaller (low resolution) input; swap the two if needed
            if (lrShape[2] >= bicShape[2] && lrShape[3] >= bicShape[3]) {
                lrInputBlobName.swap(bicInputBlobName);
                lrShape.swap(bicShape);
            } else if (!(lrShape[2] <= bicShape[2] && lrShape[3] <= bicShape[3])) {
                throw std::logic_error("Each spatial dimension of one input must surpass or be equal to a spatial "
                                       "dimension of another input");
            }
        }
        /** Collect images**/
        std::vector<cv::Mat> inputImages;
        for (const auto &i : imageNames) {
            /** Get size of low resolution input **/
            int w = lrShape[3];
            int h = lrShape[2];
            int c = lrShape[1];
            cv::Mat img = cv::imread(i, c == 1 ? cv::IMREAD_GRAYSCALE : cv::IMREAD_COLOR);
            if (img.empty()) {
                slog::warn << "Image " + i + " cannot be read!" << slog::endl;
                continue;
            }
            if (c != img.channels()) {
                slog::warn << "Number of channels of the image " << i << " is not equal to " << c << ". Skip it\n";
                continue;
            }
            if (w != img.cols || h != img.rows) {
                slog::warn << "Size of the image " << i << " is not equal to " << w << "x" << h << ". Resize it\n";
                cv::resize(img, img, {w, h});
            }
            inputImages.push_back(img);
        }
        if (inputImages.empty()) throw std::logic_error("Valid input images were not found!");
        /** Setting batch size using image count **/
        inputShapes[lrInputBlobName][0] = inputImages.size();
        if (!bicInputBlobName.empty()) {
            inputShapes[bicInputBlobName][0] = inputImages.size();
        }
        network.reshape(inputShapes);
        slog::info << "Batch size is " << std::to_string(network.getBatchSize()) << slog::endl;
        // ------------------------------ Prepare output blobs -------------------------------------------------
        slog::info << "Preparing output blobs" << slog::endl;
        OutputsDataMap outputInfo(network.getOutputsInfo());
        // BlobMap outputBlobs;
        std::string firstOutputName;
        for (auto &item : outputInfo) {
            if (firstOutputName.empty()) {
                firstOutputName = item.first;
            }
            DataPtr outputData = item.second;
            if (!outputData) {
                throw std::logic_error("output data pointer is not valid");
            }
            item.second->setPrecision(Precision::FP32);
        }
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 4. Loading model to the device ------------------------------------------
        slog::info << "Loading model to the device" << slog::endl;
        ExecutableNetwork executableNetwork = ie.LoadNetwork(network, FLAGS_d);
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 5. Create infer request -------------------------------------------------
        slog::info << "Create infer request" << slog::endl;
        InferRequest inferRequest = executableNetwork.CreateInferRequest();
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 6. Prepare input --------------------------------------------------------
        Blob::Ptr lrInputBlob = inferRequest.GetBlob(lrInputBlobName);
        for (size_t i = 0; i < inputImages.size(); ++i) {
            cv::Mat img = inputImages[i];
            matU8ToBlob<float_t>(img, lrInputBlob, i);
            if (!bicInputBlobName.empty()) {
                Blob::Ptr bicInputBlob = inferRequest.GetBlob(bicInputBlobName);
                int w = bicInputBlob->getTensorDesc().getDims()[3];
                int h = bicInputBlob->getTensorDesc().getDims()[2];
                cv::Mat resized;
                cv::resize(img, resized, cv::Size(w, h), 0, 0, cv::INTER_CUBIC);
                matU8ToBlob<float_t>(resized, bicInputBlob, i);
            }
        }
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 7. Do inference ---------------------------------------------------------
        std::cout << "To close the application, press 'CTRL+C' here";
        if (FLAGS_show) {
            std::cout << " or switch to the output window and press any key";
        }
        std::cout << std::endl;
        slog::info << "Start inference" << slog::endl;
        inferRequest.Infer();
        // -----------------------------------------------------------------------------------------------------
        // --------------------------- 8. Process output -------------------------------------------------------
        const Blob::Ptr outputBlob = inferRequest.GetBlob(firstOutputName);
        LockedMemory<const void> outputBlobMapped = as<MemoryBlob>(outputBlob)->rmap();
        const auto outputData = outputBlobMapped.as<float*>();
        size_t numOfImages = outputBlob->getTensorDesc().getDims()[0];
        size_t numOfChannels = outputBlob->getTensorDesc().getDims()[1];
        size_t h = outputBlob->getTensorDesc().getDims()[2];
        size_t w = outputBlob->getTensorDesc().getDims()[3];
        size_t numOfPixels = w * h;
        slog::info << "Output size [N,C,H,W]: " << numOfImages << ", " << numOfChannels << ", " << h << ", " << w << slog::endl;
        for (size_t i = 0; i < numOfImages; ++i) {
            // Wrap each channel plane of image i (planar NCHW layout) in a cv::Mat without copying
            std::vector<cv::Mat> imgPlanes;
            if (numOfChannels == 3) {
                imgPlanes = std::vector<cv::Mat>{
                    cv::Mat(h, w, CV_32FC1, &(outputData[i * numOfPixels * numOfChannels])),
                    cv::Mat(h, w, CV_32FC1, &(outputData[i * numOfPixels * numOfChannels + numOfPixels])),
                    cv::Mat(h, w, CV_32FC1, &(outputData[i * numOfPixels * numOfChannels + numOfPixels * 2]))};
            } else {
                imgPlanes = std::vector<cv::Mat>{cv::Mat(h, w, CV_32FC1, &(outputData[i * numOfPixels * numOfChannels]))};
                // Post-processing for text-image-super-resolution models
                cv::threshold(imgPlanes[0], imgPlanes[0], 0.5f, 1.0f, cv::THRESH_BINARY);
            }
            // Scale each float plane by 255 and convert it to 8-bit
            for (auto & img : imgPlanes)
                img.convertTo(img, CV_8UC1, 255);
            cv::Mat resultImg;
            cv::merge(imgPlanes, resultImg);
            if (FLAGS_show) {
                cv::imshow("result", resultImg);
                cv::waitKey();
            }
            std::string outImgName = std::string("sr_" + std::to_string(i + 1) + ".png");
            cv::imwrite(outImgName, resultImg);
        }
        // -----------------------------------------------------------------------------------------------------
    }
    catch (const std::exception &error) {
        slog::err << error.what() << slog::endl;
        return 1;
    }
    catch (...) {
        slog::err << "Unknown/internal exception happened" << slog::endl;
        return 1;
    }
    slog::info << "Execution successful" << slog::endl;
    slog::info << slog::endl << "This demo is an API example, for any performance measurements "
                                "please use the dedicated benchmark_app tool from the openVINO toolkit" << slog::endl;
    return 0;
}
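The output handling in step 8 relies on the output blob being laid out as planar NCHW floats: the demo wraps each channel plane in a cv::Mat, scales by 255 and merges the planes. The sketch below redoes the same index arithmetic without OpenCV, so the offsets are easier to follow. It assumes a contiguous NCHW buffer; `nchwOffset` and `planarToInterleaved` are names invented for this example and are not part of the demo. The rounding here uses `std::round`, which may differ from OpenCV on exact ties.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Offset of element (n, c, y, x) in a contiguous NCHW buffer with C channels, height H, width W.
std::size_t nchwOffset(std::size_t n, std::size_t c, std::size_t y, std::size_t x,
                       std::size_t C, std::size_t H, std::size_t W) {
    return ((n * C + c) * H + y) * W + x;
}

// Convert image n from planar floats to interleaved 8-bit pixels (HWC order),
// scaling by 255 and clamping to [0, 255].
std::vector<std::uint8_t> planarToInterleaved(const std::vector<float>& data, std::size_t n,
                                              std::size_t C, std::size_t H, std::size_t W) {
    assert(data.size() >= (n + 1) * C * H * W);
    std::vector<std::uint8_t> out(H * W * C);
    for (std::size_t y = 0; y < H; ++y) {
        for (std::size_t x = 0; x < W; ++x) {
            for (std::size_t c = 0; c < C; ++c) {
                float v = std::round(data[nchwOffset(n, c, y, x, C, H, W)] * 255.0f);
                v = std::min(255.0f, std::max(0.0f, v));
                out[(y * W + x) * C + c] = static_cast<std::uint8_t>(v);
            }
        }
    }
    return out;
}
```

With x = y = 0, `nchwOffset(i, c, 0, 0, C, H, W)` reduces to `i * C * H * W + c * H * W`, which matches the plane start offsets the demo passes to cv::Mat.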
2.2 Results of different pretrained models
Three official pretrained models are provided:
- single-image-super-resolution-1032, which is the model that performs super resolution 4x upscale on a 270x480 image
- single-image-super-resolution-1033, which is the model that performs super resolution 3x upscale on a 360x640 image
- text-image-super-resolution-0001, which is the model that performs super resolution 3x upscale on a 360x640 image
2.2.1 1-o.jpg

1. single-image-super-resolution-1032, which is the model that performs super resolution 4x upscale on a 270x480 image

Overall comparison:
Detail comparison:
2. single-image-super-resolution-1033, which is the model that performs super resolution 3x upscale on a 360x640 image

Overall comparison:
Detail comparison:

3. text-image-super-resolution-0001, which is the model that performs super resolution 3x upscale on a 360x640 image

Comparison of the three models:

2.2.2 2-0.bmp

1. single-image-super-resolution-1032, which is the model that performs super resolution 4x upscale on a 270x480 image

Overall comparison:

Detail comparison:

2. single-image-super-resolution-1033, which is the model that performs super resolution 3x upscale on a 360x640 image

Overall comparison:

Detail comparison:

1033 vs. 1032:

3. text-image-super-resolution-0001, which is the model that performs super resolution 3x upscale on a 360x640 image

2.2.3 1-bmp.bmp

1033:

PNG vs. BMP:

2.2.4 3.bmp

1033:

Detail comparison:

3. Demonstrating the Model Optimizer with the Object Detection C++ Demo
Demo: [official documentation]

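3.1 Model conversion workflow

For the YOLO-specific steps, see the reference blogs or the official documentation.
Reference blogs: [1][2][yolov4]
OpenVINO official documentation: [official guide to converting YOLOv1-v3 models]

3.2 Results of different models with OpenVINO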
3.2.1 YOLOv3

3.2.2 YOLOv4

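3.2.3 SSD pretrained model
I searched a lot for ways to convert an SSD model and found none; it turned out that the pretrained model OpenVINO provides is itself an SSD.
Reference blog: [1]
Official pretrained model: person-detection-retail-0013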

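3.2.4 YOLOv5: not directly supported by OpenVINO for now
4. Porting OpenVINO programs
An ordinary C++ program can be bundled with its exe and the DLLs it needs and then run directly on someone else's machine. An OpenVINO program is different: it needs part of the OpenVINO environment, though not all of it.
For details, see: another author's blog post
In general, whichever DLL is reported missing, find it and copy it over.
If you see the error: plugins.xml:1:0: File was not found

bring along everything in this location:

Everything that a super-resolution program ships with:
With this folder, the program runs on any machine.
Folder link: (extraction code: qwww)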
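Before copying the folder to another machine, it can help to check up front which runtime files are missing instead of waiting for an error such as the plugins.xml one. The helper below is a small sketch of that idea using only the standard library. `findMissingFiles` is a name invented here, and the list of required files is something you have to supply yourself: apart from plugins.xml, which appears in the error message above, any file names you pass in are your own assumptions about what your OpenVINO version needs.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Return the names from requiredFiles that cannot be opened inside directory.
std::vector<std::string> findMissingFiles(const std::string& directory,
                                          const std::vector<std::string>& requiredFiles) {
    std::vector<std::string> missing;
    for (const auto& name : requiredFiles) {
        assert(!name.empty());
        // A file counts as present if it can be opened for reading.
        std::ifstream probe(directory + "/" + name, std::ios::binary);
        if (!probe.good()) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

For example, `findMissingFiles("release", {"plugins.xml", "some_plugin.dll"})` would return the subset of those two names not found in a `release` folder; both the folder name and `some_plugin.dll` are placeholders.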