MNN Inference Source Code Analysis Notes (Part 1): Main Flow


Before formally starting the analysis of the inference code, let's review MNN's overall architecture.

(Figure: MNN overall architecture)

Inference is made up of three major parts:

  • Engine
  • Backends
  • Runtime Optimize

So the question is: where do we start, and how? My experience is that you cannot just dive head-first into the source code, or you will quickly get lost.

Instead, follow two principles:

  1. From the outside in: start from the external SDK interfaces and follow the call chain inward.
  2. From the whole to the parts: first get an overall understanding by reading the official documentation, so you know the overall framework, what modules exist, and what each is responsible for.

As in the earlier usage notes, it is recommended to build, install, and run the Android demo bundled with MNN and walk through the calling flow once. Here we take the outside-in approach to start the analysis.

Running Inference with the MNN SDK on Android

The following content is based on the MNN reference documentation.

MNN has two core concepts: the Interpreter and the Session.

The Interpreter is responsible for loading the model and holds the model data.

A Session is created by an Interpreter and holds the inference data. Multiple inferences can share the same model data; in other words, multiple Sessions can share one Interpreter.

Session

Creating a session first requires an interpreter; the interpreter is what loads the model file, which naturally leads to the following API:

Create an Interpreter from a given MNN model file:

 /**
  * @brief create net from file.
  * @param file  given file.
  * @return created net if success, NULL otherwise.
  */
 static Interpreter* createFromFile(const char* file);

The interpreter provides the Interpreter::createSession method for creating a session; a ScheduleConfig is passed in to configure it.

 /**
  * @brief create session with schedule config. created session will be managed in net.
  * @param config session schedule config.
  * @return created session if success, NULL otherwise.
  */
 Session* createSession(const ScheduleConfig& config);
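
Putting the two calls together, a minimal sketch of loading a model and creating a session could look like this (the model path "model.mnn" is just a placeholder, not from the article):

 #include <MNN/Interpreter.hpp>

 // Load the model file and create a session with a default schedule config
 auto interpreter = MNN::Interpreter::createFromFile("model.mnn"); // hypothetical path
 MNN::ScheduleConfig config;  // defaults: CPU backend, 4 threads
 auto session = interpreter->createSession(config);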

Session Scheduling Configuration

 /** session schedule config */
 struct ScheduleConfig {
     /** which tensor should be kept */
     std::vector<std::string> saveTensors;
     /** forward type, e.g. CPU or GPU */
     MNNForwardType type = MNN_FORWARD_CPU;
     /** number of threads in parallel */
     int numThread = 4;

     /** subpath to run */
     struct Path {
         std::vector<std::string> inputs;
         std::vector<std::string> outputs;

         enum Mode {
             /**
              * Op Mode
              * - inputs means the source op, can NOT be empty.
              * - outputs means the sink op, can be empty.
              * The path will start from the source op and flow until it reaches the sink op.
              * The sink op will not be computed in this path.
              */
             Op = 0,

             /**
              * Tensor Mode (NOT supported yet)
              * - inputs means the input tensors, can NOT be empty.
              * - outputs means the output tensors, can NOT be empty.
              * It will find the pipeline that computes outputs from inputs.
              */
             Tensor = 1
         };

         /** running mode */
         Mode mode = Op;
     };
     Path path;

     /** backup backend used to create execution when the designated backend does NOT support some op */
     MNNForwardType backupType = MNN_FORWARD_CPU;

     /** extra backend config */
     BackendConfig* backendConfig = nullptr;
 };

During inference, the main backend is specified by type and defaults to CPU. A backup backend is specified by backupType; it is used when the main backend does not support an operator in the model.

The inference path refers to the operators involved in computing the outputs from the inputs. If it is not specified, it is identified automatically from the model structure. To save memory, MNN reuses tensor memory (except for output tensors), so if you need to keep the result of an intermediate tensor, you must pass that tensor in saveTensors to prevent its memory from being reused.

The number of threads used for inference can be changed with numThread, but the actual thread count depends on the deployment environment:

  • On iOS, GCD is used and the thread configuration is ignored;
  • When MNN_USE_THREAD_POOL is enabled, the number of threads depends on the thread count of the first configuration;
  • With OpenMP, the number of threads is set globally, so the actual number depends on the thread count of the last configuration.

In addition, backend parameters can be set via backendConfig; see below.
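
As a rough example (the backend choice and the tensor name are illustrative only), a ScheduleConfig that prefers a GPU backend, falls back to CPU, and keeps one intermediate tensor could be filled in like this:

 MNN::ScheduleConfig config;
 config.type       = MNN_FORWARD_OPENCL;    // main backend
 config.backupType = MNN_FORWARD_CPU;       // used when the main backend does not support an op
 config.numThread  = 4;
 config.saveTensors.push_back("conv1_out"); // hypothetical intermediate tensor to keep
 auto session = interpreter->createSession(config);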

Backend Configuration

The definition of BackendConfig:

 struct BackendConfig {
     enum MemoryMode {
         Memory_Normal = 0,
         Memory_High,
         Memory_Low
     };

     MemoryMode memory = Memory_Normal;

     enum PowerMode {
         Power_Normal = 0,
         Power_High,
         Power_Low
     };

     PowerMode power = Power_Normal;

     enum PrecisionMode {
         Precision_Normal = 0,
         Precision_High,
         Precision_Low
     };

     PrecisionMode precision = Precision_Normal;

     /** user defined context */
     void* sharedContext = nullptr;
 };

MemoryMode, PowerMode, and PrecisionMode configure the memory footprint, power consumption, and computation precision, respectively.
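
As a sketch (the chosen modes are only illustrative), a BackendConfig can be attached to the ScheduleConfig like this:

 MNN::BackendConfig backendConfig;
 backendConfig.memory    = MNN::BackendConfig::Memory_Low;    // favor a smaller memory footprint
 backendConfig.power     = MNN::BackendConfig::Power_Normal;
 backendConfig.precision = MNN::BackendConfig::Precision_Low; // e.g. allow lower precision where supported
 config.backendConfig    = &backendConfig;                    // config is the ScheduleConfig from above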

Input Data

Get Input Tensor

There are two interfaces: getSessionInput returns a single input tensor; if the model has multiple input tensors, use the second interface, getSessionInputAll.

 /**
  * @brief get input tensor for given name.
  * @param session   given session.
  * @param name      given name. if NULL, return first input.
  * @return tensor if found, NULL otherwise.
  */
 Tensor* getSessionInput(const Session* session, const char* name);
 ​
 /**
  * @brief get all input tensors.
  * @param session   given session.
  * @return all input tensors mapped with name.
  */
 const std::map<std::string, Tensor*>& getSessionInputAll(const Session* session) const;
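
For example (a sketch; the input name "data" is only a placeholder), an input can be fetched by name, or all inputs can be iterated as a name-to-tensor map:

 // Single input: pass NULL for the name when the model has only one input
 auto input = interpreter->getSessionInput(session, "data");

 // Multiple inputs: iterate the name -> tensor map
 auto allInputs = interpreter->getSessionInputAll(session);
 for (auto& iter : allInputs) {
     MNN_PRINT("input: %s\n", iter.first.c_str());
 }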

Fill Data

 auto inputTensor = interpreter->getSessionInput(session, NULL);
 inputTensor->host<float>()[0] = 1.f;

When the backend is CPU, you can simply write the data directly into the input tensor's host pointer.

Copy Data

For non-CPU backends, refer to the following example:

 auto inputTensor = interpreter->getSessionInput(session, NULL);
 auto nchwTensor = new Tensor(inputTensor, Tensor::CAFFE);
 // nchwTensor->host<float>()[x] = ...
 inputTensor->copyFromHostTensor(nchwTensor);
 delete nchwTensor;

Image Processing

 struct Config
 {
     Filter filterType        = NEAREST;
     ImageFormat sourceFormat = RGBA;
     ImageFormat destFormat   = RGBA;

     // Only valid if the dest type is float
     float mean[4]   = {0.0f, 0.0f, 0.0f, 0.0f};
     float normal[4] = {1.0f, 1.0f, 1.0f, 1.0f};
 };

The struct above is CV::ImageProcess::Config; its fields are used as follows:

  • Specify input and output formats with sourceFormat and destFormat; RGBA, RGB, BGR, GRAY, BGRA and YUV_NV21 are currently supported
  • Specify the type of interpolation by filterType, currently supports NEAREST, BILINEAR and BICUBIC
  • Specify the mean normalization by mean and normal, but the setting is ignored when the data type is not a floating point type
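
As a hedged sketch of how these pieces fit together (the buffer, image size, and normalization values are illustrative), an RGBA image buffer can be converted and written into the input tensor roughly like this:

 #include <MNN/ImageProcess.hpp>

 MNN::CV::ImageProcess::Config cfg;
 cfg.sourceFormat = MNN::CV::RGBA;
 cfg.destFormat   = MNN::CV::BGR;
 cfg.filterType   = MNN::CV::BILINEAR;
 float mean[3]    = {103.94f, 116.78f, 123.68f};  // example mean values
 float normal[3]  = {0.017f, 0.017f, 0.017f};     // example scale values
 ::memcpy(cfg.mean, mean, sizeof(mean));
 ::memcpy(cfg.normal, normal, sizeof(normal));

 std::shared_ptr<MNN::CV::ImageProcess> process(MNN::CV::ImageProcess::create(cfg));
 // pixels points to a width x height RGBA buffer; inputTensor comes from getSessionInput
 process->convert(pixels, width, height, 0, inputTensor);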

Run Session

Call the interpreter's runSession method:

 /**
  * @brief run session.
  * @param session   given session.
  * @return result of running.
  */
 ErrorCode runSession(Session* session) const;

Run Session with Callbacks

 typedef std::function<bool(const std::vector<Tensor*>&, 
                            const std::string& /*opName*/)> TensorCallBack;
 ​
 /*
  * @brief run session.
  * @param session   given session.
  * @param before    callback before each op. return true to run the op; return false to skip the op.
  * @param after     callback after each op. return true to continue running; return false to interrupt the session.
  * @param sync      synchronously wait for finish of execution or not.
  * @return result of running.
  */
 ErrorCode runSessionWithCallBack(const Session* session, 
                                  const TensorCallBack& before, 
                                  const TensorCallBack& end,
                                  bool sync = false) const;

Compared to runSession, runSessionWithCallBack additionally provides:

  • a callback before each op, which can be used to skip its execution;
  • a callback after each op, which can be used to interrupt the inference;
  • a synchronization option, off by default; when enabled, all backends wait for the inference to finish, i.e. the function's time cost equals the inference time cost.
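
A hedged usage sketch (the callback bodies are illustrative): print the name of each executed op without skipping anything:

 MNN::TensorCallBack before = [](const std::vector<MNN::Tensor*>& tensors, const std::string& opName) {
     return true;  // return false to skip this op
 };
 MNN::TensorCallBack after = [](const std::vector<MNN::Tensor*>& tensors, const std::string& opName) {
     MNN_PRINT("finished op: %s\n", opName.c_str());
     return true;  // return false to interrupt the rest of the inference
 };
 interpreter->runSessionWithCallBack(session, before, after, true /* sync */);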

Get Output

Get Output Tensor

 /**
  * @brief get output tensor for given name.
  * @param session   given session.
  * @param name      given name. if NULL, return first output.
  * @return tensor if found, NULL otherwise.
  */
 Tensor* getSessionOutput(const Session* session, const char* name);
 ​
 /**
  * @brief get all output tensors.
  * @param session   given session.
  * @return all output tensors mapped with name.
  */
 const std::map<std::string, Tensor*>& getSessionOutputAll(const Session* session) const;

Interpreter provides two ways to get the output Tensor: getSessionOutput for getting a single output tensor and getSessionOutputAll for getting the output tensor map.

When there is only one output tensor, you can pass NULL to get the tensor when calling getSessionOutput.
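
For instance (a short sketch), all outputs can be listed by iterating the returned map:

 auto outputs = interpreter->getSessionOutputAll(session);
 for (auto& iter : outputs) {
     MNN_PRINT("output: %s\n", iter.first.c_str());
     MNN::Tensor* tensor = iter.second;  // read or copy this tensor as shown below
 }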

Read Data

 auto outputTensor = interpreter->getSessionOutput(session, NULL);
 auto score = outputTensor->host<float>()[0];
 auto index = outputTensor->host<float>()[1];
 // ...

The simplest way to read data is directly from the tensor's host pointer, but this usage is limited to the CPU backend; other backends need to access the data through deviceId. In addition, the user has to handle the difference between the NC4HW4 and NHWC data layouts.

For non-CPU backends, or users who are not familiar with data layout, copy data interfaces should be used.

Copy Data

NCHW example:
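
A minimal sketch, mirroring the host-tensor copy used for inputs above: copy the output into an NCHW (Caffe layout) host tensor and read from there:

 auto outputTensor = interpreter->getSessionOutput(session, NULL);
 // Create a host tensor with the NCHW (Caffe) layout matching the output shape
 auto nchwTensor = new Tensor(outputTensor, Tensor::CAFFE);
 outputTensor->copyToHostTensor(nchwTensor);
 auto score = nchwTensor->host<float>()[0];
 // ...
 delete nchwTensor;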
