This is my first read-through of a complete open-source deep learning framework, taking notes as I go. Corrections are welcome.
Let's get started!
The demo PictureRecognition.cpp shipped with the MNN source shows the inference flow (a minimal sketch of these steps follows the list):
- Create an Interpreter from the model file.
- Create a Session from a ScheduleConfig.
- Fill the session's input tensor with data.
- Run the session.
- Read the inference result from the output tensor.
Following this flow, we will work our way down through the MNN source, layer by layer.
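Here is that sketch, written against the Interpreter API examined below; the model path, tensor layout, and pre/post-processing are placeholders, not taken from the demo:
#include <MNN/Interpreter.hpp>
#include <MNN/Tensor.hpp>

int main() {
    // 1. create the interpreter from a model file ("model.mnn" is a placeholder)
    auto net = MNN::Interpreter::createFromFile("model.mnn");
    if (nullptr == net) return -1;
    // 2. create a session; a default ScheduleConfig schedules onto the CPU
    MNN::ScheduleConfig config;
    auto session = net->createSession(config);
    // 3. fill the input tensor; passing nullptr asks for the first input
    auto input = net->getSessionInput(session, nullptr);
    // ... write preprocessed data, e.g. through input->host<float>() on CPU ...
    // 4. run the session
    net->runSession(session);
    // 5. read the result; passing nullptr asks for the first output
    auto output = net->getSessionOutput(session, nullptr);
    // ... read results, e.g. through output->host<float>() ...
    delete net; // sessions are managed by (and die with) the interpreter
    return 0;
}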
The Interpreter
Definition
First, the definition of Interpreter.
class MNN_PUBLIC Interpreter {
public:
/**
* @brief create net from file.
* @param file given file.
* @return created net if success, NULL otherwise.
*/
static Interpreter* createFromFile(const char* file);
/**
* @brief create net from buffer.
* @param buffer given data buffer.
* @param size size of data buffer.
* @return created net if success, NULL otherwise.
*/
static Interpreter* createFromBuffer(const void* buffer, size_t size);
~Interpreter();
public:
/**
* @brief create session with schedule config. created session will be managed in net.
* @param config session schedule config.
* @return created session if success, NULL otherwise.
*/
Session* createSession(const ScheduleConfig& config);
/**
* @brief create multi-path session with schedule configs. created session will be managed in net.
* @param configs session schedule configs.
* @return created session if success, NULL otherwise.
*/
Session* createMultiPathSession(const std::vector<ScheduleConfig>& configs);
/**
* @brief release session.
* @param session given session.
* @return true if given session is held by net and is freed.
*/
bool releaseSession(Session* session);
/**
* @brief call this function to get tensors ready. output tensor buffer (host or deviceId) should be retrieved
* after resize of any input tensor.
* @param session given session.
*/
void resizeSession(Session* session);
/**
* @brief call this function when you no longer need to resize or create sessions; it frees memory
* equal to the size of the model buffer.
*/
void releaseModel();
/**
* @brief Get the model buffer for user to save
* @return std::make_pair(modelBuffer, modelSize).
* @example:
* std::ofstream output("trainResult.alinn");
* auto buffer = net->getModelBuffer();
* output.write((const char*)buffer.first, buffer.second);
*/
std::pair<const void*, size_t> getModelBuffer() const;
/**
* @brief update Session's Tensor to model's Const Op
* @param session given session.
* @return result of running.
*/
ErrorCode updateSessionToModel(Session* session);
/**
* @brief run session.
* @param session given session.
* @return result of running.
*/
ErrorCode runSession(Session* session) const;
/**
* @brief run session.
* @param session given session.
* @param before callback before each op. return true to run the op; return false to skip the op.
* @param end callback after each op. return true to continue running; return false to interrupt the session.
* @param sync synchronously wait for finish of execution or not.
* @return result of running.
*/
ErrorCode runSessionWithCallBack(const Session* session, const TensorCallBack& before, const TensorCallBack& end,
bool sync = false) const;
/**
* @brief run session.
* @param session given session.
* @param before callback before each op. return true to run the op; return false to skip the op.
* @param end callback after each op. return true to continue running; return false to interrupt the session.
* @param sync synchronously wait for finish of execution or not.
* @return result of running.
*/
ErrorCode runSessionWithCallBackInfo(const Session* session, const TensorCallBackWithInfo& before,
const TensorCallBackWithInfo& end, bool sync = false) const;
/**
* @brief get input tensor for given name.
* @param session given session.
* @param name given name. if NULL, return first input.
* @return tensor if found, NULL otherwise.
*/
Tensor* getSessionInput(const Session* session, const char* name);
/**
* @brief get output tensor for given name.
* @param session given session.
* @param name given name. if NULL, return first output.
* @return tensor if found, NULL otherwise.
*/
Tensor* getSessionOutput(const Session* session, const char* name);
/**
* @brief get all output tensors.
* @param session given session.
* @return all output tensors mapped with name.
*/
const std::map<std::string, Tensor*>& getSessionOutputAll(const Session* session) const;
/**
* @brief get all input tensors.
* @param session given session.
* @return all input tensors mapped with name.
*/
const std::map<std::string, Tensor*>& getSessionInputAll(const Session* session) const;
public:
/**
* @brief resize given tensor.
* @param tensor given tensor.
* @param dims new dims. at most 6 dims.
*/
void resizeTensor(Tensor* tensor, const std::vector<int>& dims);
/**
* @brief resize given tensor by nchw.
* @param tensor given tensor.
* @param batch batch size / N.
* @param channel channel count / C.
* @param height height / H.
* @param width width / W.
*/
void resizeTensor(Tensor* tensor, int batch, int channel, int height, int width);
/**
* @brief get backend used to create given tensor.
* @param session given session.
* @param tensor given tensor.
* @return backend used to create given tensor, may be NULL.
*/
const Backend* getBackend(const Session* session, const Tensor* tensor) const;
/**
* @brief get business code (model identifier).
* @return business code.
*/
const char* bizCode() const;
private:
static Interpreter* createFromBufferInternal(Content* net);
Content* mNet = nullptr;
Interpreter(Content* net);
Interpreter(const Interpreter&) = delete;
Interpreter(const Interpreter&&) = delete;
Interpreter& operator=(const Interpreter&) = delete;
Interpreter& operator=(const Interpreter&&) = delete;
};
Here we can see every function involved in the inference flow: creating sessions, running them, fetching inputs and outputs, and so on. Clearly the whole MNN inference pipeline hangs off the Interpreter. Note that it is not a singleton: the constructor is private and copy/move are deleted, so instances can only be obtained through the static factories createFromFile / createFromBuffer, but nothing stops you from creating several interpreters.
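To make the session-facing API concrete, here is a hedged sketch of resizing an input and hooking per-op callbacks, reusing net and session from the sketch above (the dims and the callback bodies are assumptions):
// Resize the first input, then let the session reallocate its buffers.
auto input = net->getSessionInput(session, nullptr);
net->resizeTensor(input, {1, 3, 224, 224}); // NCHW; dims are an assumption
net->resizeSession(session);
// Per the header comment, re-fetch output tensors after any resize.
auto output = net->getSessionOutput(session, nullptr);

// Per-op callbacks: "before" may skip an op, "end" may abort the session.
MNN::TensorCallBack before = [](const std::vector<MNN::Tensor*>&, const std::string& opName) {
    return true; // run every op
};
MNN::TensorCallBack end = [](const std::vector<MNN::Tensor*>&, const std::string& opName) {
    MNN_PRINT("finished op: %s\n", opName.c_str());
    return true; // keep running
};
net->runSessionWithCallBack(session, before, end);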
Creation
Let's look at how an Interpreter is created from a model file.
Interpreter* Interpreter::createFromFile(const char* file) {
if (nullptr == file) {
MNN_PRINT("NULL file for create interpreter\n");
return nullptr;
}
std::unique_ptr<FileLoader> loader(new FileLoader(file));
if (!loader->valid()) {
MNN_PRINT("Create interpreter failed, open %s error\n", file);
return nullptr;
}
bool result = loader->read();
if (!result) {
MNN_PRINT("Read file error\n");
return nullptr;
}
if (loader->size() == 0) {
MNN_PRINT("Create interpreter failed, %s is empty\n", file);
return nullptr;
}
auto net = new Content; // struct that holds the model data
bool success = loader->merge(net->buffer);
if (!success) {
return nullptr;
}
loader.reset();
return createFromBufferInternal(net);
}
This struct holds the model data. Some of its fields may not make sense yet; they will become clear as we read deeper into the code.
struct Content {
AutoStorage<uint8_t> buffer; // raw bytes read from the model file
const Net* net = nullptr; // the structure that actually holds the network topology and weights
std::vector<std::unique_ptr<Session>> sessions; // session set; one model may hold several sessions
std::map<const Tensor*, const Session*> tensorMap; // maps each tensor to its session
Interpreter::SessionMode callBackMode = Interpreter::Session_Debug;
Interpreter::SessionMode inputMode = Interpreter::Session_Input_Inside;
AutoStorage<uint8_t> cacheBuffer;
size_t cacheOffset = 0;
std::string cacheFile;
std::mutex lock;
};
After the model file has been read, the interpreter is created by createFromBufferInternal.
Interpreter* Interpreter::createFromBufferInternal(Content* net) {
if (nullptr == net) {
MNN_PRINT("Buffer is null for create interpreter\n");
return nullptr;
}
/**
* Verify the buffer, then deserialize the model parameters with
* flatbuffers and fill the net field of Content.
*/
flatbuffers::Verifier verify((const uint8_t*)(net->buffer.get()), net->buffer.size());
if (false == VerifyNetBuffer(verify)) {
MNN_PRINT("Invalidate buffer to create interpreter\n");
delete net;
return nullptr;
}
net->net = GetNet(net->buffer.get());
if (nullptr == net->net->oplists()) {
MNN_ERROR("Model has no oplist\n");
delete net;
return nullptr;
}
int opSize = net->net->oplists()->size();
for (int i = 0; i < opSize; ++i) {
auto op = net->net->oplists()->GetAs<Op>(i);
if (nullptr == op || nullptr == op->outputIndexes()) {
MNN_ERROR("Invalid Model, the %d op is empty\n", i);
delete net;
return nullptr;
}
}
return new Interpreter(net);
}
The flatbuffers library serializes structured data into a flat buffer, and deserializes the other way: filling the structs back in from a buffer.
See https://blog.csdn.net/hsqyc/article/details/115719054 for an introduction.
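A toy round trip illustrating the idea (the Monster schema and the generated CreateMonster/GetMonster follow the flatbuffers tutorial convention and are not part of MNN; assume flatc has compiled table Monster { name: string; } root_type Monster;):
#include <cstdio>
#include "monster_generated.h" // generated by flatc from the toy schema

int main() {
    flatbuffers::FlatBufferBuilder builder;
    // serialize: build the table directly inside the builder's buffer
    auto name = builder.CreateString("orc");
    auto monster = CreateMonster(builder, name);
    builder.Finish(monster);
    // deserialize: a zero-copy view over the same bytes, no parsing pass
    auto m = GetMonster(builder.GetBufferPointer());
    printf("%s\n", m->name()->c_str());
    return 0;
}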
The constructor itself is trivial:
Interpreter::Interpreter(Content* net) {
MNN_ASSERT(nullptr != net);
mNet = net;
}
As for what Net actually is, here is the flatbuffers-generated code:
struct Net FLATBUFFERS_FINAL_CLASS : private flatbuffers::Table {
typedef NetT NativeTableType;
static const flatbuffers::TypeTable *MiniReflectTypeTable() {
return NetTypeTable();
}
enum FlatBuffersVTableOffset FLATBUFFERS_VTABLE_UNDERLYING_TYPE {
VT_BIZCODE = 4,
VT_EXTRATENSORDESCRIBE = 6,
VT_GPULIBRARY = 8,
VT_OPLISTS = 10,
VT_OUTPUTNAME = 12,
VT_PREFERFORWARDTYPE = 14,
VT_SOURCETYPE = 16,
VT_TENSORNAME = 18,
VT_TENSORNUMBER = 20,
VT_USAGE = 22,
VT_SUBGRAPHS = 24
};
... omitted ...
};
The definition in the .fbs schema file is:
/**
* Net structure
**/
table Net {
bizCode: string; // set when a model is converted to MNN format
extraTensorDescribe: [TensorDescribe];
gpulibrary: GpuLibrary;
oplists: [Op];
outputName: [string];
preferForwardType: ForwardType = CPU;
sourceType: NetSource = CAFFE;
tensorName: [string];
tensorNumber: int = 0;
usage:Usage = INFERENCE; // reserved for future compatibility
// Subgraphs of the Net (the model's graph structure).
subgraphs: [SubGraphProto];
}
table TensorDescribe {
blob: Blob;
index: int;
name: string;
regions:[Region];
}
table SubGraphProto {
// Subgraph unique name.
name: string;
// The ids of input tensors.
inputs: [int];
// The ids of output tensors.
outputs: [int];
// All tensor names.
// The id of each tensor is the index in the vector names.
tensors: [string];
// Nodes of the subgraph. An Op is one layer's operation: convolution, pooling, ReLU, etc.
nodes: [Op];
}
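To connect the schema back to the generated API: every field becomes an accessor on the generated Net struct. A hedged sketch of walking a model buffer (buffer is assumed to have already passed VerifyNetBuffer, as in createFromBufferInternal above):
auto net = GetNet(buffer); // zero-copy view, same call as in createFromBufferInternal
if (net->bizCode()) { // bizCode: string
    MNN_PRINT("bizCode: %s\n", net->bizCode()->c_str());
}
if (net->tensorName()) { // tensorName: [string]
    MNN_PRINT("tensor count: %d\n", (int)net->tensorName()->size());
}
for (int i = 0; i < (int)net->oplists()->size(); ++i) { // oplists: [Op]
    auto op = net->oplists()->GetAs<Op>(i);
    // op->outputIndexes() etc., exactly as createFromBufferInternal uses them
}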