This is my first read-through of a complete open-source deep learning framework, taking notes as I go. Corrections are welcome.
Let's get started!
The demo PictureRecognition.cpp shipped with the MNN source shows the inference flow (a minimal sketch of these steps follows the list):
- Create an Interpreter from the model file.
- Create a Session from a ScheduleConfig.
- Fill the session's input tensor with data.
- Run the session.
- Read the inference result from the output tensor.
Following this flow, we will work our way down through the MNN source, layer by layer.
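Here is that sketch, written against the Interpreter API examined below; the model path, tensor layout, and pre/post-processing are placeholders, not taken from the demo:
#include <MNN/Interpreter.hpp>
#include <MNN/Tensor.hpp>

int main() {
    // 1. create the interpreter from a model file ("model.mnn" is a placeholder)
    auto net = MNN::Interpreter::createFromFile("model.mnn");
    if (nullptr == net) return -1;
    // 2. create a session; a default ScheduleConfig schedules onto the CPU
    MNN::ScheduleConfig config;
    auto session = net->createSession(config);
    // 3. fill the input tensor; passing nullptr asks for the first input
    auto input = net->getSessionInput(session, nullptr);
    // ... write preprocessed data, e.g. through input->host<float>() on CPU ...
    // 4. run the session
    net->runSession(session);
    // 5. read the result; passing nullptr asks for the first output
    auto output = net->getSessionOutput(session, nullptr);
    // ... read results, e.g. through output->host<float>() ...
    delete net; // sessions are managed by (and die with) the interpreter
    return 0;
}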
The Interpreter
Definition
First, the definition of Interpreter.
class MNN_PUBLIC Interpreter {
public:
/**
* @brief create net from file.
* @param file given file.
* @return created net if success, NULL otherwise.
*/
static Interpreter* createFromFile(const char* file);
/**
* @brief create net from buffer.
* @param buffer given data buffer.
* @param size size of data buffer.
* @return created net if success, NULL otherwise.
*/
static Interpreter* createFromBuffer(const void* buffer, size_t size);
~Interpreter();
public:
/**
* @brief create session with schedule config. created session will be managed in net.
* @param config session schedule config.
* @return created session if success, NULL otherwise.
*/
Session* createSession(const ScheduleConfig& config);
/**
* @brief create multi-path session with schedule configs. created session will be managed in net.
* @param configs session schedule configs.
* @return created session if success, NULL otherwise.
*/
Session* createMultiPathSession(const std::vector<ScheduleConfig>& configs);
/**
* @brief release session.
* @param session given session.
* @return true if given session is held by net and is freed.
*/
bool releaseSession(Session* session);
/**
* @brief call this function to get tensors ready. output tensor buffer (host or deviceId) should be retrieved
* after resize of any input tensor.
* @param session given session.
*/
void resizeSession(Session* session);
/**
* @brief call this function when you no longer need to resize or create sessions; it frees memory
* equal to the size of the model buffer.
*/
void releaseModel();
/**
* @brief Get the model buffer for user to save
* @return std::make_pair(modelBuffer, modelSize).
* @example:
* std::ofstream output("trainResult.alinn");
* auto buffer = net->getModelBuffer();
* output.write((const char*)buffer.first, buffer.second);
*/
std::pair<const void*, size_t> getModelBuffer() const;
/**
* @brief update Session's Tensor to model's Const Op
* @param session given session.
* @return result of running.
*/
ErrorCode updateSessionToModel(Session* session);
/**
* @brief run session.
* @param session given session.
* @return result of running.
*/
ErrorCode runSession(Session* session) const;
/**
* @brief run session.
* @param session given session.
* @param before callback before each op. return true to run the op; return false to skip the op.
* @param end callback after each op. return true to continue running; return false to interrupt the session.
* @param sync synchronously wait for finish of execution or not.
* @return result of running.
*/
ErrorCode runSessionWithCallBack(const Session* session, const TensorCallBack& before, const TensorCallBack& end,
bool sync = false) const;
/**
* @brief run session.
* @param session given session.
* @param before callback before each op. return true to run the op; return false to skip the op.
* @param end callback after each op. return true to continue running; return false to interrupt the session.
* @param sync synchronously wait for finish of execution or not.
* @return result of running.
*/
ErrorCode runSessionWithCallBackInfo(const Session* session, const TensorCallBackWithInfo& before,
const TensorCallBackWithInfo& end, bool sync = false) const;
/**
* @brief get input tensor for given name.
* @param session given session.
* @param name given name. if NULL, return first input.
* @return tensor if found, NULL otherwise.
*/
Tensor* getSessionInput(const Session* session, const char* name);
/**
* @brief get output tensor for given name.
* @param session given session.
* @param name given name. if NULL, return first output.
* @return tensor if found, NULL otherwise.
*/
Tensor* getSessionOutput(const Session* session, const char* name);
/**
* @brief get all output tensors.
* @param session given session.
* @return all output tensors mapped with name.
*/
const std::map<std::string, Tensor*>& getSessionOutputAll(const Session* session) const;
/**
* @brief get all input tensors.
* @param session given session.
* @return all input tensors mapped with name.
*/
const std::map<std::string, Tensor*>& getSessionInputAll(const Session* session) const;
public:
/**
* @brief resize given tensor.
* @param tensor given tensor.
* @param dims new dims. at most 6 dims.
*/
void resizeTensor(Tensor* tensor, const std::vector<int>& dims);
/**
* @brief resize given tensor by nchw.
* @param tensor given tensor.
* @param batch batch size / N.
* @param channel channel count / C.
* @param height height / H.
* @param width width / W.
*/
void resizeTensor(Tensor* tensor, int batch, int channel, int height, int width);
/**
* @brief get backend used to create given tensor.
* @param session given session.
* @param tensor given tensor.
* @return backend used to create given tensor, may be NULL.
*/
const Backend* getBackend(const Session* session, const Tensor* tensor) const;
/**
* @brief get business code (model identifier).
* @return business code.
*/
const char* bizCode() const;
private:
static Interpreter* createFromBufferInternal(Content* net);
Content* mNet = nullptr;
Interpreter(Content* net);
Interpreter(const Interpreter&) = delete;
Interpreter(const Interpreter&&) = delete;
Interpreter& operator=(const Interpreter&) = delete;
Interpreter& operator=(const Interpreter&&) = delete;
};
Here we can see every function involved in the inference flow: creating sessions, running them, fetching inputs and outputs, and so on. Clearly the whole MNN inference pipeline hangs off the Interpreter. Note that it is not a singleton: the constructor is private and copy/move are deleted, so instances can only be obtained through the static factories createFromFile / createFromBuffer, but nothing stops you from creating several interpreters.
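To make the session-facing API concrete, here is a hedged sketch of resizing an input and hooking per-op callbacks, reusing net and session from the sketch above (the dims and the callback bodies are assumptions):
// Resize the first input, then let the session reallocate its buffers.
auto input = net->getSessionInput(session, nullptr);
net->resizeTensor(input, {1, 3, 224, 224}); // NCHW; dims are an assumption
net->resizeSession(session);
// Per the header comment, re-fetch output tensors after any resize.
auto output = net->getSessionOutput(session, nullptr);

// Per-op callbacks: "before" may skip an op, "end" may abort the session.
MNN::TensorCallBack before = [](const std::vector<MNN::Tensor*>&, const std::string& opName) {
    return true; // run every op
};
MNN::TensorCallBack end = [](const std::vector<MNN::Tensor*>&, const std::string& opName) {
    MNN_PRINT("finished op: %s\n", opName.c_str());
    return true; // keep running
};
net->runSessionWithCallBack(session, before, end);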
Creation
Let's look at how an Interpreter is created from a model file.
Interpreter* Interpreter::createFromFile(const char* file) {
if (nullptr == file) {
MNN_PRINT("NULL file for create interpreter\n");
return nullptr;
}
std::unique_ptr<FileLoader> loader(new FileLoader(file));
if (!loader->valid()) {
MNN_PRINT("Create interpreter failed, open %s error\n", file);
return nullptr;
}
bool result = loader->read();
if (!result) {
MNN_PRINT("Read file error\n");
return nullptr;
}
if (loader->size() == 0) {
MNN_PRINT("Create interpreter failed, %s is empty\n", file);
return nullptr;
}
auto net = new Content; // struct that holds the model data
bool success = loader->merge(net->buffer);
if (!success) {
return nullptr;
}
loader.reset();
return createFromBufferInternal(net);
}
This struct holds the model data. Some of its fields may not make sense yet; they will become clear as we read deeper into the code.
struct Content {
AutoStorage<uint8_t> buffer; // raw bytes read from the model file
const Net* net = nullptr; // the structure that actually holds the network topology and weights
std::vector<std::unique_ptr<Session>> sessions; // session set; one model may hold several sessions
std::map<const Tensor*, const Session*> tensorMap; // maps each tensor to its session
Interpreter::SessionMode callBackMode = Interpreter::Session_Debug;
Interpreter::SessionMode inputMode = Interpreter::Session_Input_Inside;
AutoStorage<uint8_t> cacheBuffer;
size_t cacheOffset = 0;
std::string cacheFile;
std::mutex lock;
};
After the model file has been read, the interpreter is created by createFromBufferInternal.
Interpreter* Interpreter::createFromBufferInternal(Content* net) {
if (nullptr == net) {
MNN_PRINT("Buffer is null for create interpreter\n");
return nullptr;
}
/**
* Verify the buffer, then deserialize the model parameters with
* flatbuffers and fill the net field of Content.
*/
flatbuffers::Verifier verify((const uint8_t*)(net->buffer.get()), net->buffer.size());
if (false == VerifyNetBuffer(verify)) {
MNN_PRINT("Invalidate buffer to create interpreter\n");
delete net;
return nullptr;
}
net->net = GetNet(net->buffer.get());
if (nullptr == net->net->oplists()) {
MNN_ERROR("Model has no oplist\n");
delete net;
return nullptr;
}
int opSize = net->net->oplists()->size();
for (int i = 0; i < opSize; ++i) {
auto op = net->net->oplists()->GetAs<Op>(i);
if (nullptr == op || nullptr == op->outputIndexes()) {
MNN_ERROR("Invalid Model, the %d op is empty\n", i);
delete net;
return nullptr;
}
}
return new Interpreter(net);
}
The flatbuffers library serializes structured data into a flat buffer, and deserializes the other way: filling the structs back in from a buffer.
See https://blog.csdn.net/hsqyc/article/details/115719054 for an introduction.
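A toy round trip illustrating the idea (the Monster schema and the generated CreateMonster/GetMonster follow the flatbuffers tutorial convention and are not part of MNN; assume flatc has compiled table Monster { name: string; } root_type Monster;):
#include <cstdio>
#include "monster_generated.h" // generated by flatc from the toy schema

int main() {
    flatbuffers::FlatBufferBuilder builder;
    // serialize: build the table directly inside the builder's buffer
    auto name = builder.CreateString("orc");
    auto monster = CreateMonster(builder, name);
    builder.Finish(monster);
    // deserialize: a zero-copy view over the same bytes, no parsing pass
    auto m = GetMonster(builder.GetBufferPointer());
    printf("%s\n", m->name()->c_str());
    return 0;
}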
The constructor itself is trivial:
Interpreter::Interpreter(Content* net) {
MNN_ASSERT(nullptr != net);
mNet = net;
}
As for what Net actually is, here is the flatbuffers-generated code:
struct Net FLATBUFFERS_FINAL_CLASS : private flatbuffers::Table {
typedef NetT NativeTableType;
static const flatbuffers::TypeTable *MiniReflectTypeTable() {
return NetTypeTable();
}
enum FlatBuffersVTableOffset FLATBUFFERS_VTABLE_UNDERLYING_TYPE {
VT_BIZCODE = 4,
VT_EXTRATENSORDESCRIBE = 6,
VT_GPULIBRARY = 8,
VT_OPLISTS = 10,
VT_OUTPUTNAME = 12,
VT_PREFERFORWARDTYPE = 14,
VT_SOURCETYPE = 16,
VT_TENSORNAME = 18,
VT_TENSORNUMBER = 20,
VT_USAGE = 22,
VT_SUBGRAPHS = 24
};
... omitted ...
};
The definition in the .fbs schema file is:
/**
* Net structure
**/
table Net {
bizCode: string; // set when a model is converted to MNN format
extraTensorDescribe: [TensorDescribe];
gpulibrary: GpuLibrary;
oplists: [Op];
outputName: [string];
preferForwardType: ForwardType = CPU;
sourceType: NetSource = CAFFE;
tensorName: [string];
tensorNumber: int = 0;
usage:Usage = INFERENCE; // reserved for future compatibility
// Subgraphs of the Net (the model's graph structure).
subgraphs: [SubGraphProto];
}
table TensorDescribe {
blob: Blob;
index: int;
name: string;
regions:[Region];
}
table SubGraphProto {
// Subgraph unique name.
name: string;
// The ids of input tensors.
inputs: [int];
// The ids of output tensors.
outputs: [int];
// All tensor names.
// The id of each tensor is the index in the vector names.
tensors: [string];
// Nodes of the subgraph. An Op is one layer's operation: convolution, pooling, ReLU, etc.
nodes: [Op];
}
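To connect the schema back to the generated API: every field becomes an accessor on the generated Net struct. A hedged sketch of walking a model buffer (buffer is assumed to have already passed VerifyNetBuffer, as in createFromBufferInternal above):
auto net = GetNet(buffer); // zero-copy view, same call as in createFromBufferInternal
if (net->bizCode()) { // bizCode: string
    MNN_PRINT("bizCode: %s\n", net->bizCode()->c_str());
}
if (net->tensorName()) { // tensorName: [string]
    MNN_PRINT("tensor count: %d\n", (int)net->tensorName()->size());
}
for (int i = 0; i < (int)net->oplists()->size(); ++i) { // oplists: [Op]
    auto op = net->oplists()->GetAs<Op>(i);
    // op->outputIndexes() etc., exactly as createFromBufferInternal uses them
}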