MACE Micro code analysis: how the NetDef and Graph files are generated

shuai_wen

Article link: https://blog.csdn.net/u011279649/article/details/110231639

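The file micro_net_def_data.h holds one serialized object of class NetDef. By reading the bytes in micro_net_def_data.h and working out what each one means, we can see exactly what information a NetDef file stores.

Test method: run inference on a single input sample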

./micro/tools/cmake/cmake-build-host.sh -DMACE_MICRO_ENABLE_EXAMPLES=ON -DMACE_MICRO_ENABLE_TOOLS=OFF -DMICRO_MODEL_NAME=har -DMICRO_DATA_NAME=har

The code uses many macros. To expand them:

g++ -E micro/model/operator_def.cc -o operator_def.i -I.

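Code for reading the file contents

#include <cstdio>

// kNetDef and kGraphData are the byte arrays defined in the generated model
// headers (kNetDef is in micro_net_def_data.h); include those headers here.

int main() {
  // Dump 24 rows of 8 bytes, starting at the position under inspection.
  for (int j = 0; j < 24; j++) {
    for (int i = 0; i < 8; i++) {
      printf("%02x ", kNetDef[0x2c + 0x5a4 + 0x98c + 8 * j + i]);
      // printf("%02x ", kGraphData[0xb4 + 8 * j + i]);
    }
    printf("\n");
  }
  return 0;
}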

The protobuf message for NetDef and the corresponding class: its data members and the functions that retrieve model information

Protobuf definition

message NetDef {
repeated OperatorDef op = 1;
repeated Argument arg = 2;
repeated ConstTensor tensors = 3;
optional DataType data_type = 4 [default = DT_FLOAT];

repeated InputOutputInfo input_info = 100;
repeated InputOutputInfo output_info = 101;
}

Corresponding class implementation

class NetDef {
 private:
  SerialArray<OperatorDef> ops_;
  SerialArray<Argument> args_;
  SerialArray<ConstTensor> tensors_;
  SerialInt32 data_type_;
  SerialArray<InputOutputInfo> input_infos_;
  SerialArray<InputOutputInfo> output_infos_;
};

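Analysis of the corresponding file contents

SerialArray is the basic class that represents a repeated protobuf field. It stores a size and an offset, which together act like a pointer (indirect addressing).

SerialArray is a template struct. The type parameter T does not appear anywhere in the struct body, so are several differently named SerialArray types still generated? Yes: every SerialArray<T> instantiation is a distinct type, even though they all share the same two-field layout.

template<typename T>
struct SerialArray {
  offset_size_t size_;
  offset_size_t offset_;
  SerialArray() : size_(0), offset_(0) {}
};

SerialArray<xxx> is a class type, and its default constructor uses a member initializer list to set both fields to 0.

In addition, a protocol buffer repeated field is an array, represented as a SerialArray: a size and an offset (the offset is relative to the enclosing object, not to the start of the file).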
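The size/offset indirection described above can be sketched as follows. This is a minimal illustration rather than MACE's actual code: ArrayRef, ReadU32, ReadArrayRef, and ElementPos are hypothetical helper names, and the bytes are assumed to be little-endian, as in the dumps in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of a serialized SerialArray field: two 32-bit values.
struct ArrayRef {
  uint32_t size;    // number of elements
  uint32_t offset;  // byte offset from the start of the enclosing object
};

// Reads a little-endian uint32 at byte position pos.
inline uint32_t ReadU32(const uint8_t *buf, size_t pos) {
  return static_cast<uint32_t>(buf[pos]) |
         (static_cast<uint32_t>(buf[pos + 1]) << 8) |
         (static_cast<uint32_t>(buf[pos + 2]) << 16) |
         (static_cast<uint32_t>(buf[pos + 3]) << 24);
}

// Reads the {size, offset} pair stored at field_pos.
inline ArrayRef ReadArrayRef(const uint8_t *buf, size_t field_pos) {
  return ArrayRef{ReadU32(buf, field_pos), ReadU32(buf, field_pos + 4)};
}

// Absolute position of element index:
// start of the enclosing object + array offset + index * element size.
inline size_t ElementPos(size_t owner_pos, const ArrayRef &ref,
                         size_t index, size_t elem_size) {
  assert(index < ref.size);
  return owner_pos + ref.offset + index * elem_size;
}
```

Feeding it the eight bytes 08 00 00 00 2c 00 00 00 gives a size of 8 and an offset of 0x2c; with the enclosing object at position 0, element 0 then starts at 0x2c.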

Contents of one instantiated NetDef object

class NetDef {  // a NetDef contains ops, args, const tensors, data_type, and input/output info
 private:
  SerialArray<OperatorDef> ops_;
  SerialArray<Argument> args_;
  SerialArray<ConstTensor> tensors_;
  SerialInt32 data_type_;
  SerialArray<InputOutputInfo> input_infos_;
  SerialArray<InputOutputInfo> output_infos_;
};

Comparing against the data members of the class above, we can see that kNetDef begins with the populated data members of class NetDef:

ops_: 8 OperatorDef operators
08 00 00 00 2c 00 00 00
args_
03 00 00 00 8c 02 00 00
tensors_
09 00 00 00 04 03 00 00
data_type_: DT_FLOAT = 1, stored inline as the value itself
01 00 00 00
input_infos_
01 00 00 00 68 05 00 00
output_infos_
01 00 00 00 90 05 00 00
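As a cross-check of the header bytes listed above, the sketch below verifies that each array begins exactly where the previous one ends. The element sizes are assumptions, derived by counting 4 bytes per scalar member and 8 bytes per SerialArray, SerialString, or SerialBytes member in the class listings in this article; kNetDefHeader, ReadU32, and ArraysAreBackToBack are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The first 0x2c bytes of kNetDef, copied from the listing above.
static const uint8_t kNetDefHeader[0x2c] = {
    0x08, 0x00, 0x00, 0x00, 0x2c, 0x00, 0x00, 0x00,  // ops_
    0x03, 0x00, 0x00, 0x00, 0x8c, 0x02, 0x00, 0x00,  // args_
    0x09, 0x00, 0x00, 0x00, 0x04, 0x03, 0x00, 0x00,  // tensors_
    0x01, 0x00, 0x00, 0x00,                          // data_type_
    0x01, 0x00, 0x00, 0x00, 0x68, 0x05, 0x00, 0x00,  // input_infos_
    0x01, 0x00, 0x00, 0x00, 0x90, 0x05, 0x00, 0x00,  // output_infos_
};

// Assumed serialized element sizes in bytes (counted from the member lists).
constexpr uint32_t kOperatorDefSize = 0x4c;
constexpr uint32_t kArgumentSize = 0x28;
constexpr uint32_t kConstTensorSize = 0x44;
constexpr uint32_t kInputOutputInfoSize = 0x28;

// Reads a little-endian uint32 at byte position pos.
inline uint32_t ReadU32(const uint8_t *buf, size_t pos) {
  return static_cast<uint32_t>(buf[pos]) |
         (static_cast<uint32_t>(buf[pos + 1]) << 8) |
         (static_cast<uint32_t>(buf[pos + 2]) << 16) |
         (static_cast<uint32_t>(buf[pos + 3]) << 24);
}

// Returns true if every array starts where the previous array ends.
inline bool ArraysAreBackToBack(const uint8_t *h) {
  assert(h != nullptr);
  const uint32_t ops_n = ReadU32(h, 0), ops_off = ReadU32(h, 4);
  const uint32_t args_n = ReadU32(h, 8), args_off = ReadU32(h, 12);
  const uint32_t tensors_n = ReadU32(h, 16), tensors_off = ReadU32(h, 20);
  const uint32_t in_n = ReadU32(h, 28), in_off = ReadU32(h, 32);
  const uint32_t out_off = ReadU32(h, 40);
  return ops_off + ops_n * kOperatorDefSize == args_off &&
         args_off + args_n * kArgumentSize == tensors_off &&
         tensors_off + tensors_n * kConstTensorSize == in_off &&
         in_off + in_n * kInputOutputInfoSize == out_off;
}
```

With the bytes above the check holds: 0x2c + 8*0x4c = 0x28c, 0x28c + 3*0x28 = 0x304, 0x304 + 9*0x44 = 0x568, and 0x568 + 0x28 = 0x590. The header itself is 0x2c bytes long, which is why ops_ starts at 0x2c.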

Next, we analyze the simpler InputOutputInfo

Inputs and outputs are described by the same class, InputOutputInfo. It holds the name and the shape, both set in the yml file.

message InputOutputInfo {
optional string name = 1;
optional int32 node_id = 2;
repeated int32 dims = 3;
optional int32 max_byte_size = 4; // only support 32-bit len
optional DataType data_type = 5 [default = DT_FLOAT];
optional int32 data_format = 6 [default = 1]; // NHWC
optional float scale = 7;
optional int32 zero_point = 8;
}

class InputOutputInfo {
 private:
  SerialString name_;  // tensor name
  SerialInt32 node_id_;
  SerialArray<SerialInt32> dims_;
  SerialInt32 max_byte_size_;
  SerialInt32 data_type_;
  SerialInt32 data_format_;
  SerialFloat scale_;
  SerialInt32 zero_point_;
};

01 00 00 00 68 05 00 00 (input)

There is one input.

10 00 00 00 88 09 00 00: name: length 0x10, offset within the InputOutputInfo 0x0988. The InputOutputInfo itself sits at offset 0x0568 in NetDef, so the bytes at kNetDef[0x568+0x988+...] are 63 6f 6e 76 32 64 5f 69 ...: "conv2d_input:0"
00 00 00 00: node_id: 0
04 00 00 00 98 09 00 00: dims: 4 entries, stored at kNetDef[0x568+0x998+...]; shape: 01 00 00 00, 1c 00 00 00, 1c 00 00 00, 01 00 00 00 (1, 28, 28, 1)
00 00 00 00: max_byte_size
01 00 00 00: data_type (the input data type)
01 00 00 00: data_format
00 00 00 00: scale
00 00 00 00: zero_point

01 00 00 00 90 05 00 00 (output)

14 00 00 00 80 09 00 00: name at kNetDef[0x590+0x980+...]: 64 65 6e 73 65 5f 31 2f 53 6f 66 74 6d 61 78 3a 30: "dense_1/Softmax:0"
00 00 00 00: node_id (apparently unused?)
02 00 00 00 94 09 00 00: dims: 01 00 00 00 0a 00 00 00: (1, 10)
00 00 00 00: max_byte_size
01 00 00 00: data_type
01 00 00 00: data_format
00 00 00 00: scale
00 00 00 00: zero_point
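Once the absolute position of a dims array or a name is known, turning the raw bytes into values is straightforward. The sketch below assumes little-endian int32 values and assumes that the bytes padding a string out to its stored length are NUL; DecodeDims and DecodeString are hypothetical helper names, not MACE functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decodes count little-endian int32 values starting at bytes.
inline std::vector<int32_t> DecodeDims(const uint8_t *bytes, size_t count) {
  assert(bytes != nullptr);
  std::vector<int32_t> dims;
  dims.reserve(count);
  for (size_t i = 0; i < count; ++i) {
    const uint8_t *p = bytes + 4 * i;
    const uint32_t v = static_cast<uint32_t>(p[0]) |
                       (static_cast<uint32_t>(p[1]) << 8) |
                       (static_cast<uint32_t>(p[2]) << 16) |
                       (static_cast<uint32_t>(p[3]) << 24);
    dims.push_back(static_cast<int32_t>(v));
  }
  return dims;
}

// Decodes a string stored in len bytes, stopping at the first NUL byte
// (assumption: any padding after the text is NUL).
inline std::string DecodeString(const uint8_t *bytes, size_t len) {
  assert(bytes != nullptr);
  size_t n = 0;
  while (n < len && bytes[n] != 0) ++n;
  return std::string(reinterpret_cast<const char *>(bytes), n);
}
```

Applied to the 16 dims bytes of the input (01 00 00 00 1c 00 00 00 1c 00 00 00 01 00 00 00), DecodeDims with a count of 4 returns (1, 28, 28, 1).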

The OperatorDef entries in NetDef

message OperatorDef {
repeated string input = 1;
repeated string output = 2;
optional string name = 3;
optional string type = 4;
optional int32 device_type = 5;
repeated Argument arg = 6;
repeated OutputShape output_shape = 7;
repeated DataType output_type = 8;
repeated QuantizeActivationInfo quantize_info = 9;

// for mace it is mem_id, for micro, it is mem_offset
repeated int32 mem_id = 10;

// for hexagon mace-nnlib
optional uint32 node_id = 100;
optional uint32 op_id = 101;
optional uint32 padding = 102;
repeated NodeInput node_input = 103;
repeated int32 out_max_byte_size = 104; // only support 32-bit len
}

class OperatorDef {
 private:
  SerialArray<SerialString> inputs_;
  SerialArray<SerialString> outputs_;
  SerialString name_;
  SerialString type_;

  SerialInt32 device_type_;
  SerialArray<Argument> args_;
  SerialArray<OutputShape> output_shapes_;  // why record each op's output shape and data type: to allocate memory?
  SerialArray<DataType> output_types_;
  SerialArray<QuantizeActivationInfo> quantize_info_;
  SerialArray<SerialInt32> mem_offsets_;
};

ops_: 8 OperatorDef operators
08 00 00 00 2c 00 00 00

NetDef contains 8 OperatorDef objects, laid out one after another starting at 0x2c. Once we can locate the actual contents of inputs_ in one OperatorDef, every other field is found the same way. Remember to add both the offset of the parent object and the element's offset from the start of the array.

Contents at 0x2c:

03 00 00 00 8c 05 00 00: inputs_. At 0x2c+0x58c (the OperatorDef's offset in NetDef, 0x2c, must be added) there are 3 SerialString entries laid out in sequence: 10 00 00 00 74 09 00 00, 10 00 00 00 7c 09 00 00, 10 00 00 00 84 09 00 00

String contents of input1: 0x2c + 0x58c + 0x974: "conv2d_input:0"; input2: 0x2c + 0x58c + 0x97c + 8 (the array index offset must be added as well): "conv2d/kernel:0"; input3: 0x2c + 0x58c + 0x984 + 2*8: "conv2d/bias:0"
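The address arithmetic for the three input names can be written as one formula: start of the OperatorDef, plus the array offset, plus the index times the size of a SerialString entry, plus the string's own offset. The sketch below is only an illustration; InputStringPos and kSerialStringSize are hypothetical names, and the 8-byte entry size is read off the dump (a 4-byte size followed by a 4-byte offset).

```cpp
#include <cassert>
#include <cstddef>

// One SerialString entry: size_ and offset_, 4 bytes each (taken from the dump).
constexpr size_t kSerialStringSize = 8;

// owner_pos:  start of the OperatorDef inside kNetDef
// array_off:  offset_ of inputs_, relative to the OperatorDef
// index:      which input
// string_off: offset_ stored in that SerialString entry, relative to the entry
constexpr size_t InputStringPos(size_t owner_pos, size_t array_off,
                                size_t index, size_t string_off) {
  return owner_pos + array_off + index * kSerialStringSize + string_off;
}

// The three inputs of the first operator, using the values from the dump.
static_assert(InputStringPos(0x2c, 0x58c, 0, 0x974) == 0xf2c, "input1");
static_assert(InputStringPos(0x2c, 0x58c, 1, 0x97c) == 0xf3c, "input2");
static_assert(InputStringPos(0x2c, 0x58c, 2, 0x984) == 0xf4c, "input3");
```

The three results are 0x10 bytes apart, which matches the stored length 0x10 of each name, so the strings sit back to back.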

01 00 00 00 a4 05 00 00: outputs_: "conv2d/Relu:0"
0c 00 00 00 ac 05 00 00: name_: "conv2d_act"
08 00 00 00 b8 05 00 00: type_: "Conv2D"
00 00 00 00: device_type_

07 00 00 00 c0 05 00 00: args_

01 00 00 00 d8 06 00 00: output_shapes_

00 00 00 00 38 00 00 00: output_types_: the count is 0, yet an offset is still stored? (0x38 is the position of this offset field within the OperatorDef, so an empty array appears to point at itself.)

00 00 00 00 40 00 00 00: quantize_info_: the count is 0

01 00 00 00 e0 06 00 00: mem_offsets_: 1 entry at 0x2c+0x6e0 with value 0; this memory holds the operator's output.

OperatorDef output: OutputShape

class OutputShape : public Serialize {
 private:
  SerialArray<SerialInt32> dims_;
};

0x2c + 0x6d8: 04 00 00 00 c8 08 00 00: 4 int32 values

0x2c + 0x6d8 + 0x8c8: (output shape: 1, 28, 28, 32) 01 00 00 00 1c 00 00 00 1c 00 00 00 20 00 00 00

OpDef Argument

class Argument : public Serialize {

private:
SerialString name_;
SerialFloat f_;
SerialInt32 i_;
SerialBytes s_;
SerialArray<SerialFloat> floats_;
SerialArray<SerialInt32> ints_;
};

07 00 00 00 c0 05 00 00 : args

1]
04 00 00 00 80 09 00 00 :T
00 00 00 00 01 00 00 00
00 00 00 00 14 00 00 00
00 00 00 00 1c 00 00 00
00 00 00 00 24 00 00 00
2]
10 00 00 00 5c 09 00 00 : 0x2c + 0x5c0 + 0x95c + 5*8 (the offset from the start of the array to this element): "framework_type"
00 00 00 00 04 00 00 00
00 00 00 00 14 00 00 00
00 00 00 00 1c 00 00 00
00 00 00 00 24 00 00 00
3]
0c 00 00 00 44 09 00 00 :"data_format"
00 00 00 00 e8 03 00 00
00 00 00 00 14 00 00 00
00 00 00 00 1c 00 00 00
00 00 00 00 24 00 00 00
4]
08 00 00 00 28 09 00 00 :"padding"
00 00 00 00 01 00 00 00
00 00 00 00 14 00 00 00
00 00 00 00 1c 00 00 00
00 00 00 00 24 00 00 00
5]
08 00 00 00 08 09 00 00 "strides": 0x2c + 0x5c0 +0x908+ 4*5*8
00 00 00 00 00 00 00 00
00 00 00 00 14 00 00 00
00 00 00 00 1c 00 00 00
02 00 00 00 10 09 00 00 :[0x2c + 0x5c0 +0x910 + 4*5*8]: 01 00 00 00, 01 00 00 00
6]
0c 00 00 00 f0 08 00 00 : "dilations"
00 00 00 00 00 00 00 00
00 00 00 00 14 00 00 00
00 00 00 00 1c 00 00 00
02 00 00 00 fc 08 00 00
7]
0c 00 00 00 dc 08 00 00 :"activation"
00 00 00 00 00 00 00 00
08 00 00 00 e8 08 00 00 : "RELU"
00 00 00 00 1c 00 00 00
00 00 00 00 24 00 00 00
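Each Argument in the dump above occupies five rows of 8 bytes, so element k of the args array starts k*40 bytes after the array start. The sketch below mirrors that layout with a plain struct to make the stride explicit. ArgumentRecord and ArgumentPos are hypothetical names, and the struct is an assumed reading of the dump, not MACE's Argument class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of one serialized Argument (40 bytes in the dump).
struct ArgumentRecord {
  uint32_t name_size, name_off;      // SerialString name_
  float f;                           // SerialFloat f_
  int32_t i;                         // SerialInt32 i_
  uint32_t s_size, s_off;            // SerialBytes s_
  uint32_t floats_size, floats_off;  // SerialArray<SerialFloat> floats_
  uint32_t ints_size, ints_off;      // SerialArray<SerialInt32> ints_
};
static_assert(sizeof(ArgumentRecord) == 40, "five 8-byte rows per Argument");

// Start of Argument number index inside kNetDef.
// op_pos:   start of the owning OperatorDef
// args_off: offset_ of args_, relative to the OperatorDef
constexpr size_t ArgumentPos(size_t op_pos, size_t args_off, size_t index) {
  return op_pos + args_off + index * sizeof(ArgumentRecord);
}

// Offsets stored inside an Argument are relative to that Argument's start.
// Fifth argument ("strides"): its name and its ints_ data.
static_assert(ArgumentPos(0x2c, 0x5c0, 4) + 0x908 == 0xf94, "strides name");
static_assert(ArgumentPos(0x2c, 0x5c0, 4) + 0x910 == 0xf9c, "strides ints");
```

The name "strides" is stored with length 8 at 0xf94, and the ints_ data begins at 0xf9c, directly after it.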

class ConstTensor {
 private:
  SerialArray<SerialInt32> dims_;
  DataType data_type_;
  SerialArray<SerialFloat> float_datas_;
  SerialArray<SerialInt32> int32_datas_;
  SerialString name_;
  SerialInt32 offset_;
  SerialInt32 data_size_;
  SerialFloat scale_;
  SerialInt32 zero_point_;
  SerialFloat minval_;
  SerialFloat maxval_;
  SerialBool quantized_;
  SerialUint32 node_id_;
};

tensors {
dims: 32
dims: 1
dims: 3
dims: 3
data_type: DT_FLOAT
name: "conv2d/kernel:0"
offset: 0
data_size: 288
}

09 00 00 00 04 03 00 00: 9 ConstTensor objects

04 00 00 00 f8 0a 00 00: dims_ (32, 3, 3, 1). This differs from the printed dump above (32, 1, 3, 3); presumably the dims were specially handled.
01 00 00 00: data_type_ (float)

00 00 00 00 10 00 00 00: float_datas_ (empty)

00 00 00 00 18 00 00 00: int32_datas_ (empty)

10 00 00 00 08 0b 00 00: name_: 0x304 + 0xb08: "conv2d/kernel:0"

00 00 00 00 20 01 00 00: offset_ (0), data_size_ (0x120)

00 00 00 00 00 00 00 00: scale_, zero_point_

00 00 00 00 00 00 00 00: minval_, maxval_

00 00 00 00 00 00 00 00: quantized_, node_id_
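One small observation on the numbers above: the data_size of 0x120 = 288 equals the product of the dims (32*1*3*3, or equally 32*3*3*1). That suggests data_size counts elements rather than bytes, at least for this tensor; this is an inference from the dump, not something checked against the MACE source. ElementCount below is a hypothetical helper that computes the product.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of elements in a tensor with the given dims (product of all dims).
inline int64_t ElementCount(const std::vector<int32_t> &dims) {
  int64_t count = 1;
  for (int32_t d : dims) {
    assert(d >= 0);
    count *= d;
  }
  return count;
}
```

For conv2d/kernel:0, ElementCount({32, 1, 3, 3}) is 288, the same as the stored data_size.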

Graph data file

message OpContext {
optional int32 op_idx = 1;
// The input info of downstream operator is the output info of upstream
// operator, so there is no output info defined here
repeated uint32 input_infos = 2;
repeated OutputShape output_resize_shapes = 3;
}

message Graph {
repeated OpContext op_contexts = 1;
repeated uint32 input_op_idxs = 2;
// The output info of the last operator, which is not recorded in opcontext,
// is the output of graph
repeated uint32 output_infos = 3;
}

//--------------------------------------------------------------
void Serialize::Uint2OpIOInfo(const OpIOInfo *info) const {
  // const_cast removes the const qualifier so the object can be modified
  OpIOInfo *io_info = const_cast<OpIOInfo *>(info);
  // Reinterpret the bytes that info points to as a uint32_t, then rewrite the OpIOInfo fields
  uint32_t info_data = *(reinterpret_cast<uint32_t *>(io_info));
  io_info->op_def_idx_ = (info_data & 0xffff0000) >> 16;  // high 16 bits: op_def_idx_
  io_info->output_idx_ = (info_data & 0x0000ffff);  // low 16 bits: output_idx_
}

struct OpIOInfo {  // which op, and which output of that op?
  uint16_t op_def_idx_;
  uint16_t output_idx_;
};

The naming here is somewhat confusing. Although every operator input is represented by the same data struct, OpIOInfo, the meaning differs considerably from case to case.
Tensors fall into three categories:
1. When OpIOInfo denotes a model input: op_def_idx is the fixed value kIdxModelInput (0xfffe),
and output_idx_ is the index of the input (a model can have several inputs).
2. When OpIOInfo denotes a trained parameter of the model: op_def_idx is the fixed value kIdxConstTensor (0xffff),
and output_idx_ selects net_def_->tensor(input_info->output_idx_).
3. When OpIOInfo denotes the output of some operator: op_def_idx is the operator's index, as in engine_config_->net_def_->op(op_def_idx),
and output_idx_ is the index of that operator's output (an operator can have several outputs).
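The three categories above can be sketched as a small decoder. The constant names kIdxModelInput and kIdxConstTensor and their values come from the text; DecodedIO, InputKind, DecodeOpIOInfo, and Classify are hypothetical names for illustration. The raw value is the 4 stored bytes read as a little-endian uint32, so the bytes 00 00 fe ff become 0xfffe0000.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t kIdxModelInput = 0xfffe;   // fixed value for model inputs
constexpr uint16_t kIdxConstTensor = 0xffff;  // fixed value for const tensors

enum class InputKind { kModelInput, kConstTensor, kOpOutput };

// Hypothetical decoded form of one OpIOInfo entry.
struct DecodedIO {
  uint16_t op_def_idx;
  uint16_t output_idx;
};

// Splits the raw uint32 the same way Uint2OpIOInfo does:
// high 16 bits are op_def_idx, low 16 bits are output_idx.
inline DecodedIO DecodeOpIOInfo(uint32_t raw) {
  return DecodedIO{static_cast<uint16_t>(raw >> 16),
                   static_cast<uint16_t>(raw & 0xffffu)};
}

// Maps a decoded entry onto one of the three tensor categories.
inline InputKind Classify(const DecodedIO &io) {
  if (io.op_def_idx == kIdxModelInput) return InputKind::kModelInput;
  if (io.op_def_idx == kIdxConstTensor) return InputKind::kConstTensor;
  return InputKind::kOpOutput;
}
```

For example, 0xffff0001 decodes to const tensor number 1, and 0x00080000 decodes to output 0 of operator 8.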

class OutputShape : public Serialize {
 private:
  SerialArray<SerialInt32> dims_;
};

// Description of an operator: op index, inputs (OpIOInfo), OutputShape
class OpContext : public Serialize {
 protected:
  SerialUint32 op_idx_;
  SerialArray<OpIOInfo> input_infos_;
  SerialArray<model::OutputShape> output_resize_shapes_;
};

// Description of the Graph: op contexts, input op indexes, output infos
class Graph : public Serialize {
 protected:
  SerialArray<OpContext> op_contexts_;
  SerialArray<SerialUint32> input_op_idxs_;
  SerialArray<OpIOInfo> output_infos_;
};

0x00:
09 00 00 00 18 00 00 00 //OpContext: size:9, offset:0x18
01 00 00 00 cc 00 00 00 //SerialUint32: size: 1, offset:0xcc
01 00 00 00 d0 00 00 00 //OpIOInfo: size: 1, offset 0xd0

0xcc:
00 00 00 00

0xd0:OpIOInfo
00 00 08 00 // read as a uint32: 0x00080000 -> op_def_idx 8, output_idx 0 (the index of that op's output)

// first OpContext
0x18: OpContext
00 00 00 00 //0
03 00 00 00 bc 00 00 00
01 00 00 00 c8 00 00 00

0x18 + 0xbc:OpIOInfo

00 00 fe ff //ff fe 00 00
00 00 ff ff //ff ff 00 00
01 00 ff ff //ff ff 00 01

0x18 + 0xc8:model::OutputShape
04 00 00 00
84 00 00 00
0x18 + 0xc8+ 0x84: [1, 89, 2, 128]
01 00 00 00
59 00 00 00
02 00 00 00
80 00 00 00

// second OpContext
01 00 00 00 //1
01 00 00 00 bc 00 00 00
01 00 00 00 c0 00 00 00

0x18 + 5*4 + 0xbc: OpIOInfo (each OpContext is 5*4 = 20 bytes)
00 00 00 00: -> 00 00 00 00: output 0 of op 0
0x18 + 5*4 + 0xc0:
04 00 00 00 88 00 00 00
[1, 44, 1, 128]
01 00 00 00
2c 00 00 00
01 00 00 00
80 00 00 00

// third OpContext
02 00 00 00 //2
02 00 00 00 b4 00 00 00
01 00 00 00 bc 00 00 00

00 00 01 00: 00 01 00 00
08 00 ff ff

02 00 00 00 88 00 00 00
0x18 + 2*5*4 + 0xbc + 0x88:
01 00 00 00
00 16 00 00
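The positions used in the Graph walk-through can be reproduced with one stride constant. An OpContext holds a 4-byte op_idx_ and two 8-byte array fields, so it occupies 20 bytes; 9 of them starting at 0x18 end at 0xcc, which is exactly where the input_op_idxs_ data begins. The sketch below uses hypothetical names (kOpContextSize, OpContextPos) and only the offsets from the dump.

```cpp
#include <cassert>
#include <cstddef>

// op_idx_ (4) + input_infos_ (8) + output_resize_shapes_ (8)
constexpr size_t kOpContextSize = 20;

// Start of OpContext number index, given where the array starts.
constexpr size_t OpContextPos(size_t array_pos, size_t index) {
  return array_pos + index * kOpContextSize;
}

// 9 OpContexts starting at 0x18 end where input_op_idxs_ begins (0xcc).
static_assert(OpContextPos(0x18, 9) == 0xcc, "array ends at 0xcc");

// Offsets stored inside an OpContext are relative to that OpContext.
static_assert(OpContextPos(0x18, 0) + 0xbc == 0xd4, "op 0 input_infos_");
static_assert(OpContextPos(0x18, 1) + 0xbc == 0xe8, "op 1 input_infos_");
static_assert(OpContextPos(0x18, 2) + 0xbc + 0x88 == 0x184, "op 2 dims");
```

The dims of the third OpContext's OutputShape therefore start at 0x184, where the bytes 01 00 00 00 00 16 00 00 give the shape (1, 5632).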