TVM Debugger

zxros10

Original article: https://tvm.apache.org/docs/arch/debugger.html

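The TVM Debugger is an interface for debugging the execution of TVM computation graphs. It provides access to the graph structure and tensor values at TVM runtime.

Debug Exchange Format

1. Computational graph

The graph optimized by Relay is stored in a serialized JSON format. The JSON file contains all the information about the graph. The UX can either use this graph directly or convert it into a format that the UX understands.

The Graph JSON format is explained below.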

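1. nodes: each node in the JSON is either a placeholder or a computational node. Nodes are stored as a list. A node contains the following information:

op: operation type. null means the node is a placeholder/variable/input node; tvm_op means the node can be executed
name: name of the node
inputs: positions of the inputs for this operation. inputs is a list of tuples of the form (nodeid, index, version). (Optional)
attrs: attributes of the node, containing the following information

(1) flatten_data: whether the data needs to be flattened before execution

(2) func_name: name of the fused function; it corresponds to the symbol in the library generated by the Relay compilation process.

(3) num_inputs: number of inputs for this node

(4) num_outputs: number of outputs this node produces

2. arg_nodes: a list of indices of the nodes that are placeholders/variables/inputs or constants/params of the graph.

3. heads: a list of entries that form the outputs of the graph.

4. node_row_ptr: stores the history of the forward path, so you can skip constructing the entire graph in inference tasks.

5. attrs: can contain version numbers or similar helpful information:

(1) storage_id: memory slot id of each node in the storage layout.

(2) dtype: data type of each node (enum value).

(3) dltype: data type of each node, in order.

(4) shape: the rank-k shape of each node.

(5) device_index: device assignment for each entry in the graph.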

Example of a dumped graph:

{
  "nodes": [                                    # List of nodes
    {
      "op": "null",                             # operation type = null, this is a placeholder/variable/input or constant/param node
      "name": "x",                              # Name of the argument node
      "inputs": []                              # inputs for this node, its none since this is an argument node
    },
    {
      "op": "tvm_op",                           # operation type = tvm_op, this node can be executed
      "name": "relu0",                          # Name of the node
      "attrs": {                                # Attributes of the node
        "flatten_data": "0",                    # Whether this data need to be flattened
        "func_name": "fuse_l2_normalize_relu",  # Fused function name, corresponds to the symbol in the lib generated by compilation process
        "num_inputs": "1",                      # Number of inputs for this node
        "num_outputs": "1"                      # Number of outputs this node produces
      },
      "inputs": [[0, 0, 0]]                     # Position of the inputs for this operation
    }
  ],
  "arg_nodes": [0],                             # Which all nodes in this are argument nodes
  "node_row_ptr": [0, 1, 2],                    # Row indices for faster depth first search
  "heads": [[1, 0, 0]],                         # Position of the output nodes for this operation
  "attrs": {                                    # Attributes for the graph
    "storage_id": ["list_int", [1, 0]],         # memory slot id for each node in the storage layout
    "dtype": ["list_int", [0, 0]],              # Datatype of each node (enum value)
    "dltype": ["list_str", [                    # Datatype of each node in order
        "float32",
        "float32"]],
    "shape": ["list_shape", [                   # Shape of each node k order
        [1, 3, 20, 20],
        [1, 3, 20, 20]]],
    "device_index": ["list_int", [1, 1]]        # Device assignment for each node in order
  }
}
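To make the field descriptions above more concrete, here is a minimal sketch that reads such a dump with the Python standard library and prints, for each node, its kind, the names of its inputs, and its shape and dltype. The JSON is a copy of the example with the inline `#` annotations removed so that it parses. The helper name `summarize_graph` is hypothetical, and using `node_row_ptr[node_id]` as the index into the per-entry attribute lists is an assumption about how entries are laid out; it holds for this example, where every node has a single output.

```python
import json

# Copy of the example dump without the inline annotations, so json.loads accepts it.
GRAPH_JSON = """
{
  "nodes": [
    {"op": "null", "name": "x", "inputs": []},
    {"op": "tvm_op", "name": "relu0",
     "attrs": {"flatten_data": "0", "func_name": "fuse_l2_normalize_relu",
               "num_inputs": "1", "num_outputs": "1"},
     "inputs": [[0, 0, 0]]}
  ],
  "arg_nodes": [0],
  "node_row_ptr": [0, 1, 2],
  "heads": [[1, 0, 0]],
  "attrs": {
    "storage_id": ["list_int", [1, 0]],
    "dtype": ["list_int", [0, 0]],
    "dltype": ["list_str", ["float32", "float32"]],
    "shape": ["list_shape", [[1, 3, 20, 20], [1, 3, 20, 20]]],
    "device_index": ["list_int", [1, 1]]
  }
}
"""


def summarize_graph(graph_json):
    """Return (per-node summary, output node names) for a Graph JSON string."""
    graph = json.loads(graph_json)
    nodes = graph["nodes"]
    row_ptr = graph["node_row_ptr"]
    # Graph-level attrs are stored as [type_tag, values]; take the values.
    shapes = graph["attrs"]["shape"][1]
    dltypes = graph["attrs"]["dltype"][1]

    summary = []
    for node_id, node in enumerate(nodes):
        # Assumption: row_ptr[node_id] is the entry index of the node's first output.
        entry = row_ptr[node_id]
        summary.append({
            "name": node["name"],
            "kind": "input/param" if node["op"] == "null" else "operator",
            # Each input is a (nodeid, index, version) tuple.
            "inputs": [nodes[src]["name"] for src, _index, _version in node.get("inputs", [])],
            "shape": shapes[entry],
            "dltype": dltypes[entry],
        })
    outputs = [nodes[nid]["name"] for nid, _index, _version in graph["heads"]]
    return summary, outputs


summary, outputs = summarize_graph(GRAPH_JSON)
for row in summary:
    print(row)
print("graph outputs:", outputs)
```

Because the script only depends on the json module, it can be pointed at a graph dump without a TVM installation.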

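2. Tensor dumping

The tensors returned after graph execution are of type tvm.ndarray. All tensors are saved as binary bytes in a serialized format. These bytes can be loaded through the load_params API.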

with open(path_params, "rb") as fi:
    loaded_params = bytearray(fi.read())

module.load_params(loaded_params)

How to use the debugger

1. In config.cmake, set USE_PROFILER to ON

# Whether enable additional graph debug functions
set(USE_PROFILER ON)

2. Run make to build tvm, which produces libtvm_runtime.so

3. In the frontend script, instead of from tvm.contrib import graph_executor, import GraphModuleDebug from tvm.contrib.debugger.debug_executor:

from tvm.contrib.debugger.debug_executor import GraphModuleDebug
m = GraphModuleDebug(
    lib["debug_create"]("default", dev),
    [dev],
    lib.graph_json,
    dump_root="/tmp/tvmdbg",
)
# set inputs
m.set_input('data', tvm.nd.array(data.astype(dtype)))
m.set_input(**params)
# execute
m.run()
tvm_out = m.get_output(0, tvm.nd.empty(out_shape, dtype)).numpy()

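4. If the network was previously exported to an external library with lib.export_library("network.so"), that is, as a shared object file/dynamic link library, the debug runtime is initialized slightly differently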

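from tvm.contrib.debugger import debug_executor

lib = tvm.runtime.load_module("network.so")
# debug_executor.create accepts dump_root; graph_executor.create does not
m = debug_executor.create(lib["get_graph_json"](), lib, dev, dump_root="/tmp/tvmdbg")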
# set inputs
m.set_input('data', tvm.nd.array(data.astype(dtype)))
m.set_input(**params)
# execute
m.run()
tvm_out = m.get_output(0, tvm.nd.empty(out_shape, dtype)).numpy()

The output is dumped to a temporary folder under /tmp, or to the folder specified when the runtime was created.
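To check what a run actually wrote, a small helper like the following can walk the dump folder and list every file with its size. This is only a sketch: `list_dump_files` is a hypothetical helper, and the demo builds a throwaway directory with made-up file names rather than assuming the names the debugger uses.

```python
import os
import tempfile


def list_dump_files(dump_root):
    """Return sorted (relative path, size in bytes) pairs for all files under dump_root."""
    found = []
    for folder, _dirs, files in os.walk(dump_root):
        for name in files:
            path = os.path.join(folder, name)
            found.append((os.path.relpath(path, dump_root), os.path.getsize(path)))
    return sorted(found)


# Demo on a temporary directory; "run0" and "graph.json" are placeholder names.
with tempfile.TemporaryDirectory() as root:
    sub = os.path.join(root, "run0")
    os.makedirs(sub)
    with open(os.path.join(sub, "graph.json"), "w") as f:
        f.write("{}")
    listing = list_dump_files(root)
    print(listing)
```

In practice you would call list_dump_files("/tmp/tvmdbg"), or pass whatever dump_root was given when the runtime was created.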

Sample output

Below is the output of a debugger run:

Node Name               Ops                                                                  Time(us)   Time(%)  Start Time       End Time         Shape                Inputs  Outputs
---------               ---                                                                  --------   -------  ----------       --------         -----                ------  -------
1_NCHW1c                fuse___layout_transform___4                                          56.52      0.02     15:24:44.177475  15:24:44.177534  (1, 1, 224, 224)     1       1
_contrib_conv2d_nchwc0  fuse__contrib_conv2d_NCHWc                                           12436.11   3.4      15:24:44.177549  15:24:44.189993  (1, 1, 224, 224, 1)  2       1
relu0_NCHW8c            fuse___layout_transform___broadcast_add_relu___layout_transform__    4375.43    1.2      15:24:44.190027  15:24:44.194410  (8, 1, 5, 5, 1, 8)   2       1
_contrib_conv2d_nchwc1  fuse__contrib_conv2d_NCHWc_1                                         213108.6   58.28    15:24:44.194440  15:24:44.407558  (1, 8, 224, 224, 8)  2       1
relu1_NCHW8c            fuse___layout_transform___broadcast_add_relu___layout_transform__    2265.57    0.62     15:24:44.407600  15:24:44.409874  (64, 1, 1)           2       1
_contrib_conv2d_nchwc2  fuse__contrib_conv2d_NCHWc_2                                         104623.15  28.61    15:24:44.409905  15:24:44.514535  (1, 8, 224, 224, 8)  2       1
relu2_NCHW2c            fuse___layout_transform___broadcast_add_relu___layout_transform___1  2004.77    0.55     15:24:44.514567  15:24:44.516582  (8, 8, 3, 3, 8, 8)   2       1
_contrib_conv2d_nchwc3  fuse__contrib_conv2d_NCHWc_3                                         25218.4    6.9      15:24:44.516628  15:24:44.541856  (1, 8, 224, 224, 8)  2       1
reshape1                fuse___layout_transform___broadcast_add_reshape_transpose_reshape    1554.25    0.43     15:24:44.541893  15:24:44.543452  (64, 1, 1)           2       1
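If the table has been captured as text, the most expensive nodes can be picked out with a short script. The sketch below assumes that columns are separated by two or more spaces, which holds for the output shown here; `parse_profile` is a hypothetical helper, and only three of the rows above are embedded to keep the example short.

```python
import re

# Three rows copied from the table above, with the column padding shortened.
PROFILE_TEXT = """\
Node Name               Ops                           Time(us)  Time(%)  Start Time       End Time         Shape                Inputs  Outputs
---------               ---                           --------  -------  ----------       --------         -----                ------  -------
1_NCHW1c                fuse___layout_transform___4   56.52     0.02     15:24:44.177475  15:24:44.177534  (1, 1, 224, 224)     1       1
_contrib_conv2d_nchwc1  fuse__contrib_conv2d_NCHWc_1  213108.6  58.28    15:24:44.194440  15:24:44.407558  (1, 8, 224, 224, 8)  2       1
_contrib_conv2d_nchwc2  fuse__contrib_conv2d_NCHWc_2  104623.15  28.61   15:24:44.409905  15:24:44.514535  (1, 8, 224, 224, 8)  2       1
"""


def parse_profile(text):
    """Parse the debugger's text table into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Assumption: columns are separated by runs of at least two spaces.
    header = re.split(r"\s{2,}", lines[0].strip())
    rows = []
    for line in lines[2:]:  # skip the header and the dashed separator
        fields = re.split(r"\s{2,}", line.strip())
        row = dict(zip(header, fields))
        row["Time(us)"] = float(row["Time(us)"])
        row["Time(%)"] = float(row["Time(%)"])
        rows.append(row)
    return rows


rows = parse_profile(PROFILE_TEXT)
slowest = max(rows, key=lambda r: r["Time(us)"])
print(slowest["Node Name"], slowest["Ops"], slowest["Time(us)"])
```

Sorting rows by "Time(us)" instead of taking the maximum gives a ranked list of the nodes worth optimizing first.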
