tvm tutorials notes (0)

Personal notes from studying TVM.

Under the tutorials directory there is a get_started directory containing very basic usage tutorials; as the name says, it is for getting started.

The comments in the code already explain most of it clearly, so these notes mainly record and expand on them.

relay_quick_start

The full source is below. The original code uses resnet18; I replaced it with mobilenet, since I am running on CPU and MobileNet is faster there. I also changed the code to run on the CPU instead of the GPU.

The goal of this code is to load a mobilenet network model, apply computation-graph optimizations to it, obtain the optimized graph structure, and run the optimized graph.

The code is simple; it can be read even without the comments.

import numpy as np

from tvm import relay
from tvm.relay import testing
import tvm
from tvm import te
from tvm.contrib import graph_executor
import tvm.testing

# Build the network
batch_size = 1
num_class = 10
image_shape = (3, 224, 224)
data_shape = (batch_size,) + image_shape
out_shape = (batch_size, num_class)

mod, params = relay.testing.mobilenet.get_workload(
    batch_size=batch_size, image_shape=image_shape
)

# set show_meta_data=True if you want to show meta data
# Print the network information
print(mod.astext(show_meta_data=False))

# Compile the computation graph constructed above and run the optimizations; opt_level 3, target "llvm" means run on the CPU
opt_level = 3
target = "llvm"
target_host = "llvm"
with tvm.transform.PassContext(opt_level=opt_level):
    # lib = relay.build(mod, target, params=params)
    lib = relay.build(mod, target=target, target_host=target_host, params=params)
    
# Run the compiled (optimized) network and feed it a random input
# Run the generated library
# create random input
dev = tvm.cpu(0)
data = np.random.uniform(-1, 1, size=data_shape).astype("float32")
# create module
module = graph_executor.GraphModule(lib["default"](dev))
# set input and parameters
module.set_input("data", data)
# run
module.run()
# get output
out = module.get_output(0, tvm.nd.empty(out_shape)).numpy()

# Print first 10 elements of output
print(out.flatten()[0:10])

# Save the compiled network model
# Save and Load Compiled Module
# -----------------------------
# We can also save the graph, lib and parameters into files and load them
# back in deploy environment.

####################################################

# save the graph, lib and params into separate files
from tvm.contrib import utils

temp = utils.tempdir()
path_lib = temp.relpath("deploy_lib.tar")
lib.export_library(path_lib)
print(temp.listdir())

# load the module back.
# Load the network and run it
loaded_lib = tvm.runtime.load_module(path_lib)
input_data = tvm.nd.array(data)

module = graph_executor.GraphModule(loaded_lib["default"](dev))
module.run(data=input_data)
out_deploy = module.get_output(0).numpy()

# Print first 10 elements of output
print(out_deploy.flatten()[0:10])

# check whether the output from deployed module is consistent with original one
tvm.testing.assert_allclose(out_deploy, out, atol=1e-5)

Importing the neural network graph

In this example, the imported network model is one of the models TVM provides under the testing directory, at
python/tvm/relay/testing. The provided networks are:

dcgan.py # DCGAN, a generative adversarial network
dqn.py   # DQN for reinforcement learning
layers.py  # conv2d/3d and bn operators, covering different data layouts
mlp.py  # multi-layer perceptron
nat.py  # Peano arithmetic (not sure; probably a recursive definition?)
resnet.py  # resnet
synthetic.py # a synthetic network
yolo_detection.py # yolo
darknet.py  # darknet
densenet.py  # densenet
inception_v3.py  # inception
init.py  # parameter initialization, e.g. Xavier
lstm.py  # lstm
mobilenet.py  # mobilenet
resnet_3d.py  # resnet3d
squeezenet.py  # squeezenet
vgg.py  # vgg

As you can see, the set of provided demos is fairly rich. Some of these files are not network models; skip them for now.

Every network model provides two interfaces: get_net (or some other function that returns the assembled graph) and get_workload. Calling get_workload with the right arguments returns the network model. Take mobilenet as an example.

The get_workload function implemented in mobilenet.py is shown below. Much like in PyTorch, it calls mobile_net, a function also implemented in mobilenet.py that builds the MobileNet graph, and finally returns create_workload.

def get_workload(
    batch_size=1, num_classes=10, image_shape=(3, 224, 224), dtype="float32", layout="NCHW"
):
    data_shape = tuple([batch_size] + list(image_shape))
    net = mobile_net(
        num_classes=num_classes,
        data_shape=data_shape,
        dtype=dtype,
        alpha=1.0,
        is_shallow=False,
        layout=layout,
    )
    return create_workload(net)
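
As a quick usage sketch (the argument values here are just the defaults from the signature above):

from tvm.relay import testing

# a minimal sketch; alpha and is_shallow are fixed inside get_workload itself
mod, params = testing.mobilenet.get_workload(
    batch_size=1, num_classes=10, image_shape=(3, 224, 224), dtype="float32", layout="NCHW"
)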

Next, how is mobile_net written? In PyTorch you build network layers with calls like torch.nn.conv2d; how do you build layers using the interfaces TVM implements? The mobile_net function looks like this:

def mobile_net(
    num_classes=1000,
    data_shape=(1, 3, 224, 224),
    dtype="float32",
    alpha=1.0,
    is_shallow=False,
    layout="NCHW",
):
    """Function to construct a MobileNet"""
    data = relay.var("data", shape=data_shape, dtype=dtype)
    body = conv_block(data, "conv_block_1", int(32 * alpha), strides=(2, 2), layout=layout)
    body = separable_conv_block(
        body, "separable_conv_block_1", int(32 * alpha), int(64 * alpha), layout=layout, dtype=dtype
    )
    body = separable_conv_block(
        body,
        "separable_conv_block_2",
        int(64 * alpha),
        int(128 * alpha),
        downsample=True,
        layout=layout,
        dtype=dtype,
    )
    body = separable_conv_block(
        body,
        "separable_conv_block_3",
        int(128 * alpha),
        int(128 * alpha),
        layout=layout,
        dtype=dtype,
    )
    body = separable_conv_block(
        body,
        "separable_conv_block_4",
        int(128 * alpha),
        int(256 * alpha),
        downsample=True,
        layout=layout,
        dtype=dtype,
    )
    body = separable_conv_block(
        body,
        "separable_conv_block_5",
        int(256 * alpha),
        int(256 * alpha),
        layout=layout,
        dtype=dtype,
    )
    body = separable_conv_block(
        body,
        "separable_conv_block_6",
        int(256 * alpha),
        int(512 * alpha),
        downsample=True,
        layout=layout,
        dtype=dtype,
    )
    if is_shallow:
        body = separable_conv_block(
            body,
            "separable_conv_block_7",
            int(512 * alpha),
            int(1024 * alpha),
            downsample=True,
            layout=layout,
            dtype=dtype,
        )
        body = separable_conv_block(
            body,
            "separable_conv_block_8",
            int(1024 * alpha),
            int(1024 * alpha),
            downsample=True,
            layout=layout,
            dtype=dtype,
        )
    else:
        for i in range(7, 12):
            body = separable_conv_block(
                body,
                "separable_conv_block_%d" % i,
                int(512 * alpha),
                int(512 * alpha),
                layout=layout,
                dtype=dtype,
            )
        body = separable_conv_block(
            body,
            "separable_conv_block_12",
            int(512 * alpha),
            int(1024 * alpha),
            downsample=True,
            layout=layout,
            dtype=dtype,
        )
        body = separable_conv_block(
            body,
            "separable_conv_block_13",
            int(1024 * alpha),
            int(1024 * alpha),
            layout=layout,
            dtype=dtype,
        )
    pool = relay.nn.global_avg_pool2d(data=body, layout=layout)
    flatten = relay.nn.batch_flatten(data=pool)
    weight = relay.var("fc_weight")
    bias = relay.var("fc_bias")
    fc = relay.nn.dense(data=flatten, weight=weight, units=num_classes)
    fc = relay.nn.bias_add(fc, bias)
    softmax = relay.nn.softmax(data=fc)
    return relay.Function(relay.analysis.free_vars(softmax), softmax)

The code structure looks a lot like building a network in PyTorch. First, note the two pre-packaged helper functions, conv_block and separable_conv_block, both of which live in layers.py. Besides those two, you will clearly see relay.nn.xxxx calls, analogous to torch.nn.xxxx, used to add operators.
Besides adding operators to build neural-network graphs, relay has many other uses; see the official relay documentation for details.
The implementation of conv_block is as follows:

def conv_block(
    data,
    name,
    channels,
    kernel_size=(3, 3),
    strides=(1, 1),
    padding=(1, 1),
    epsilon=1e-5,
    layout="NCHW",
):
    """Helper function to construct conv_bn-relu"""
    # convolution + bn + relu
    conv = layers.conv2d(
        data=data,
        channels=channels,
        kernel_size=kernel_size,
        strides=strides,
        padding=padding,
        data_layout=layout,
        kernel_layout=layers.conv_kernel_layout(layout),
        name=name + "_conv",
    )
    bn = layers.batch_norm_infer(data=conv, epsilon=epsilon, name=name + "_bn")
    act = relay.nn.relu(data=bn)
    return act

A conv_block is the combination of conv2d + bn + relu. The conv and bn wrappers are implemented in layers.py; jumping into layers.py, conv2d and batch_norm_infer look like this:

def conv2d(data, weight=None, **kwargs):
    """Wrapper of conv2d which automatically creates weights if not given.

    Parameters
    ----------
    data : relay.Expr
        The input expression.

    weight : relay.Expr
        The weight to conv2d.

    kwargs : dict
        Additional arguments.

    Returns
    -------
    result : relay.Expr
        The result.
    """
    name = kwargs.get("name")
    kwargs.pop("name")
    if not weight:
        weight = relay.var(name + "_weight")
    return relay.nn.conv2d(data, weight, **kwargs)

def batch_norm_infer(data, gamma=None, beta=None, moving_mean=None, moving_var=None, **kwargs):
    """Wrapper of batch_norm.

    This function automatically creates weights and return
    the first output(normalized result).

    Parameters
    ----------
    data : relay.Expr
        The input expression.

    gamma : relay.Expr
        The gamma scale factor.

    beta : relay.Expr
        The beta offset factor.

    moving_mean : relay.Expr
        Running mean of input,

    moving_var : relay.Expr
        Running variance of input.

    kwargs : dict
        Additional arguments.

    Returns
    -------
    result : relay.Expr
        The result.
    """
    name = kwargs.get("name")
    kwargs.pop("name")
    if not gamma:
        gamma = relay.var(name + "_gamma")
    if not beta:
        beta = relay.var(name + "_beta")
    if not moving_mean:
        moving_mean = relay.var(name + "_moving_mean")
    if not moving_var:
        moving_var = relay.var(name + "_moving_var")
    return relay.nn.batch_norm(
        data, gamma=gamma, beta=beta, moving_mean=moving_mean, moving_var=moving_var, **kwargs
    )[0]

relay.var is similar to a placeholder in TensorFlow: it declares data to be fed in.
relay.nn.global_avg_pool2d creates a global average pooling operator.
relay.nn.batch_flatten flattens the input over every dimension except the batch dimension.
relay.nn.dense is a matrix multiplication, i.e. Y = XW^T.
relay.nn.bias_add adds a bias along a given axis.
relay.nn.softmax is softmax.
relay.nn.batch_norm is a BN layer.
relay.nn.conv2d is conv2d.
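
Here is a minimal sketch that wires several of these ops into the same classifier tail mobile_net uses; the shapes are made up for illustration:

from tvm import relay

data = relay.var("data", shape=(1, 8, 7, 7), dtype="float32")  # graph input, like a placeholder
pool = relay.nn.global_avg_pool2d(data)             # (1, 8, 7, 7) -> (1, 8, 1, 1)
flat = relay.nn.batch_flatten(pool)                 # -> (1, 8)
w = relay.var("fc_weight", shape=(10, 8))
b = relay.var("fc_bias", shape=(10,))
fc = relay.nn.bias_add(relay.nn.dense(flat, w), b)  # Y = X W^T, then add bias on axis 1
out = relay.nn.softmax(fc)
func = relay.Function(relay.analysis.free_vars(out), out)
print(func)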

The final statement is

return relay.Function(relay.analysis.free_vars(softmax), softmax)

What is returned is a relay.Function whose parameter list comes from relay.analysis.free_vars(softmax) and whose body is softmax.
The explanation of relay.analysis.free_vars is:

Get free Vars from expression expr in Post DFS order.

That is, it collects the free vars from the expression with a post-order DFS traversal. My reading is that it uses post-order DFS to find every relay.var, i.e. every entry point in the graph where data must be supplied from outside, such as the weights and the input data.
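
A tiny experiment to check that reading, using a made-up two-input expression:

from tvm import relay

x = relay.var("x", shape=(1, 10))
w = relay.var("w", shape=(10, 10))
y = relay.nn.relu(relay.nn.dense(x, w))
# free_vars walks y in post-DFS order and returns the unbound Vars,
# i.e. every input the graph still expects from outside
print(relay.analysis.free_vars(y))  # [x, w]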

The explanation of relay.analysis is:

The Relay IR namespace containing the analysis passes.

My understanding: analysis here means the same thing as static analysis in compilers; it provides a set of tools for analyzing the IR.

What about relay.Function, and why return that type at the end?
The official documentation explains:

Function(params, body[, ret_type, …]) A function declaration expression.

So it is a function-declaration expression class. Why return this type? Not entirely clear to me yet; presumably a Function is used to represent the set formed by a group of ops.
Let's take a quick look at the parameter types this Function accepts and the interfaces it implements in the source.
In function.py:

@tvm._ffi.register_object("relay.Function")
class Function(BaseFunc):
    """A function declaration expression.

    Parameters
    ----------
    params: List[tvm.relay.Var]
        List of input parameters to the function.

    body: tvm.relay.Expr
        The body of the function.

    ret_type: Optional[tvm.relay.Type]
        The return type annotation of the function.

    type_params: Optional[List[tvm.relay.TypeParam]]
        The additional type parameters, this is only
        used in advanced usecase of template functions.
    """

    def __init__(self, params, body, ret_type=None, type_params=None, attrs=None):
        if type_params is None:
            type_params = convert([])

        self.__init_handle_by_constructor__(
            _ffi_api.Function, params, body, ret_type, type_params, attrs
        )

    def __call__(self, *args):
        """Invoke the global function.

        Parameters
        ----------
        args: List[relay.Expr]
            Arguments.
        """
        return Call(self, args, None, None)

You can see the two main parameters, params and body, whose types are List[tvm.relay.Var] and tvm.relay.Expr respectively.
Now go back to the return value of the mobilenet model:

return relay.Function(relay.analysis.free_vars(softmax), softmax)

You can think of it this way: the whole neural network is a graph, and this graph gets wrapped into a Function. The graph has entry nodes and an output node; mapped onto the function, the entry nodes are the vars, and the function body (the body parameter) is the neural network graph.
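
A small sketch of that correspondence on a toy graph: params holds the entry nodes, body holds the graph.

from tvm import relay

x = relay.var("x", shape=(1, 4))
body = relay.nn.relu(x)
f = relay.Function(relay.analysis.free_vars(body), body)
print(f.params)  # [x]         -- the entry nodes (graph inputs)
print(f.body)    # nn.relu(%x) -- the graph itself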

The second argument of the returned Function is softmax, the last node of the network model and also its output node. From this last node it should be possible to traverse the whole graph (the post-order DFS mentioned earlier). I have not read the relay expression source yet, so let's simply print softmax and see whether it dumps the whole network graph structure:

free_var %data: Tensor[(1, 3, 224, 224), float32];
free_var %conv_block_1_conv_weight;
%0 = nn.conv2d(%data, %conv_block_1_conv_weight, strides=[2, 2], padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3]);
free_var %conv_block_1_bn_gamma;
free_var %conv_block_1_bn_beta;
free_var %conv_block_1_bn_moving_mean;
free_var %conv_block_1_bn_moving_var;
%1 = nn.batch_norm(%0, %conv_block_1_bn_gamma, %conv_block_1_bn_beta, %conv_block_1_bn_moving_mean, %conv_block_1_bn_moving_var);
%2 = %1.0;
%3 = nn.relu(%2);
free_var %separable_conv_block_1_weight: Tensor[(32, 1, 3, 3), float32];
%4 = nn.conv2d(%3, %separable_conv_block_1_weight, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3]);
free_var %separable_conv_block_1_bn1_gamma;
free_var %separable_conv_block_1_bn1_beta;
free_var %separable_conv_block_1_bn1_moving_mean;
free_var %separable_conv_block_1_bn1_moving_var;
%5 = nn.batch_norm(%4, %separable_conv_block_1_bn1_gamma, %separable_conv_block_1_bn1_beta, %separable_conv_block_1_bn1_moving_mean, %separable_conv_block_1_bn1_moving_var);
%6 = %5.0;
%7 = nn.relu(%6);
free_var %separable_conv_block_1_conv2_weight;
%8 = nn.conv2d(%7, %separable_conv_block_1_conv2_weight, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1]);
free_var %separable_conv_block_1_bn2_gamma;
free_var %separable_conv_block_1_bn2_beta;
free_var %separable_conv_block_1_bn2_moving_mean;
free_var %separable_conv_block_1_bn2_moving_var;
%9 = nn.batch_norm(%8, %separable_conv_block_1_bn2_gamma, %separable_conv_block_1_bn2_beta, %separable_conv_block_1_bn2_moving_mean, %separable_conv_block_1_bn2_moving_var);
%10 = %9.0;
.........
(a large chunk omitted)
.........
free_var %fc_weight;
%110 = nn.dense(%109, %fc_weight, units=10);
free_var %fc_bias;
%111 = nn.bias_add(%110, %fc_bias);
nn.softmax(%111)

Printing just softmax yields this long dump, so it must be possible to reach every connected node in the network from the last node (guessing from the use-def chain facility in LLVM that something similar exists here; guessing is also a way of reading code, ha).

Also, note the many nodes tagged free_var above: these are the inputs that must be supplied to the network graph.
Print the Function and see:

fn (%data: Tensor[(1, 3, 224, 224), float32], %conv_block_1_conv_weight, %conv_block_1_bn_gamma, %conv_block_1_bn_beta, %conv_block_1_bn_moving_mean, %conv_block_1_bn_moving_var, %separable_conv_block_1_weight: Tensor[(32, 1, 3, 3), float32], %separable_conv_block_1_bn1_gamma, %separable_conv_block_1_bn1_beta, %separable_conv_block_1_bn1_moving_mean, %separable_conv_block_1_bn1_moving_var, %separable_conv_block_1_conv2_weight, %separable_conv_block_1_bn2_gamma, %separable_conv_block_1_bn2_beta, %separable_conv_block_1_bn2_moving_mean, %separable_conv_block_1_bn2_moving_var, %separable_conv_block_2_weight: Tensor[(64, 1, 3, 3), float32], %separable_conv_block_2_bn1_gamma, %separable_conv_block_2_bn1_beta, %separable_conv_block_2_bn1_moving_mean, %separable_conv_block_2_bn1_moving_var, %separable_conv_block_2_conv2_weight, %separable_conv_block_2_bn2_gamma, %separable_conv_block_2_bn2_beta, %separable_conv_block_2_bn2_moving_mean, %separable_conv_block_2_bn2_moving_var, %separable_conv_block_3_weight: Tensor[(128, 1, 3, 3), float32], %separable_conv_block_3_bn1_gamma, %separable_conv_block_3_bn1_beta, %separable_conv_block_3_bn1_moving_mean, %separable_conv_block_3_bn1_moving_var, %separable_conv_block_3_conv2_weight, %separable_conv_block_3_bn2_gamma, %separable_conv_block_3_bn2_beta, %separable_conv_block_3_bn2_moving_mean, %separable_conv_block_3_bn2_moving_var, %separable_conv_block_4_weight: Tensor[(128, 1, 3, 3), float32], %separable_conv_block_4_bn1_gamma, %separable_conv_block_4_bn1_beta, %separable_conv_block_4_bn1_moving_mean, %separable_conv_block_4_bn1_moving_var, %separable_conv_block_4_conv2_weight, %separable_conv_block_4_bn2_gamma, %separable_conv_block_4_bn2_beta, %separable_conv_block_4_bn2_moving_mean, %separable_conv_block_4_bn2_moving_var, %separable_conv_block_5_weight: Tensor[(256, 1, 3, 3), float32], %separable_conv_block_5_bn1_gamma, %separable_conv_block_5_bn1_beta, %separable_conv_block_5_bn1_moving_mean, %separable_conv_block_5_bn1_moving_var, %separable_conv_block_5_conv2_weight, %separable_conv_block_5_bn2_gamma, %separable_conv_block_5_bn2_beta, %separable_conv_block_5_bn2_moving_mean, %separable_conv_block_5_bn2_moving_var, %separable_conv_block_6_weight: Tensor[(256, 1, 3, 3), float32], %separable_conv_block_6_bn1_gamma, %separable_conv_block_6_bn1_beta, %separable_conv_block_6_bn1_moving_mean, %separable_conv_block_6_bn1_moving_var, %separable_conv_block_6_conv2_weight, %separable_conv_block_6_bn2_gamma, %separable_conv_block_6_bn2_beta, %separable_conv_block_6_bn2_moving_mean, %separable_conv_block_6_bn2_moving_var, %separable_conv_block_7_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_7_bn1_gamma, %separable_conv_block_7_bn1_beta, %separable_conv_block_7_bn1_moving_mean, %separable_conv_block_7_bn1_moving_var, %separable_conv_block_7_conv2_weight, %separable_conv_block_7_bn2_gamma, %separable_conv_block_7_bn2_beta, %separable_conv_block_7_bn2_moving_mean, %separable_conv_block_7_bn2_moving_var, %separable_conv_block_8_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_8_bn1_gamma, %separable_conv_block_8_bn1_beta, %separable_conv_block_8_bn1_moving_mean, %separable_conv_block_8_bn1_moving_var, %separable_conv_block_8_conv2_weight, %separable_conv_block_8_bn2_gamma, %separable_conv_block_8_bn2_beta, %separable_conv_block_8_bn2_moving_mean, %separable_conv_block_8_bn2_moving_var, %separable_conv_block_9_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_9_bn1_gamma, %separable_conv_block_9_bn1_beta, 
%separable_conv_block_9_bn1_moving_mean, %separable_conv_block_9_bn1_moving_var, %separable_conv_block_9_conv2_weight, %separable_conv_block_9_bn2_gamma, %separable_conv_block_9_bn2_beta, %separable_conv_block_9_bn2_moving_mean, %separable_conv_block_9_bn2_moving_var, %separable_conv_block_10_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_10_bn1_gamma, %separable_conv_block_10_bn1_beta, %separable_conv_block_10_bn1_moving_mean, %separable_conv_block_10_bn1_moving_var, %separable_conv_block_10_conv2_weight, %separable_conv_block_10_bn2_gamma, %separable_conv_block_10_bn2_beta, %separable_conv_block_10_bn2_moving_mean, %separable_conv_block_10_bn2_moving_var, %separable_conv_block_11_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_11_bn1_gamma, %separable_conv_block_11_bn1_beta, %separable_conv_block_11_bn1_moving_mean, %separable_conv_block_11_bn1_moving_var, %separable_conv_block_11_conv2_weight, %separable_conv_block_11_bn2_gamma, %separable_conv_block_11_bn2_beta, %separable_conv_block_11_bn2_moving_mean, %separable_conv_block_11_bn2_moving_var, %separable_conv_block_12_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_12_bn1_gamma, %separable_conv_block_12_bn1_beta, %separable_conv_block_12_bn1_moving_mean, %separable_conv_block_12_bn1_moving_var, %separable_conv_block_12_conv2_weight, %separable_conv_block_12_bn2_gamma, %separable_conv_block_12_bn2_beta, %separable_conv_block_12_bn2_moving_mean, %separable_conv_block_12_bn2_moving_var, %separable_conv_block_13_weight: Tensor[(1024, 1, 3, 3), float32], %separable_conv_block_13_bn1_gamma, %separable_conv_block_13_bn1_beta, %separable_conv_block_13_bn1_moving_mean, %separable_conv_block_13_bn1_moving_var, %separable_conv_block_13_conv2_weight, %separable_conv_block_13_bn2_gamma, %separable_conv_block_13_bn2_beta, %separable_conv_block_13_bn2_moving_mean, %separable_conv_block_13_bn2_moving_var, %fc_weight, %fc_bias) {
  %0 = nn.conv2d(%data, %conv_block_1_conv_weight, strides=[2, 2], padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3]);
  %1 = nn.batch_norm(%0, %conv_block_1_bn_gamma, %conv_block_1_bn_beta, %conv_block_1_bn_moving_mean, %conv_block_1_bn_moving_var);
  %2 = %1.0;
  %3 = nn.relu(%2);
  %4 = nn.conv2d(%3, %separable_conv_block_1_weight, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3]);
  %5 = nn.batch_norm(%4, %separable_conv_block_1_bn1_gamma, %separable_conv_block_1_bn1_beta, %separable_conv_block_1_bn1_moving_mean, %separable_conv_block_1_bn1_moving_var);
  %6 = %5.0;
  %7 = nn.relu(%6);
  %8 = nn.conv2d(%7, %separable_conv_block_1_conv2_weight, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1]);
  %9 = nn.batch_norm(%8, %separable_conv_block_1_bn2_gamma, %separable_conv_block_1_bn2_beta, %separable_conv_block_1_bn2_moving_mean, %separable_conv_block_1_bn2_moving_var);
  %10 = %9.0;
  %11 = nn.relu(%10);
  %12 = nn.conv2d(%11, %separable_conv_block_2_weight, strides=[2, 2], padding=[1, 1, 1, 1], groups=64, channels=64, kernel_size=[3, 3]);
  %13 = nn.batch_norm(%12, %separable_conv_block_2_bn1_gamma, %separable_conv_block_2_bn1_beta, %separable_conv_block_2_bn1_moving_mean, %separable_conv_block_2_bn1_moving_var);
  %14 = %13.0;
 	........
(a large chunk omitted)
 	........
  %107 = nn.relu(%106);
  %108 = nn.global_avg_pool2d(%107);
  %109 = nn.batch_flatten(%108);
  %110 = nn.dense(%109, %fc_weight, units=10);
  %111 = nn.bias_add(%110, %fc_bias);
  nn.softmax(%111)
}

Much as guessed above: Function converts a combination of ops in expression form into a function, and the function's textual form is a kind of IR (thinking in compiler terms, expressions are an IR too; the two IRs just differ in expressive power). Everything the network model needs as input (input data and parameters) appears as function parameters of fn.

The returned relay.Function is then passed to create_workload inside get_workload. create_workload is defined in init.py as follows:

 def create_workload(net, initializer=None, seed=0):
     """Helper function to create benchmark image classification workload.

     Parameters
     ----------
     net : tvm.relay.Function
         The selected function of the network.

     initializer : Initializer
         The initializer used

     seed : int
         The seed used in initialization.

     Returns
     -------
     mod : tvm.IRModule
         The created relay module.

     params : dict of str to NDArray
         The parameters.
     """
     mod = tvm.IRModule.from_expr(net)
     mod = relay.transform.InferType()(mod)
     shape_dict = {v.name_hint: v.checked_type for v in mod["main"].params}
     np.random.seed(seed)
     initializer = initializer if initializer else Xavier()
     params = {}
     for k, v in shape_dict.items():
         if k == "data":
             continue
         init_value = np.zeros(v.concrete_shape).astype(v.dtype)
         initializer(k, init_value)
         params[k] = tvm.nd.array(init_value, device=tvm.cpu(0))
     return mod, params

At first glance, this function initializes the parameters of the network model; after all, its interface takes two things: the graph structure net and an initializer.

The network graph is passed in as a Function. The first call is tvm.IRModule.from_expr(net); what does this function do?

First, what is an IRModule? The official documentation explains:

IRModule that holds functions and type definitions.
IRModule is the basic unit for all IR transformations across the stack.

So the IRModule holds all functions and type definitions, and is the basic unit for all IR transformations across the stack.
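
Before printing the mobilenet module, here is the wrapping step itself on a toy function (a minimal sketch):

import tvm
from tvm import relay

x = relay.var("x", shape=(1, 4))
net = relay.Function([x], relay.nn.relu(x))
mod = tvm.IRModule.from_expr(net)  # from_expr registers the Function under the global name "main"
print(mod["main"])                 # IRModule implements __getitem__, so string lookup works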

Now print what IRModule returns for the mobilenet graph and see what it looks like:

def @main(%data: Tensor[(1, 3, 224, 224), float32], %conv_block_1_conv_weight, %conv_block_1_bn_gamma, %conv_block_1_bn_beta, %conv_block_1_bn_moving_mean, %conv_block_1_bn_moving_var, %separable_conv_block_1_weight: Tensor[(32, 1, 3, 3), float32], %separable_conv_block_1_bn1_gamma, %separable_conv_block_1_bn1_beta, %separable_conv_block_1_bn1_moving_mean, %separable_conv_block_1_bn1_moving_var, %separable_conv_block_1_conv2_weight, %separable_conv_block_1_bn2_gamma, %separable_conv_block_1_bn2_beta, %separable_conv_block_1_bn2_moving_mean, %separable_conv_block_1_bn2_moving_var, %separable_conv_block_2_weight: Tensor[(64, 1, 3, 3), float32], %separable_conv_block_2_bn1_gamma, %separable_conv_block_2_bn1_beta, %separable_conv_block_2_bn1_moving_mean, %separable_conv_block_2_bn1_moving_var, %separable_conv_block_2_conv2_weight, %separable_conv_block_2_bn2_gamma, %separable_conv_block_2_bn2_beta, %separable_conv_block_2_bn2_moving_mean, %separable_conv_block_2_bn2_moving_var, %separable_conv_block_3_weight: Tensor[(128, 1, 3, 3), float32], %separable_conv_block_3_bn1_gamma, %separable_conv_block_3_bn1_beta, %separable_conv_block_3_bn1_moving_mean, %separable_conv_block_3_bn1_moving_var, %separable_conv_block_3_conv2_weight, %separable_conv_block_3_bn2_gamma, %separable_conv_block_3_bn2_beta, %separable_conv_block_3_bn2_moving_mean, %separable_conv_block_3_bn2_moving_var, %separable_conv_block_4_weight: Tensor[(128, 1, 3, 3), float32], %separable_conv_block_4_bn1_gamma, %separable_conv_block_4_bn1_beta, %separable_conv_block_4_bn1_moving_mean, %separable_conv_block_4_bn1_moving_var, %separable_conv_block_4_conv2_weight, %separable_conv_block_4_bn2_gamma, %separable_conv_block_4_bn2_beta, %separable_conv_block_4_bn2_moving_mean, %separable_conv_block_4_bn2_moving_var, %separable_conv_block_5_weight: Tensor[(256, 1, 3, 3), float32], %separable_conv_block_5_bn1_gamma, %separable_conv_block_5_bn1_beta, %separable_conv_block_5_bn1_moving_mean, %separable_conv_block_5_bn1_moving_var, %separable_conv_block_5_conv2_weight, %separable_conv_block_5_bn2_gamma, %separable_conv_block_5_bn2_beta, %separable_conv_block_5_bn2_moving_mean, %separable_conv_block_5_bn2_moving_var, %separable_conv_block_6_weight: Tensor[(256, 1, 3, 3), float32], %separable_conv_block_6_bn1_gamma, %separable_conv_block_6_bn1_beta, %separable_conv_block_6_bn1_moving_mean, %separable_conv_block_6_bn1_moving_var, %separable_conv_block_6_conv2_weight, %separable_conv_block_6_bn2_gamma, %separable_conv_block_6_bn2_beta, %separable_conv_block_6_bn2_moving_mean, %separable_conv_block_6_bn2_moving_var, %separable_conv_block_7_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_7_bn1_gamma, %separable_conv_block_7_bn1_beta, %separable_conv_block_7_bn1_moving_mean, %separable_conv_block_7_bn1_moving_var, %separable_conv_block_7_conv2_weight, %separable_conv_block_7_bn2_gamma, %separable_conv_block_7_bn2_beta, %separable_conv_block_7_bn2_moving_mean, %separable_conv_block_7_bn2_moving_var, %separable_conv_block_8_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_8_bn1_gamma, %separable_conv_block_8_bn1_beta, %separable_conv_block_8_bn1_moving_mean, %separable_conv_block_8_bn1_moving_var, %separable_conv_block_8_conv2_weight, %separable_conv_block_8_bn2_gamma, %separable_conv_block_8_bn2_beta, %separable_conv_block_8_bn2_moving_mean, %separable_conv_block_8_bn2_moving_var, %separable_conv_block_9_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_9_bn1_gamma, %separable_conv_block_9_bn1_beta, 
%separable_conv_block_9_bn1_moving_mean, %separable_conv_block_9_bn1_moving_var, %separable_conv_block_9_conv2_weight, %separable_conv_block_9_bn2_gamma, %separable_conv_block_9_bn2_beta, %separable_conv_block_9_bn2_moving_mean, %separable_conv_block_9_bn2_moving_var, %separable_conv_block_10_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_10_bn1_gamma, %separable_conv_block_10_bn1_beta, %separable_conv_block_10_bn1_moving_mean, %separable_conv_block_10_bn1_moving_var, %separable_conv_block_10_conv2_weight, %separable_conv_block_10_bn2_gamma, %separable_conv_block_10_bn2_beta, %separable_conv_block_10_bn2_moving_mean, %separable_conv_block_10_bn2_moving_var, %separable_conv_block_11_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_11_bn1_gamma, %separable_conv_block_11_bn1_beta, %separable_conv_block_11_bn1_moving_mean, %separable_conv_block_11_bn1_moving_var, %separable_conv_block_11_conv2_weight, %separable_conv_block_11_bn2_gamma, %separable_conv_block_11_bn2_beta, %separable_conv_block_11_bn2_moving_mean, %separable_conv_block_11_bn2_moving_var, %separable_conv_block_12_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_12_bn1_gamma, %separable_conv_block_12_bn1_beta, %separable_conv_block_12_bn1_moving_mean, %separable_conv_block_12_bn1_moving_var, %separable_conv_block_12_conv2_weight, %separable_conv_block_12_bn2_gamma, %separable_conv_block_12_bn2_beta, %separable_conv_block_12_bn2_moving_mean, %separable_conv_block_12_bn2_moving_var, %separable_conv_block_13_weight: Tensor[(1024, 1, 3, 3), float32], %separable_conv_block_13_bn1_gamma, %separable_conv_block_13_bn1_beta, %separable_conv_block_13_bn1_moving_mean, %separable_conv_block_13_bn1_moving_var, %separable_conv_block_13_conv2_weight, %separable_conv_block_13_bn2_gamma, %separable_conv_block_13_bn2_beta, %separable_conv_block_13_bn2_moving_mean, %separable_conv_block_13_bn2_moving_var, %fc_weight, %fc_bias) {
  %0 = nn.conv2d(%data, %conv_block_1_conv_weight, strides=[2, 2], padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3]);
  %1 = nn.batch_norm(%0, %conv_block_1_bn_gamma, %conv_block_1_bn_beta, %conv_block_1_bn_moving_mean, %conv_block_1_bn_moving_var);
  %2 = %1.0;
  %3 = nn.relu(%2);
  %4 = nn.conv2d(%3, %separable_conv_block_1_weight, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3]);
  %5 = nn.batch_norm(%4, %separable_conv_block_1_bn1_gamma, %separable_conv_block_1_bn1_beta, %separable_conv_block_1_bn1_moving_mean, %separable_conv_block_1_bn1_moving_var);
  %6 = %5.0;
  %7 = nn.relu(%6);
  %8 = nn.conv2d(%7, %separable_conv_block_1_conv2_weight, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1]);
  %9 = nn.batch_norm(%8, %separable_conv_block_1_bn2_gamma, %separable_conv_block_1_bn2_beta, %separable_conv_block_1_bn2_moving_mean, %separable_conv_block_1_bn2_moving_var);
  %10 = %9.0;
  %11 = nn.relu(%10);
  %12 = nn.conv2d(%11, %separable_conv_block_2_weight, strides=[2, 2], padding=[1, 1, 1, 1], groups=64, channels=64, kernel_size=[3, 3]);
  %13 = nn.batch_norm(%12, %separable_conv_block_2_bn1_gamma, %separable_conv_block_2_bn1_beta, %separable_conv_block_2_bn1_moving_mean, %separable_conv_block_2_bn1_moving_var);
  %14 = %13.0;
..........
(a large chunk omitted)
...........
  %105 = nn.batch_norm(%104, %separable_conv_block_13_bn2_gamma, %separable_conv_block_13_bn2_beta, %separable_conv_block_13_bn2_moving_mean, %separable_conv_block_13_bn2_moving_var);
  %106 = %105.0;
  %107 = nn.relu(%106);
  %108 = nn.global_avg_pool2d(%107);
  %109 = nn.batch_flatten(%108);
  %110 = nn.dense(%109, %fc_weight, units=10);
  %111 = nn.bias_add(%110, %fc_bias);
  nn.softmax(%111)
}

There is only one obvious difference: the graph printed from the Function begins with fn on the first line, while after IRModule the first line is def @main.

The next call is relay.transform.InferType()(mod), which likewise returns IR. Judging from the name it performs type inference; the official documentation says:

Infer the type of an expr.

So it infers the type of an expression, and InferType() returns a tvm.transform.Pass: the IRModule graph obtained above is fed through this pass for type inference, and the result is naturally the IR updated by the pass. Print it and see the difference.
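
A minimal sketch of invoking the pass on the toy module from above (the full mobilenet printout follows below):

import tvm
from tvm import relay

x = relay.var("x", shape=(1, 4), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], relay.nn.relu(x)))
mod = relay.transform.InferType()(mod)  # InferType() constructs a Pass; calling it on the module
                                        # runs type inference and returns the updated module
print(mod)  # sub-expressions now carry annotations like /* ty=Tensor[(1, 4), float32] */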

def @main(%data: Tensor[(1, 3, 224, 224), float32], %conv_block_1_conv_weight: Tensor[(32, 3, 3, 3), float32], %conv_block_1_bn_gamma: Tensor[(32), float32], %conv_block_1_bn_beta: Tensor[(32), float32], %conv_block_1_bn_moving_mean: Tensor[(32), float32], %conv_block_1_bn_moving_var: Tensor[(32), float32], %separable_conv_block_1_weight: Tensor[(32, 1, 3, 3), float32], %separable_conv_block_1_bn1_gamma: Tensor[(32), float32], %separable_conv_block_1_bn1_beta: Tensor[(32), float32], %separable_conv_block_1_bn1_moving_mean: Tensor[(32), float32], %separable_conv_block_1_bn1_moving_var: Tensor[(32), float32], %separable_conv_block_1_conv2_weight: Tensor[(64, 32, 1, 1), float32], %separable_conv_block_1_bn2_gamma: Tensor[(64), float32], %separable_conv_block_1_bn2_beta: Tensor[(64), float32], %separable_conv_block_1_bn2_moving_mean: Tensor[(64), float32], %separable_conv_block_1_bn2_moving_var: Tensor[(64), float32], %separable_conv_block_2_weight: Tensor[(64, 1, 3, 3), float32], %separable_conv_block_2_bn1_gamma: Tensor[(64), float32], %separable_conv_block_2_bn1_beta: Tensor[(64), float32], %separable_conv_block_2_bn1_moving_mean: Tensor[(64), float32], %separable_conv_block_2_bn1_moving_var: Tensor[(64), float32], %separable_conv_block_2_conv2_weight: Tensor[(128, 64, 1, 1), float32], %separable_conv_block_2_bn2_gamma: Tensor[(128), float32], %separable_conv_block_2_bn2_beta: Tensor[(128), float32], %separable_conv_block_2_bn2_moving_mean: Tensor[(128), float32], %separable_conv_block_2_bn2_moving_var: Tensor[(128), float32], %separable_conv_block_3_weight: Tensor[(128, 1, 3, 3), float32], %separable_conv_block_3_bn1_gamma: Tensor[(128), float32], %separable_conv_block_3_bn1_beta: Tensor[(128), float32], %separable_conv_block_3_bn1_moving_mean: Tensor[(128), float32], %separable_conv_block_3_bn1_moving_var: Tensor[(128), float32], %separable_conv_block_3_conv2_weight: Tensor[(128, 128, 1, 1), float32], %separable_conv_block_3_bn2_gamma: Tensor[(128), float32], %separable_conv_block_3_bn2_beta: Tensor[(128), float32], %separable_conv_block_3_bn2_moving_mean: Tensor[(128), float32], %separable_conv_block_3_bn2_moving_var: Tensor[(128), float32], %separable_conv_block_4_weight: Tensor[(128, 1, 3, 3), float32], %separable_conv_block_4_bn1_gamma: Tensor[(128), float32], %separable_conv_block_4_bn1_beta: Tensor[(128), float32], %separable_conv_block_4_bn1_moving_mean: Tensor[(128), float32], %separable_conv_block_4_bn1_moving_var: Tensor[(128), float32], %separable_conv_block_4_conv2_weight: Tensor[(256, 128, 1, 1), float32], %separable_conv_block_4_bn2_gamma: Tensor[(256), float32], %separable_conv_block_4_bn2_beta: Tensor[(256), float32], %separable_conv_block_4_bn2_moving_mean: Tensor[(256), float32], %separable_conv_block_4_bn2_moving_var: Tensor[(256), float32], %separable_conv_block_5_weight: Tensor[(256, 1, 3, 3), float32], %separable_conv_block_5_bn1_gamma: Tensor[(256), float32], %separable_conv_block_5_bn1_beta: Tensor[(256), float32], %separable_conv_block_5_bn1_moving_mean: Tensor[(256), float32], %separable_conv_block_5_bn1_moving_var: Tensor[(256), float32], %separable_conv_block_5_conv2_weight: Tensor[(256, 256, 1, 1), float32], %separable_conv_block_5_bn2_gamma: Tensor[(256), float32], %separable_conv_block_5_bn2_beta: Tensor[(256), float32], %separable_conv_block_5_bn2_moving_mean: Tensor[(256), float32], %separable_conv_block_5_bn2_moving_var: Tensor[(256), float32], %separable_conv_block_6_weight: Tensor[(256, 1, 3, 3), float32], %separable_conv_block_6_bn1_gamma: 
Tensor[(256), float32], %separable_conv_block_6_bn1_beta: Tensor[(256), float32], %separable_conv_block_6_bn1_moving_mean: Tensor[(256), float32], %separable_conv_block_6_bn1_moving_var: Tensor[(256), float32], %separable_conv_block_6_conv2_weight: Tensor[(512, 256, 1, 1), float32], %separable_conv_block_6_bn2_gamma: Tensor[(512), float32], %separable_conv_block_6_bn2_beta: Tensor[(512), float32], %separable_conv_block_6_bn2_moving_mean: Tensor[(512), float32], %separable_conv_block_6_bn2_moving_var: Tensor[(512), float32], %separable_conv_block_7_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_7_bn1_gamma: Tensor[(512), float32], %separable_conv_block_7_bn1_beta: Tensor[(512), float32], %separable_conv_block_7_bn1_moving_mean: Tensor[(512), float32], %separable_conv_block_7_bn1_moving_var: Tensor[(512), float32], %separable_conv_block_7_conv2_weight: Tensor[(512, 512, 1, 1), float32], %separable_conv_block_7_bn2_gamma: Tensor[(512), float32], %separable_conv_block_7_bn2_beta: Tensor[(512), float32], %separable_conv_block_7_bn2_moving_mean: Tensor[(512), float32], %separable_conv_block_7_bn2_moving_var: Tensor[(512), float32], %separable_conv_block_8_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_8_bn1_gamma: Tensor[(512), float32], %separable_conv_block_8_bn1_beta: Tensor[(512), float32], %separable_conv_block_8_bn1_moving_mean: Tensor[(512), float32], %separable_conv_block_8_bn1_moving_var: Tensor[(512), float32], %separable_conv_block_8_conv2_weight: Tensor[(512, 512, 1, 1), float32], %separable_conv_block_8_bn2_gamma: Tensor[(512), float32], %separable_conv_block_8_bn2_beta: Tensor[(512), float32], %separable_conv_block_8_bn2_moving_mean: Tensor[(512), float32], %separable_conv_block_8_bn2_moving_var: Tensor[(512), float32], %separable_conv_block_9_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_9_bn1_gamma: Tensor[(512), float32], %separable_conv_block_9_bn1_beta: Tensor[(512), float32], %separable_conv_block_9_bn1_moving_mean: Tensor[(512), float32], %separable_conv_block_9_bn1_moving_var: Tensor[(512), float32], %separable_conv_block_9_conv2_weight: Tensor[(512, 512, 1, 1), float32], %separable_conv_block_9_bn2_gamma: Tensor[(512), float32], %separable_conv_block_9_bn2_beta: Tensor[(512), float32], %separable_conv_block_9_bn2_moving_mean: Tensor[(512), float32], %separable_conv_block_9_bn2_moving_var: Tensor[(512), float32], %separable_conv_block_10_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_10_bn1_gamma: Tensor[(512), float32], %separable_conv_block_10_bn1_beta: Tensor[(512), float32], %separable_conv_block_10_bn1_moving_mean: Tensor[(512), float32], %separable_conv_block_10_bn1_moving_var: Tensor[(512), float32], %separable_conv_block_10_conv2_weight: Tensor[(512, 512, 1, 1), float32], %separable_conv_block_10_bn2_gamma: Tensor[(512), float32], %separable_conv_block_10_bn2_beta: Tensor[(512), float32], %separable_conv_block_10_bn2_moving_mean: Tensor[(512), float32], %separable_conv_block_10_bn2_moving_var: Tensor[(512), float32], %separable_conv_block_11_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_11_bn1_gamma: Tensor[(512), float32], %separable_conv_block_11_bn1_beta: Tensor[(512), float32], %separable_conv_block_11_bn1_moving_mean: Tensor[(512), float32], %separable_conv_block_11_bn1_moving_var: Tensor[(512), float32], %separable_conv_block_11_conv2_weight: Tensor[(512, 512, 1, 1), float32], %separable_conv_block_11_bn2_gamma: Tensor[(512), float32], %separable_conv_block_11_bn2_beta: 
Tensor[(512), float32], %separable_conv_block_11_bn2_moving_mean: Tensor[(512), float32], %separable_conv_block_11_bn2_moving_var: Tensor[(512), float32], %separable_conv_block_12_weight: Tensor[(512, 1, 3, 3), float32], %separable_conv_block_12_bn1_gamma: Tensor[(512), float32], %separable_conv_block_12_bn1_beta: Tensor[(512), float32], %separable_conv_block_12_bn1_moving_mean: Tensor[(512), float32], %separable_conv_block_12_bn1_moving_var: Tensor[(512), float32], %separable_conv_block_12_conv2_weight: Tensor[(1024, 512, 1, 1), float32], %separable_conv_block_12_bn2_gamma: Tensor[(1024), float32], %separable_conv_block_12_bn2_beta: Tensor[(1024), float32], %separable_conv_block_12_bn2_moving_mean: Tensor[(1024), float32], %separable_conv_block_12_bn2_moving_var: Tensor[(1024), float32], %separable_conv_block_13_weight: Tensor[(1024, 1, 3, 3), float32], %separable_conv_block_13_bn1_gamma: Tensor[(1024), float32], %separable_conv_block_13_bn1_beta: Tensor[(1024), float32], %separable_conv_block_13_bn1_moving_mean: Tensor[(1024), float32], %separable_conv_block_13_bn1_moving_var: Tensor[(1024), float32], %separable_conv_block_13_conv2_weight: Tensor[(1024, 1024, 1, 1), float32], %separable_conv_block_13_bn2_gamma: Tensor[(1024), float32], %separable_conv_block_13_bn2_beta: Tensor[(1024), float32], %separable_conv_block_13_bn2_moving_mean: Tensor[(1024), float32], %separable_conv_block_13_bn2_moving_var: Tensor[(1024), float32], %fc_weight: Tensor[(10, 1024), float32], %fc_bias: Tensor[(10), float32]) -> Tensor[(1, 10), float32] {
  %0 = nn.conv2d(%data, %conv_block_1_conv_weight, strides=[2, 2], padding=[1, 1, 1, 1], channels=32, kernel_size=[3, 3]) /* ty=Tensor[(1, 32, 112, 112), float32] */;
  %1 = nn.batch_norm(%0, %conv_block_1_bn_gamma, %conv_block_1_bn_beta, %conv_block_1_bn_moving_mean, %conv_block_1_bn_moving_var) /* ty=(Tensor[(1, 32, 112, 112), float32], Tensor[(32), float32], Tensor[(32), float32]) */;
  %2 = %1.0;
  %3 = nn.relu(%2) /* ty=Tensor[(1, 32, 112, 112), float32] */;
  %4 = nn.conv2d(%3, %separable_conv_block_1_weight, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3]) /* ty=Tensor[(1, 32, 112, 112), float32] */;
  %5 = nn.batch_norm(%4, %separable_conv_block_1_bn1_gamma, %separable_conv_block_1_bn1_beta, %separable_conv_block_1_bn1_moving_mean, %separable_conv_block_1_bn1_moving_var) /* ty=(Tensor[(1, 32, 112, 112), float32], Tensor[(32), float32], Tensor[(32), float32]) */;
  %6 = %5.0;
  %7 = nn.relu(%6) /* ty=Tensor[(1, 32, 112, 112), float32] */;
  %8 = nn.conv2d(%7, %separable_conv_block_1_conv2_weight, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1]) /* ty=Tensor[(1, 64, 112, 112), float32] */;
  %9 = nn.batch_norm(%8, %separable_conv_block_1_bn2_gamma, %separable_conv_block_1_bn2_beta, %separable_conv_block_1_bn2_moving_mean, %separable_conv_block_1_bn2_moving_var) /* ty=(Tensor[(1, 64, 112, 112), float32], Tensor[(64), float32], Tensor[(64), float32]) */;
  %10 = %9.0;
  %11 = nn.relu(%10) /* ty=Tensor[(1, 64, 112, 112), float32] */;
  %12 = nn.conv2d(%11, %separable_conv_block_2_weight, strides=[2, 2], padding=[1, 1, 1, 1], groups=64, channels=64, kernel_size=[3, 3]) /* ty=Tensor[(1, 64, 56, 56), float32] */;
  %13 = nn.batch_norm(%12, %separable_conv_block_2_bn1_gamma, %separable_conv_block_2_bn1_beta, %separable_conv_block_2_bn1_moving_mean, %separable_conv_block_2_bn1_moving_var) /* ty=(Tensor[(1, 64, 56, 56), float32], Tensor[(64), float32], Tensor[(64), float32]) */;
  %14 = %13.0;
...........
(a large chunk omitted)
...........
  %107 = nn.relu(%106) /* ty=Tensor[(1, 1024, 7, 7), float32] */;
  %108 = nn.global_avg_pool2d(%107) /* ty=Tensor[(1, 1024, 1, 1), float32] */;
  %109 = nn.batch_flatten(%108) /* ty=Tensor[(1, 1024), float32] */;
  %110 = nn.dense(%109, %fc_weight, units=10) /* ty=Tensor[(1, 10), float32] */;
  %111 = nn.bias_add(%110, %fc_bias) /* ty=Tensor[(1, 10), float32] */;
  nn.softmax(%111) /* ty=Tensor[(1, 10), float32] */
}

Comparing this IR with the one further above, the parameters passed in (such as the BN gamma parameters) now all carry the float32 element type plus shape information, which shows this pass performs automatic dtype inference, and probably shape inference as well. Passes are explored in more depth later.

On to the next statement:

shape_dict = {v.name_hint: v.checked_type for v in mod["main"].params}

mod looks like a dict; in fact the IRModule class implements the __getitem__ method, along with many other functions not discussed in detail here. Print shape_dict and see what is inside; judging from the name, it is a dict holding the shapes of all the parameters:

{'data': TensorType([1, 3, 224, 224], float32), 'conv_block_1_conv_weight': TensorType([32, 3, 3, 3], float32), 'conv_block_1_bn_gamma': TensorType([32], float32),.....}

Only a small part is shown here. Besides shape information there is type information as well, and the keys are indeed the parameters the network model expects as input.

The remaining few lines are then easy to follow: take the shapes of the model parameters, initialize the parameters, and return.
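
Here is a hedged sketch of what that loop does for a single parameter; Xavier comes from the same init.py:

import numpy as np
import tvm
from tvm.relay.testing.init import Xavier

initializer = Xavier()
shape, dtype = (32, 3, 3, 3), "float32"              # e.g. conv_block_1_conv_weight
init_value = np.zeros(shape).astype(dtype)
initializer("conv_block_1_conv_weight", init_value)  # fills the array in place; the name
                                                     # suffix (_weight, _gamma, ...) picks the rule
params = {"conv_block_1_conv_weight": tvm.nd.array(init_value, device=tvm.cpu(0))}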

Compiling the network graph

opt_level = 3
target = "llvm"
target_host = "llvm"
with tvm.transform.PassContext(opt_level=opt_level):
    lib = relay.build(mod, target=target, target_host=target_host, params=params)

How is relay.build defined in the official documentation?

Helper function that builds a Relay function to run on TVM graph executor.

Meaning it is a helper function for the runtime part that comes later. The implementation is in build_module.py; we won't dive into it here, just look at what flow it implements and what parameters it takes.

def build(ir_mod, target=None, target_host=None, params=None, mod_name="default"):
    """Helper function that builds a Relay function to run on TVM graph executor.

    Parameters
    ----------
    ir_mod : :py:class:`~tvm.IRModule`
        The IR module to build. Using relay.Function is deprecated.

    target : str, :any:`tvm.target.Target`, or dict of str(i.e. device/context name) to str/tvm.target.Target, optional
        For heterogeneous compilation, it is a dictionary indicating context to
        target mapping. For homogeneous compilation, it is a build target.

    target_host : str or :any:`tvm.target.Target`, optional
        Host compilation target, if target is device.
        When TVM compiles device specific program such as CUDA,
        we also need host(CPU) side code to interact with the driver
        setup the dimensions and parameters correctly.
        target_host is used to specify the host side codegen target.
        By default, llvm is used if it is enabled,
        otherwise a stackvm intepreter is used.

    params : dict of str to NDArray
        Input parameters to the graph that do not change
        during inference time. Used for constant folding.

    mod_name: Optional[str]
        The module name we will build

    Returns
    -------
    factory_module : tvm.relay.backend.executor_factory.ExecutorFactoryModule
            The runtime factory for the TVM graph executor.
    """

    if not isinstance(ir_mod, (IRModule, _function.Function)):
        raise ValueError("Type of input parameter mod must be tvm.IRModule")

    if isinstance(ir_mod, _function.Function):
        if params:
            ir_mod = bind_params_by_name(ir_mod, params)
        ir_mod = IRModule.from_expr(ir_mod)
        warnings.warn(
            "Please use input parameter mod (tvm.IRModule) "
            "instead of deprecated parameter mod (tvm.relay.function.Function)",
            DeprecationWarning,
        )
    target = build_target_by_device_type_map(target)
    if isinstance(target_host, (str, Target)):
        target_host = Target(target_host)
    elif target_host:
        raise ValueError("target host must be the type of str, " + "tvm.target.Target, or None")

    target, target_host = Target.check_and_update_host_consist(
        target, target_host, target_is_dict_key=False
    )

    # Retrieve the executor from the target
    executor = get_executor_from_target(target, target_host)

    # If current dispatch context is fallback context (the default root context),
    # then load pre-tuned parameters from TopHub
    if isinstance(autotvm.DispatchContext.current, autotvm.FallbackContext):
        tophub_context = autotvm.tophub.context(list(target.values()))
    else:
        tophub_context = autotvm.utils.EmptyContext()

    with tophub_context:
        bld_mod = BuildModule()
        executor_config, runtime_mod, params = bld_mod.build(
            mod=ir_mod, target=target, params=params, executor=executor, mod_name=mod_name
        )
        func_metadata = bld_mod.get_function_metadata()

        if executor == "aot":
            executor_factory = _executor_factory.AOTExecutorFactoryModule(
                ir_mod, target, runtime_mod, mod_name, params, func_metadata
            )
        elif executor == "graph":
            executor_factory = _executor_factory.GraphExecutorFactoryModule(
                ir_mod, target, executor_config, runtime_mod, mod_name, params, func_metadata
            )
        else:
            assert False, "Executor " + executor + " not supported"

        return executor_factory

First, isinstance checks verify that the input arguments meet the requirements; then, based on the target, a different executor is selected, either aot mode or graph mode (I know what AOT is, but what is the graph execution mode? To be studied later =_=).
The function's parameters are straightforward. The return value is called executor_factory; from the name it should be a factory pattern, which implies multiple construction paths: a different executor_factory is constructed for each executor.

In short, what is finally returned is an optimized instruction sequence tied to a specific hardware device; give it input data and it runs. This part of the source is also the heart of the codebase and deserves a deep read later.
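
For comparison, the original tutorial's GPU setup for this step looks roughly like the sketch below, assuming a CUDA-enabled TVM build (tvm.cuda was spelled tvm.gpu in older versions):

# hedged sketch of the GPU variant this note replaced with CPU
target = tvm.target.cuda()
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
dev = tvm.cuda(0)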

Running

# Run the generated library
# create random input
dev = tvm.cpu(0)
data = np.random.uniform(-1, 1, size=data_shape).astype("float32")
# create module
module = graph_executor.GraphModule(lib["default"](dev))
# set input and parameters
module.set_input("data", data)
# run
module.run()
# get output
out = module.get_output(0, tvm.nd.empty(out_shape)).numpy()

The logic of the running code is quite direct: load, set the input, run, and fetch the result. Let's take a quick look at what GraphModule and module.run actually do.

The official documentation describes GraphModule as follows:

Wrapper runtime module.
This is a thin wrapper of the underlying TVM module. you can also directly call set_input, run, and get_output of underlying module functions

So it is a thin wrapper over the model through which the underlying methods can be called directly. The required parameter is:

module (tvm.runtime.Module) – The internal tvm module that holds the actual graph functions.

Roughly: a wrapper encapsulates the graph module; this wrapper holds the graph and calls the methods provided by the lower layers to implement the runtime functionality.

Now the run method, as explained in the official documentation:

Run forward execution of the graph
Parameters
input_dict (dict of str to NDArray) – List of input values to be feed to

A rather terse explanation: it just runs the graph forward, much like TensorFlow (both are static graphs).
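
One detail worth noting: the two input styles used in this demo are interchangeable, since run(**input_dict) forwards its keyword arguments to set_input before executing. Continuing with the module and data from above:

module.set_input("data", data)
module.run()
# ...is equivalent to...
module.run(data=data)
out = module.get_output(0).numpy()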

Saving and loading the graph

The last part of the quick_start demo saves the compiled network graph structure and parameters to disk, and shows how to load them back from disk.

from tvm.contrib import utils

temp = utils.tempdir()
path_lib = temp.relpath("deploy_lib.tar")
lib.export_library(path_lib)
print(temp.listdir())

# load the module back.
# Load the network and run it
loaded_lib = tvm.runtime.load_module(path_lib)
input_data = tvm.nd.array(data)

module = graph_executor.GraphModule(loaded_lib["default"](dev))
module.run(data=input_data)
out_deploy = module.get_output(0).numpy()

# Print first 10 elements of output
print(out_deploy.flatten()[0:10])

# check whether the output from deployed module is consistent with original one
tvm.testing.assert_allclose(out_deploy, out, atol=1e-5)

As you can see, this uses the utils module from tvm.contrib: first build a path for a file named deploy_lib.tar, then export the compiled lib into it via export_library.

Loading the offline model is just as simple: call tvm.runtime.load_module with the model's path.
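
A hedged sketch of the same round trip with a persistent path instead of a tempdir; the file suffix chooses the packaging format (.tar archive vs. .so shared library):

lib.export_library("deploy_lib.so")                 # write the compiled module to disk
loaded = tvm.runtime.load_module("deploy_lib.so")   # load it back
module = graph_executor.GraphModule(loaded["default"](tvm.cpu(0)))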
