[TVM Documentation Study] Hybrid Frontend Language Reference

zxros10

Published 2022-04-11 14:23:24


Original link: https://tvm.apache.org/docs/reference/langref/hybrid_script.html


This article is a translation of "Hybrid Frontend Language Reference" from the tvm 0.9.dev0 documentation.

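Overview

This hybrid frontend allows users to write preliminary versions of some idioms that are not yet officially supported by TVM.

Features

Software Emulation

Both software emulation and compilation are supported. When defining a function, you need to use the tvm.te.hybrid.script decorator to mark it as a hybrid function: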

import numpy
import tvm

@tvm.te.hybrid.script
def outer_product(a, b):
    c = output_tensor((100, 99), 'float32')
    for i in range(a.shape[0]):
        for j in range(b.shape[0]):
            c[i, j] = a[i] * b[j]
    return c

# Software emulation: call the hybrid function on numpy arrays
a = numpy.random.randn(100)
b = numpy.random.randn(99)
c = outer_product(a, b)

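This decorator automatically imports the required keywords during software emulation. Once emulation finishes, the imported keywords are cleaned up, so users do not need to worry about keyword conflicts or namespace pollution.

Every element passed in the argument list for software emulation is either a python variable or a numpy numeric type.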
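The keyword handling described above can be pictured with a small plain-Python sketch. It is not TVM's implementation: the decorator name `emulate`, the helper `_make_output`, and the `_KEYWORDS` table are hypothetical, and tensors are modeled as nested lists. The sketch injects the keyword into the function's globals for the duration of the call and restores the previous state afterwards.

```python
import functools


def _make_output(shape, dtype):
    """Hypothetical helper: a 2-D 'tensor' modeled as a nested list of zeros."""
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]


# Hypothetical keyword table: names made visible only while emulating.
_KEYWORDS = {"output_tensor": _make_output}


def emulate(func):
    """Inject keywords into func's globals for one call, then restore them."""
    @functools.wraps(func)
    def wrapper(*args):
        namespace = func.__globals__
        # Remember any user-defined names that the keywords would shadow.
        saved = {k: namespace[k] for k in _KEYWORDS if k in namespace}
        namespace.update(_KEYWORDS)
        try:
            return func(*args)
        finally:
            for name in _KEYWORDS:
                if name in saved:
                    namespace[name] = saved[name]
                else:
                    del namespace[name]
    return wrapper


@emulate
def outer_product(a, b):
    c = output_tensor((len(a), len(b)), "float32")
    for i in range(len(a)):
        for j in range(len(b)):
            c[i][j] = a[i] * b[j]
    return c


result = outer_product([1.0, 2.0], [3.0, 4.0, 5.0])
```

After the call returns, `output_tensor` is no longer defined in the module namespace, which is the "no pollution" property the text describes.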

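Backend Compilation

Using this function is discouraged; users are encouraged to use the second interface instead. The current parse interface looks like: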

a = tvm.te.placeholder((100, ), name='a')
b = tvm.te.placeholder((99, ), name='b')
parser = tvm.hybrid.parse(outer_product, [a, b]) # return the parser of this function

If we pass TVM data structures such as Tensor, Var, Expr.*Imm, or tvm.container.Array to this function, it returns an op node:

a = tvm.te.placeholder((100, ), name='a')
b = tvm.te.placeholder((99, ), name='b')
c = outer_product(a, b) # return the output tensor(s) of the operator

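You can use any method that can be applied to a TVM OpNode, such as create_schedule, although so far the scheduling functionality is as limited as that of ExternOpNode. At the very least, it can be built into an LLVM module.

Tuning

Continuing the example above, you can use some tvm-like interfaces to tune the code: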

i, j = c.op.axis
sch = tvm.te.create_schedule(c.op)  # build a schedule for the hybrid op
jo, ji = sch[c].split(j, 4)  # split j by a factor of 4
sch[c].vectorize(ji)

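For now, you can use loop annotations (unroll, parallel, vectorize, and bind), loop manipulations (split and fuse), and reorder.

Note: This is a preliminary function, so users are responsible for the correctness of the functionality after tuning. Specifically, users should be very careful when fusing and reordering imperfect loops (loop nests that are not perfectly nested).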
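To see why a split must preserve the set of iterations, here is a plain-Python sketch of what splitting an axis by a factor does to the loop structure. It is only an illustration under stated assumptions, not TVM's lowering: the function name `split_indices` is hypothetical, and the guard models the tail case where the extent (99 in the example above) is not divisible by the factor (4).

```python
def split_indices(extent, factor):
    """Visit 0..extent-1 with an outer/inner loop pair, as a split would."""
    outer_extent = (extent + factor - 1) // factor  # ceiling division
    visited = []
    for jo in range(outer_extent):
        for ji in range(factor):
            j = jo * factor + ji
            if j < extent:  # guard for the partial last block
                visited.append(j)
    return visited


indices = split_indices(99, 4)
```

The inner loop over `ji` has a fixed length of 4, which is what makes it a candidate for vectorization.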

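Loops

In HalideIR, there are four kinds of loops: serial, unrolled, parallel, and vectorized.

Here we use range (also known as serial), unroll, parallel, and vectorize. These four keywords annotate the corresponding type of for loop. Their usage is roughly the same as Python's standard range.

Besides all the loop types supported in Halide, const_range is supported for some specific cases. Sometimes a tvm.container.Array needs to be passed as an argument, but TVM-HalideIR has no support for converting a tvm.container.Array to an Expr. Thus, only a limited feature is supported: users can access containers either by constants or inside loops annotated as constant loops.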

@tvm.te.hybrid.script
def foo(a, b): # b is a tvm.container.Array
    c = output_tensor(a.shape, a.dtype)
    for i in const_range(len(a)): # because you have b access, i should be explicitly annotated as const_range
        c[i] = a[i] + b[i]
    return c

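Variables

All mutable variables are lowered to an array of size 1. The first store (assignment) to a variable is regarded as its declaration.

Note: Unlike conventional Python, in hybrid script a declared variable can only be used at the scope level where it is declared.

Note: Currently, only basic-typed variables can be used, i.e. the type of the variable must be either float32 or int32.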

for i in range(5):
    s = 0 # declaration, this s will be a 1-array in lowered IR
    for j in range(5):
        s += a[i, j] # do something with s
    b[i] = s # you can still use s in this level
a[0] = s # you CANNOT use s here, even though it is allowed in conventional Python
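The "array of size 1" lowering can be pictured in plain Python. This is a sketch of the idea only, not the IR that TVM actually emits; `row_sums` is a hypothetical name and the tensor is modeled as a nested list. The one-element buffer is created inside the body of the `i` loop, which is why the value is not available once that loop has ended.

```python
def row_sums(a):
    """Sum each row of a 2-D nested list, using a 1-element buffer for s."""
    b = [0] * len(a)
    for i in range(len(a)):
        s = [0]  # s modeled as a size-1 array, scoped to this loop body
        for j in range(len(a[i])):
            s[0] += a[i][j]
        b[i] = s[0]  # still inside the scope where s was declared
    return b


sums = row_sums([[1, 2], [3, 4]])
```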

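Attributes

So far, only the shape and dtype attributes of tensors are supported. The shape attribute is essentially a tuple, so it must be accessed like an array. Currently, only constant-indexed access is supported.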

x = a.shape[2] # OK!
for i in range(3):
    for j in a.shape[i]: # BAD! i is not a constant!
        # do something

Conditional Statements and Expressions

if condition1 and condition2 and condition3:
    # do something
else:
    # do something else
# Select
a = b if condition else c

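However, the keywords True and False are not supported.

Math Intrinsics

So far, the math intrinsics log, exp, sigmoid, tanh, power, and popcount are supported. No import is required; just use them directly.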
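When checking emulation results by hand, it can help to have reference definitions for the less familiar intrinsics. The following plain-Python versions of sigmoid and popcount are a sketch based on the usual mathematical definitions; they are assumptions for illustration, not the functions TVM injects, and they ignore dtype details such as bit width.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x); fine for moderate x."""
    return 1.0 / (1.0 + math.exp(-x))


def popcount(x):
    """Number of set bits in a non-negative integer."""
    if x < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return bin(x).count("1")


half = sigmoid(0.0)
bits = popcount(7)
```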

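Array Allocation

Under construction; this feature will be supported in a later version!

Use the function call allocate(shape, type, shared/local) to declare an array buffer. Basic usage is roughly the same as a normal numpy.array, and a high-dimensional array should be accessed as a[i, j, k] rather than a[i][j][k], even for a tvm.container.Array used for compilation.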
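The a[i, j, k] access style means the whole index arrives as one tuple, which maps naturally onto a flat buffer. The class below is a hypothetical stand-in written for illustration (it is not a TVM class, and row-major layout is an assumption); it shows how a tuple index can be turned into a single offset.

```python
class Buffer:
    """Hypothetical flat buffer that accepts tuple indices like buf[i, j, k]."""

    def __init__(self, shape, fill=0.0):
        self.shape = tuple(shape)
        size = 1
        for extent in self.shape:
            size *= extent
        self.data = [fill] * size

    def _offset(self, index):
        if not isinstance(index, tuple):
            index = (index,)  # allow buf[i] for 1-D buffers
        if len(index) != len(self.shape):
            raise IndexError("expected one index per dimension")
        offset = 0
        for i, extent in zip(index, self.shape):
            if not 0 <= i < extent:
                raise IndexError("index out of range")
            offset = offset * extent + i  # row-major accumulation
        return offset

    def __getitem__(self, index):
        return self.data[self._offset(index)]

    def __setitem__(self, index, value):
        self.data[self._offset(index)] = value


buf = Buffer((2, 3, 4))
buf[1, 2, 3] = 7.0
```

With this layout, `buf[1, 2, 3]` in a (2, 3, 4) buffer lands on the last flat element, offset 23.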

Thread Binding

You can also bind a loop to threads by writing code like this:

for tx in bind("threadIdx.x", 100):
    a[tx] = b[tx]
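In software emulation there are no GPU threads, so a bound loop can be thought of as an ordinary serial loop over the same extent. The sketch below assumes exactly that: the `bind` defined here is a hypothetical plain-Python stand-in that ignores the thread tag, and `copy` is an illustrative name.

```python
def bind(tag, extent):
    """Hypothetical emulation stand-in: ignore the thread tag, iterate serially."""
    return range(extent)


def copy(b):
    """Copy b element by element, one 'thread' index per element."""
    a = [0] * len(b)
    for tx in bind("threadIdx.x", len(b)):
        a[tx] = b[tx]
    return a


copied = copy([5, 6, 7])
```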

Assert Statement

The assert statement is supported; you can use it just as you would in standard Python.

assert cond, mesg

Note: assert is not a function call. Users are encouraged to use assert in the way shown above: a condition followed by a message. This form fits both the Python AST and HalideIR.

Keywords

For-loop keywords: serial, range, unroll, parallel, vectorize, bind, const_range
Math keywords: log, exp, sqrt, rsqrt, sigmoid, tanh, power, popcount, round, ceil_div
Memory allocation keywords: allocate, output_tensor
Data type keywords: uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64
Others: max_num_threads