Custom operators for the Paddle dynamic graph (Python version)

Vertira

Last modified 2022-10-01 10:49:38

First published 2022-10-01 00:23:22

Article link: https://blog.csdn.net/Vertira/article/details/127130516

1. Custom Python operators for the dynamic graph

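First, a word about the supporting interfaces.

Paddle supports custom Python-side Ops in dynamic graph mode through the PyLayer and PyLayerContext interfaces.

How do you use a custom operator once it is written? Through a dedicated interface, and that interface is PyLayer.

The PyLayer interface is described as follows: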

class PyLayer:
    @staticmethod
    def forward(ctx, *args, **kwargs):
        pass

    @staticmethod
    def backward(ctx, *args, **kwargs):
        pass

    @classmethod
    def apply(cls, *args, **kwargs):
        pass

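Here,

--forward is the forward function of the custom Op and must be overridden by the subclass. Its first argument is a PyLayerContext object; the remaining input arguments may be of any type and number.

--backward is the backward function of the custom Op and must be overridden by the subclass. Its first argument is a PyLayerContext object; the remaining input arguments are the gradients of forward's output Tensors. Its output Tensors are the gradients of forward's input Tensors.

--apply is the method that executes the custom Op. Once the custom Op has been built, run it through apply.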

The PyLayerContext interface is described as follows:

class PyLayerContext:
    def save_for_backward(self, *tensors):
        pass

    def saved_tensor(self):
        pass

Here,

save_for_backward stashes the Tensors that backward will need. This API can be called only once, and only inside forward.
saved_tensor returns the Tensors stashed by save_for_backward.

=====================================

======== Now for the main part ========

How to write a dynamic graph Python Op

The following uses tanh as an example to show how to write a Python Op with PyLayer.

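Step 1: create a PyLayer subclass and define its forward and backward functions

Both the forward and the backward function are written in Python, so the Paddle APIs can be used freely to implement a custom Op. The following rules apply:

forward and backward are both static methods, and their first argument is a PyLayerContext object.
Apart from the first argument, every argument of backward is the gradient of one of forward's output Tensors, so the number of Tensors passed to backward must equal the number of Tensors returned by forward. If backward needs a Tensor from forward, pass it along with save_for_backward and saved_tensor.
The output of backward can be a Tensor or a list/tuple of Tensors; these are the gradients of forward's input Tensors. The number of Tensors returned by backward therefore equals the number of Tensors passed to forward. If a value returned by backward (a gradient) corresponds to a forward input Tensor whose stop_gradient attribute is False, that return value must be a Tensor.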
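
To make the counting rules above concrete, here is a toy model in plain Python with no Paddle involved. ToyContext, ToyMul and toy_apply_and_grad are hypothetical names invented for this sketch, not Paddle classes, and they operate on floats rather than Tensors. Only the shape of the contract is the point: one forward output means one gradient argument to backward, and two forward inputs mean two gradients returned.

```python
class ToyContext:
    """Hypothetical stand-in for PyLayerContext (not Paddle's implementation)."""

    def __init__(self):
        self._saved = None

    def save_for_backward(self, *values):
        # Mirrors the rule that saving is allowed only once per forward call.
        if self._saved is not None:
            raise RuntimeError("save_for_backward may only be called once")
        self._saved = values

    def saved_tensor(self):
        return self._saved


class ToyMul:
    """Toy op y = x1 * x2: two inputs, one output."""

    @staticmethod
    def forward(ctx, x1, x2):
        ctx.save_for_backward(x1, x2)
        return x1 * x2

    @staticmethod
    def backward(ctx, dy):
        # One forward output -> one gradient argument (dy).
        x1, x2 = ctx.saved_tensor()
        # Two forward inputs -> two returned gradients, in input order.
        return dy * x2, dy * x1


def toy_apply_and_grad(op, *inputs, dy=1.0):
    """Run forward, then backward with the upstream gradient dy."""
    ctx = ToyContext()
    out = op.forward(ctx, *inputs)
    grads = op.backward(ctx, dy)
    return out, grads


out, grads = toy_apply_and_grad(ToyMul, 3.0, 4.0)
print(out, grads)  # 12.0 (4.0, 3.0)
```

In real Paddle code the autograd engine calls backward for you; the helper here only exists so the toy can be run end to end.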

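It takes just two steps, much like writing a function: define it first, then call it. That is only a loose analogy.

Here is step 1, using the hyperbolic tangent (tanh) operator as the example; other operators can be written by following the same pattern.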

import paddle
from paddle.autograd import PyLayer

# Implement a dynamic graph Python Op by subclassing `PyLayer`
class cus_tanh(PyLayer):
    @staticmethod
    def forward(ctx, x):
        y = paddle.tanh(x)
        # ctx is a PyLayerContext object; it carries y from forward to backward.
        ctx.save_for_backward(y)
        return y

    # forward has a single output, so besides ctx, backward takes a single input.
    @staticmethod
    def backward(ctx, dy):
        # ctx is a PyLayerContext object; saved_tensor returns the y stashed in forward.
        y, = ctx.saved_tensor()
        # Custom backward computation using Paddle APIs
        grad = dy * (1 - paddle.square(y))
        # forward has a single Tensor input, so backward has a single output.
        return grad
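
The backward above relies on the identity d tanh(x)/dx = 1 - tanh(x)^2. As a quick sanity check, the following sketch compares that formula with a central-difference estimate using only the standard library. The helper names tanh_grad and numeric_grad, the step size and the tolerance are choices made for this illustration.

```python
import math


def tanh_grad(x, dy=1.0):
    """Gradient formula used in backward: dy * (1 - y**2) with y = tanh(x)."""
    y = math.tanh(x)
    return dy * (1.0 - y * y)


def numeric_grad(x, eps=1e-6):
    """Central-difference estimate of d tanh(x) / dx."""
    return (math.tanh(x + eps) - math.tanh(x - eps)) / (2.0 * eps)


for x in (-2.0, -0.5, 0.0, 0.7, 3.0):
    analytic = tanh_grad(x)
    numeric = numeric_grad(x)
    # The two estimates should agree up to finite-difference error.
    assert abs(analytic - numeric) < 1e-8
    print(f"x={x:+.1f}  analytic={analytic:.8f}  numeric={numeric:.8f}")
```

The same kind of check is a cheap way to catch sign or factor mistakes in the backward of any custom operator before wiring it into a network.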

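Step 2: build the network with the apply method. The inputs of apply are the inputs of forward except for the first argument (ctx), and the output of apply is the output of forward.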

data = paddle.randn([2, 3], dtype="float32")
data.stop_gradient = False
# Run the Python operator through apply
z = cus_tanh.apply(data)
z.mean().backward()

print(data.grad)

Output:

Tensor(shape=[2, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
       [[0.15810412, 0.16457692, 0.15256986],
        [0.16439323, 0.10458803, 0.16661729]])

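
The printed gradients can be read off by hand. z.mean() averages over 2 x 3 = 6 elements, so the upstream gradient dy reaching backward is 1/6 for every element, and each entry of data.grad equals (1 - tanh(x)^2) / 6, which can never exceed 1/6 (about 0.1667). The short sketch below checks that bound against the values printed above and inverts the formula to recover |x| for each entry; the variable names are introduced here for illustration only.

```python
import math

# Gradient values copied from the printed output above.
printed = [0.15810412, 0.16457692, 0.15256986,
           0.16439323, 0.10458803, 0.16661729]

dy = 1.0 / len(printed)  # gradient of mean() w.r.t. each of the 6 elements

for g in printed:
    # g = dy * (1 - tanh(x)**2), so it must lie in (0, dy].
    assert 0.0 < g <= dy
    # Invert the formula to recover |x| (the sign of x is lost in the square).
    abs_x = math.atanh(math.sqrt(1.0 - g / dy))
    print(f"grad={g:.8f}  |x|={abs_x:.4f}")
```

Since the input comes from paddle.randn, the concrete numbers differ on every run, but the 1/6 upper bound always holds.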

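Below are some caveats; special cases call for special handling.

To pass information from forward to backward, you can add a temporary attribute to the PyLayerContext in forward and read it back in backward. For Tensors, save_for_backward and saved_tensor are recommended; for non-Tensor values, adding a temporary attribute is recommended.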

import paddle
from paddle.autograd import PyLayer
import numpy as np

class tanh(PyLayer):
    @staticmethod
    def forward(ctx, x1, func1, func2=paddle.square):
        # Pass func2 along as a temporary attribute
        ctx.func = func2
        y1 = func1(x1)
        # Pass y1 along with save_for_backward
        ctx.save_for_backward(y1)
        return y1

    @staticmethod
    def backward(ctx, dy1):
        y1, = ctx.saved_tensor()
        # Retrieve func2 from ctx
        re1 = dy1 * (1 - ctx.func(y1))
        return re1

input1 = paddle.randn([2, 3]).astype("float64")
input2 = input1.detach().clone()
input1.stop_gradient = False
input2.stop_gradient = False
z = tanh.apply(x1=input1, func1=paddle.tanh)

The inputs and outputs of forward may be of any type, but at least one input and at least one output must be a Tensor.

# Incorrect example
class cus_tanh(PyLayer):
    @staticmethod
    def forward(ctx, x1, x2):
        y = x1+x2
        # y.shape is a list, not a Tensor; the output must contain at least one Tensor
        return y.shape

    @staticmethod
    def backward(ctx, dy):
        return dy, dy

data = paddle.randn([2, 3], dtype="float32")
data.stop_gradient = False
# Raises an error because forward's output contains no Tensor
z, y_shape = cus_tanh.apply(data, data)


# Correct example
class cus_tanh(PyLayer):
    @staticmethod
    def forward(ctx, x1, x2):
        y = x1+x2
        # y.shape is a list, not a Tensor
        return y, y.shape

    @staticmethod
    def backward(ctx, dy):
        # forward has two Tensor inputs, so backward has two outputs.
        return dy, dy

data = paddle.randn([2, 3], dtype="float32")
data.stop_gradient = False
z, y_shape = cus_tanh.apply(data, data)
z.mean().backward()

print(data.grad)
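
A note on what the correct example should print. The same Tensor data is passed as both x1 and x2, and backward returns dy for each of them. Assuming autograd sums the two contributions on the shared input (this accumulation is an assumption of this note; the source does not show the output), every element of data.grad would be 1/6 + 1/6 = 1/3. The arithmetic, in plain Python:

```python
# Assumption: autograd adds the gradients for x1 and x2 because both are `data`.
rows, cols = 2, 3
dy = 1.0 / (rows * cols)      # gradient of z.mean() w.r.t. each element of z
grad_x1, grad_x2 = dy, dy     # what backward returns for one element
expected = grad_x1 + grad_x2  # accumulated on the shared input
print(expected)
```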

If one of forward's inputs is a Tensor with stop_gradient = True, the corresponding return value of backward should be None.

class cus_tanh(PyLayer):
    @staticmethod
    def forward(ctx, x1, x2):
        y = x1+x2
        return y

    @staticmethod
    def backward(ctx, dy):
        # x2.stop_gradient is True, so None must be returned as its gradient
        return dy, None


data1 = paddle.randn([2, 3], dtype="float32")
data1.stop_gradient = False
data2 = paddle.randn([2, 3], dtype="float32")
z = cus_tanh.apply(data1, data2)
fake_loss = z.mean()
fake_loss.backward()
print(data1.grad)

If all of forward's input Tensors have stop_gradient = True, backward is not executed.

class cus_tanh(PyLayer):
    @staticmethod
    def forward(ctx, x1, x2):
        y = x1+x2
        return y

    @staticmethod
    def backward(ctx, dy):
        return dy, None


data1 = paddle.randn([2, 3], dtype="float32")
data2 = paddle.randn([2, 3], dtype="float32")
z = cus_tanh.apply(data1, data2)
fake_loss = z.mean()
fake_loss.backward()
# Because data1.stop_gradient = True and data2.stop_gradient = True, backward is not executed.
print(data1.grad is None)

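As for custom operators in static graph mode, I am not very interested in them.

You can refer to the official site:

Custom Python Operators - User Guide - PaddlePaddle deep learning platform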