Introduction
MXNet provides automatic differentiation. Compared with Caffe, where you have to write the backward pass yourself, this saves us a lot of time.
Differentiation
For example:
y = 4 * x^2
x = [[1, 2], [3, 4]]
dy/dx = 8x = [[8, 16], [24, 32]]
Computing this with MXNet autograd:
import mxnet as mx
from mxnet import autograd as ag

x = mx.nd.array([[1, 2], [3, 4]])
x.attach_grad()           # allocate a buffer to hold the gradient of x
with ag.record():         # record the computation so it can be differentiated
    y = 4 * (x ** 2)
y.backward()
print(x.grad)
The result:
[[ 8. 16.]
[ 24. 32.]]
<NDArray 2x2 @cpu(0)>
API
1. x.attach_grad() allocates a buffer to store the gradient of x.
The grad_req parameter controls how the gradient is updated: 'write', 'add', or 'null'.

def attach_grad(self, grad_req='write', stype=None):
    """Attach a gradient buffer to this NDArray, so that `backward`
    can compute gradient with respect to it.

    Parameters
    ----------
    grad_req : {'write', 'add', 'null'}
        How gradient will be accumulated.
        - 'write': gradient will be overwritten on every backward.
        - 'add': gradient will be added to existing value on every backward.
        - 'null': do not compute gradient for this NDArray.
    stype : str, optional
        The storage type of the gradient array. Defaults to the same stype of
        this NDArray.
    """
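As a rough illustration of grad_req (my own sketch, not from the original post), the following shows how grad_req='add' accumulates gradients across backward calls, whereas the default 'write' overwrites them:

import mxnet as mx
from mxnet import autograd as ag

x = mx.nd.array([1.0, 2.0, 3.0])
x.attach_grad(grad_req='add')   # accumulate instead of overwrite

for _ in range(2):
    with ag.record():
        y = (x ** 2).sum()
    y.backward()

print(x.grad)   # two backward passes added together: 2 * 2x = [4. 8. 12.]

With 'add' you are responsible for clearing the buffer yourself (e.g. x.grad[:] = 0) once you no longer want to accumulate.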
2. ag.record()
To reduce computation and memory overhead, MXNet by default does not record the computations needed for computing gradients. We have to call record() to ask MXNet to record the computations that gradients will be taken with respect to (a small sketch follows the docstring below).
def record(train_mode=True):  # pylint: disable=redefined-outer-name
    """Returns an autograd recording scope context to be used in 'with' statement
    and captures code that needs gradients to be calculated.

    .. note:: When forwarding with train_mode=False, the corresponding backward
              should also use train_mode=False, otherwise gradient is undefined.

    Example::

        with autograd.record():
            y = model(x)
            backward([y])
        metric.update(...)
        optim.step(...)

    Parameters
    ----------
    train_mode: bool, default True
        Whether the forward pass is in training or predicting mode. This controls the behavior
        of some layers such as Dropout, BatchNorm.
    """
    return _RecordingStateScope(True, train_mode)
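As a quick sanity check (again my own sketch, not from the MXNet docs), only operations performed inside the record() scope contribute to the gradient:

import mxnet as mx
from mxnet import autograd as ag

x = mx.nd.array([1.0, 2.0, 3.0])
x.attach_grad()

y = x * 10            # outside record(): not tracked
with ag.record():
    z = x * x         # inside record(): tracked
z.backward()

print(x.grad)         # 2x = [2. 4. 6.], the gradient of z only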
3. backward() runs the backward pass (back-propagation).
backward(self, out_grad=None, retain_graph=False, train_mode=True)
out_grad is the head gradient. If the output is not the final scalar loss, i.e. a gradient is flowing in from a later layer, pass that upstream gradient in as out_grad.
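A small sketch of passing a head gradient (the values of head below are arbitrary, chosen only for illustration):

import mxnet as mx
from mxnet import autograd as ag

x = mx.nd.array([[1, 2], [3, 4]])
x.attach_grad()
with ag.record():
    y = 4 * (x ** 2)

head = mx.nd.array([[1, 1], [0.1, 0.1]])  # hypothetical upstream gradient dL/dy
y.backward(out_grad=head)

print(x.grad)   # dL/dx = head * 8x = [[8. 16.] [2.4 3.2]]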