Learning MXNet (Gluon): Automatic Differentiation

Copyright notice: this is an original article by the author; reposting without permission is prohibited. https://blog.csdn.net/Iriving_shu/article/details/81432262

Introduction

MXNet provides automatic differentiation. Compared with Caffe, where you have to write the backward pass yourself, this saves a lot of time.

Computing Gradients

For example:

y = 4 * x^2
x = [[1, 2], [3, 4]]
dy/dx = 8x = [[8, 16], [24, 32]]

Computing the gradient with MXNet's autograd:

    import mxnet as mx
    from mxnet import autograd as ag

    x = mx.nd.array([[1, 2], [3, 4]])
    x.attach_grad()       # allocate storage for the gradient
    with ag.record():     # record the computation for autograd
        y = 4 * (x ** 2)
    y.backward()          # run the backward pass
    print(x.grad)

The result:

[[  8.  16.]
 [ 24.  32.]]
<NDArray 2x2 @cpu(0)>

API

  1. x.attach_grad() allocates storage space for the gradient.
    The grad_req parameter defines how the gradient is updated: 'write', 'add', or 'null'.

     def attach_grad(self, grad_req='write', stype=None):
        """Attach a gradient buffer to this NDArray, so that `backward`
        can compute gradient with respect to it.
    
        Parameters
        ----------
        grad_req : {'write', 'add', 'null'}
            How gradient will be accumulated.
            - 'write': gradient will be overwritten on every backward.
            - 'add': gradient will be added to existing value on every backward.
            - 'null': do not compute gradient for this NDArray.
        stype : str, optional
            The storage type of the gradient array. Defaults to the same stype of this NDArray.
        """
  2. ag.record()
    To reduce computation and memory overhead, MXNet by default does not record the computations needed for gradients. We have to call record() to ask MXNet to record the computations relevant to gradient calculation.

def record(train_mode=True): #pylint: disable=redefined-outer-name
    """Returns an autograd recording scope context to be used in 'with' statement
    and captures code that needs gradients to be calculated.

    .. note:: When forwarding with train_mode=False, the corresponding backward
              should also use train_mode=False, otherwise gradient is undefined.

    Example::

        with autograd.record():
            y = model(x)
            backward([y])
        metric.update(...)
        optim.step(...)

    Parameters
    ----------
    train_mode: bool, default True
        Whether the forward pass is in training or predicting mode. This controls the behavior
        of some layers such as Dropout, BatchNorm.
    """
    return _RecordingStateScope(True, train_mode)

  3. backward() performs the backward pass

backward(self, out_grad=None, retain_graph=False, train_mode=True)

out_grad is the head gradient: if y is not the final scalar loss, pass the gradient flowing in from the upstream layer as out_grad.
