step(batch_size, ignore_stale_grad=False) method of mxnet.gluon.trainer.Trainer instance
Makes one step of parameter update. Should be called after
`autograd.compute_gradient` and outside of `record()` scope.
Parameters
----------
batch_size : int
Batch size of data processed. Gradient will be normalized by `1/batch_size`.
Set this to 1 if you normalized loss manually with `loss = mean(loss)`.
ignore_stale_grad : bool, optional, default=False
If true, ignores Parameters with stale gradient (gradient that has not
been updated by `backward` after last step) and skip update.
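The `batch_size` argument exists because gradients of a loss summed over a batch are `batch_size` times larger than gradients of the mean loss. A minimal sketch in plain Python (no MXNet needed; `sgd_step` is a hypothetical stand-in for the optimizer update, not the real Trainer internals) shows why the two conventions give the same update:

```python
# Sketch: why step(batch_size) normalizes gradients by 1/batch_size.
# `sgd_step` is a hypothetical plain-SGD update used only for illustration.

def sgd_step(param, grad, lr, batch_size):
    # Mirrors the normalization described in the docstring:
    # the gradient is divided by batch_size before the update.
    return param - lr * (grad / batch_size)

batch_size = 4
# Suppose each of the 4 samples contributes a per-sample gradient of 2.0.
per_sample_grads = [2.0] * batch_size

# Case 1: loss was summed over the batch -> grad is the sum; pass batch_size.
grad_sum = sum(per_sample_grads)                 # 8.0
p1 = sgd_step(1.0, grad_sum, lr=0.1, batch_size=batch_size)

# Case 2: loss was averaged manually (loss = mean(loss)) -> pass batch_size=1.
grad_mean = sum(per_sample_grads) / batch_size   # 2.0
p2 = sgd_step(1.0, grad_mean, lr=0.1, batch_size=1)

print(p1, p2)  # both 0.8: the two conventions produce the same update
```

This is why the docstring says to set `batch_size=1` when the loss has already been averaged manually.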
The above is the documentation printed by `help()`. It shows two parameters. The first is `batch_size`. The second, `ignore_stale_grad`, when true, makes `step()` ignore parameters with a stale gradient (that is, a gradient that has not been refreshed by backward propagation since the last step) and skip updating them. It defaults to false.
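The stale-gradient behavior can be sketched in plain Python (no MXNet; the `step` function and the dict-based parameter records here are hypothetical illustrations of the concept, not MXNet's actual implementation):

```python
# Sketch: the idea behind ignore_stale_grad.
# A parameter's gradient is "stale" if backward() has not refreshed it
# since the last step; with ignore_stale_grad=True such parameters are
# skipped instead of triggering an error.

def step(params, lr, ignore_stale_grad=False):
    for p in params:
        if not p["fresh"]:
            if ignore_stale_grad:
                continue        # skip the parameter with the stale gradient
            raise RuntimeError("stale gradient for %s" % p["name"])
        p["value"] -= lr * p["grad"]
        p["fresh"] = False      # stale again until the next backward()

params = [
    {"name": "w1", "value": 1.0, "grad": 2.0, "fresh": True},
    {"name": "w2", "value": 1.0, "grad": 9.9, "fresh": False},  # stale
]
step(params, lr=0.1, ignore_stale_grad=True)
print(params[0]["value"], params[1]["value"])  # 0.8 1.0 (w2 untouched)
```

With `ignore_stale_grad=False` (the default), the same call would complain about `w2` instead of silently skipping it.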