The official documentation reads as follows:
tf.train.Optimizer.minimize(loss, global_step=None, var_list=None, gate_gradients=1, aggregation_method=None, colocate_gradients_with_ops=False, name=None, grad_loss=None)
Add operations to minimize loss by updating var_list.
This method simply combines calls compute_gradients() and apply_gradients(). If you want to process the gradient before applying them call compute_gradients() and apply_gradients() explicitly instead of using this function.
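To make that split concrete, here is a toy sketch in plain Python. It is not TensorFlow: ToyOptimizer, its finite-difference gradient, and the dict of variables are hypothetical stand-ins, chosen only to show how a one-call minimize() relates to calling compute_gradients() and apply_gradients() yourself with a processing step (here, clipping) in between.

```python
# Toy illustration only: ToyOptimizer is a hypothetical stand-in,
# not the TensorFlow Optimizer class.

class ToyOptimizer:
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def compute_gradients(self, loss_fn, variables):
        """Return (gradient, name) pairs via central finite differences."""
        eps = 1e-6
        pairs = []
        for name in variables:
            original = variables[name]
            variables[name] = original + eps
            up = loss_fn(variables)
            variables[name] = original - eps
            down = loss_fn(variables)
            variables[name] = original  # restore the value
            pairs.append(((up - down) / (2 * eps), name))
        return pairs

    def apply_gradients(self, grads_and_names, variables):
        """Take one gradient-descent step for each (gradient, name) pair."""
        for grad, name in grads_and_names:
            variables[name] -= self.learning_rate * grad

    def minimize(self, loss_fn, variables):
        """Simply chain the two steps, with no processing in between."""
        self.apply_gradients(self.compute_gradients(loss_fn, variables), variables)


def loss_fn(v):
    return (v["w"] - 3.0) ** 2


opt = ToyOptimizer(learning_rate=0.1)

# Path 1: the one-call form.
plain = {"w": 0.0}
opt.minimize(loss_fn, plain)

# Path 2: the explicit form, clipping each gradient to [-1, 1] before applying.
clipped = {"w": 0.0}
grads = opt.compute_gradients(loss_fn, clipped)
grads = [(max(-1.0, min(1.0, g)), name) for g, name in grads]
opt.apply_gradients(grads, clipped)

print(plain["w"], clipped["w"])
```

The unclipped step moves w much further than the clipped one, which is the kind of difference you can only introduce by using the two-step form.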
Args:
loss: A Tensor containing the value to minimize.
global_step: Optional Variable to increment by one after the variables have been updated.
var_list: Optional list of Variable objects to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
gate_gradients: How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
aggregation_method: Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
colocate_gradients_with_ops: If True, try colocating gradients with the corresponding op.
name: Optional name for the returned operation.
grad_loss: Optional. A Tensor holding the gradient computed for loss.
Returns:
An Operation that updates the variables in var_list. If global_step was not None, that operation also increments global_step.
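Three of these parameters deserve attention:
(1) loss: the quantity to minimize, usually the training objective, such as mean squared error or cross-entropy;
(2) global_step: incremented by 1 after each gradient-descent update; it generally records the number of optimization iterations and is mainly used when logging and saving parameters;
(3) var_list: the set of parameters to update on each iteration.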
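The roles of global_step and var_list can be mimicked in a few lines of plain Python. This is a rough sketch, not TensorFlow code: minimize_step, grad_fn, and the state dict are hypothetical names, and the default of "all variables" only stands in for the TRAINABLE_VARIABLES collection.

```python
# Toy illustration only: plain Python, not TensorFlow.

def minimize_step(grad_fn, variables, state, learning_rate, var_list=None):
    """One update: change only the names in var_list, then bump global_step."""
    # Assumption: "all variables" plays the role of the default collection.
    names = list(variables) if var_list is None else var_list
    grads = grad_fn(variables)
    for name in names:
        variables[name] -= learning_rate * grads[name]
    state["global_step"] += 1  # incremented once per update


def grad_fn(v):
    # Analytic gradient of loss = (w - 3)^2 + (b + 1)^2.
    return {"w": 2.0 * (v["w"] - 3.0), "b": 2.0 * (v["b"] + 1.0)}


variables = {"w": 0.0, "b": 0.0}
state = {"global_step": 0}

for _ in range(50):
    # Only w is listed, so b is never touched.
    minimize_step(grad_fn, variables, state, 0.1, var_list=["w"])
    if state["global_step"] % 10 == 0:
        # global_step as an iteration counter for periodic logging.
        print("step", state["global_step"], "w =", variables["w"])
```

Because var_list contains only w, b keeps its initial value while w approaches 3, and global_step ends up equal to the number of updates performed.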