First, jotting down some open questions, to be resolved one by one:
1. As below, why does the loss have to be explicitly added to the summary? Without the tf.summary.scalar line, the generated graph contains no model/loss operation:
self.loss = tf.add_n(losses) / len(losses) # total loss
tf.summary.scalar("model/loss", self.loss)
self.summary_op = tf.summary.merge_all()
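The reason is that tf.summary.merge_all() does not inspect the loss tensor itself; it only merges summary ops that were previously registered in the graph's SUMMARIES collection, and tf.summary.scalar is the call that both creates the summary op and registers it. A pure-Python sketch of that collection pattern (hypothetical names, not the actual TF implementation):

```python
# Minimal sketch of TF1's summary-collection mechanism. The names
# SUMMARIES / summary_scalar / merge_all are stand-ins for the real
# tf.GraphKeys.SUMMARIES collection and tf.summary.* functions.

SUMMARIES = []  # stands in for the graph's SUMMARIES collection

def summary_scalar(name, value):
    """Like tf.summary.scalar: creates a summary op AND registers it."""
    op = (name, value)
    SUMMARIES.append(op)  # side effect: add to the collection
    return op

def merge_all():
    """Like tf.summary.merge_all: merges whatever is in the collection."""
    return list(SUMMARIES)

# Computing a loss alone registers nothing; only the explicit
# summary_scalar call puts "model/loss" into the collection.
loss = 0.5
summary_scalar("model/loss", loss)
merged = merge_all()
print(merged)  # [("model/loss", 0.5)]
```

So without the explicit scalar call, merge_all sees an empty collection and no model/loss node ever exists in the graph.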
2. tf.gradients() usage. From the docs:

    gradients() adds ops to the graph to output the partial derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs), where each tensor is the sum(dy/dx) for y in ys.

    Returns: A list of sum(dy/dx) for each x in xs.
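The key point in that quote is that for each x you get the SUM of the gradients over all ys, not one gradient per (y, x) pair. A quick numeric check of that semantics, using central differences instead of TensorFlow:

```python
# Numeric check (no TensorFlow needed) of the documented semantics:
# tf.gradients(ys, xs) returns, for each x, sum(dy/dx for y in ys).

def finite_diff(f, x, eps=1e-6):
    """Central-difference approximation of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# Two outputs of the same input: ys = [y1, y2]
y1 = lambda x: x ** 2  # dy1/dx = 2x
y2 = lambda x: 3 * x   # dy2/dx = 3

x0 = 2.0
# What tf.gradients([y1, y2], x) would return for x: one tensor holding
# the SUM of both per-output gradients, here 2*x0 + 3 = 7.
grad_sum = finite_diff(y1, x0) + finite_diff(y2, x0)
print(round(grad_sum, 3))  # 7.0
```

This is also why summing the losses first and calling tf.gradients on the total gives the same result as passing the list of losses directly.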
3. Summaries are only recorded on one GPU? In LM():

    if mode == "train":
        cur_grads = self._backward(loss, summaries=(i == hps.num_gpus - 1))
        tower_grads += [cur_grads]
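Presumably the intent is that every tower computes gradients, but only the last tower (i == hps.num_gpus - 1) emits summary ops, since per-tower summaries would be nearly identical and recording them once per step is enough. A hypothetical sketch of that loop (backward and the losses here are stand-ins, not the real LM code):

```python
# Hypothetical sketch of the per-tower loop above: gradients from every
# GPU, summaries only from the last one.

num_gpus = 4

def backward(loss, summaries=False):
    """Stand-in for self._backward: returns fake grads and whether
    summary ops were added for this tower."""
    grads = [loss * 0.1]  # fake gradient computation
    return grads, summaries

tower_grads = []
summary_towers = []
for i in range(num_gpus):
    loss = 1.0  # fake per-tower loss
    cur_grads, has_summary = backward(loss, summaries=(i == num_gpus - 1))
    tower_grads.append(cur_grads)
    if has_summary:
        summary_towers.append(i)

print(summary_towers)  # [3] -- only the last GPU records summaries
```

The gradients from all towers are still collected in tower_grads and averaged later; only the summary side effect is restricted to one device.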
4. `* batch_size` — it seems to work even without the code below.