Implementing gradient accumulation over multiple steps followed by a single update in TensorFlow

In TensorFlow 1.x, every call to sess.run(self.optimizer) computes the gradients and updates the variables in one shot. In PyTorch, by contrast, the process is split into three explicit steps:

  1. Zero the gradients: optimizer.zero_grad()
  2. Backpropagate to compute the gradient of every parameter: loss.backward()
  3. Take a gradient-descent step and update the parameters: optimizer.step()

If you want to compute gradients several times and then apply a single update (for example, to simulate a large batch size with mini-batches, or, as in long-sequence frame interpolation, to iterate over an entire sequence internally before propagating gradients once), you have to separate gradient computation from the parameter update. This is straightforward in PyTorch, as the sketch after this paragraph shows, but in TensorFlow 1.x you need to write the gradient-computation and parameter-update ops yourself.
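A minimal PyTorch sketch of this accumulation pattern, assuming a toy linear model and random data (the model, loss_fn, and accumulation_steps below are illustrative placeholders, not from the original post):

```
import torch
import torch.nn as nn

# Hypothetical model and optimizer, just to illustrate the accumulation pattern.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

accumulation_steps = 4   # number of mini-batches to accumulate before one update
optimizer.zero_grad()    # step 1: clear gradients once per accumulation cycle

# 8 random mini-batches of shape (4, 10) / (4, 1), purely for illustration
for i, (x, y) in enumerate(zip(torch.randn(8, 4, 10), torch.randn(8, 4, 1))):
    loss = loss_fn(model(x), y) / accumulation_steps  # scale so the sum equals the mean
    loss.backward()                                   # step 2: gradients add up in .grad
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()         # step 3: update parameters with the accumulated gradients
        optimizer.zero_grad()    # clear accumulators for the next cycle
```

Because loss.backward() adds into the existing .grad buffers instead of overwriting them, skipping zero_grad() between mini-batches is all that is needed to accumulate.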

```
import tensorflow as tf
import numpy as np
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

x_data = np.array(range(1, 20), dtype=np.float32)
num_dataset = len(x_data)
batch_size = 4
minibatch_size = 2
num_batches = num_dataset // batch_size
minibatches_per_batch = batch_size // minibatch_size

with tf.Graph().as_default():
    x = tf.placeholder(dtype='float32', shape=None)
    w = tf.Variable(initial_value=4., dtype='float32')
    loss = w * w * x

    # Optimizer definition - nothing different from any classical example
    opt = tf.train.GradientDescentOptimizer(0.1)

    # Retrieve all trainable variables you defined in your graph
    tvs = tf.trainable_variables()

    # Create one non-trainable accumulator per trainable variable,
    # with the same shape and initialized with zeros
    accum_vars = [tf.Variable(tf.zeros_like(tv.initialized_value()), trainable=False) for tv in tvs]
    zero_ops = [av.assign(tf.zeros_like(av)) for av in accum_vars]

    # Call compute_gradients to obtain the list of (gradient, variable) pairs
    gvs = opt.compute_gradients(loss, tvs)

    # Add each gradient to its accumulator
    # (works because accum_vars and gvs are in the same order)
    accum_ops = [accum_vars[i].assign_add(gv[0]) for i, gv in enumerate(gvs)]

    # Define the training step: apply the accumulated gradients to the variables
    train_step = opt.apply_gradients([(accum_vars[i], gv[1]) for i, gv in enumerate(gvs)])

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for batch_count in range(num_batches):
            # Before running each batch, clear the gradients accumulated from the previous batch
            sess.run(zero_ops)

            batch_data = x_data[batch_count * batch_size: (batch_count + 1) * batch_size]
            # Accumulate the gradients of several mini-batches in accum_vars using accum_ops
            for minibatch_count in range(minibatches_per_batch):
                minibatch_data = batch_data[minibatch_count * minibatch_size: (minibatch_count + 1) * minibatch_size]
                accum_array = sess.run(accum_ops, feed_dict={x: minibatch_data})
                print("[%d][%d]" % (batch_count, minibatch_count), accum_array)
                print(sess.run(tvs))
            # Run the train_step op to update the weights based on the accumulated gradients
            sess.run(train_step)
```
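One detail worth noting: accum_ops sums the mini-batch gradients, so train_step applies the summed gradient. If you would rather apply the mean gradient over the accumulated mini-batches, a possible variant (a sketch, not part of the original script) is to scale the accumulators when applying them:

```
# Sketch: apply the mean of the accumulated gradients instead of their sum.
# Define this op inside the same graph block, alongside train_step;
# minibatches_per_batch is the constant defined in the script above.
train_step_mean = opt.apply_gradients(
    [(accum_vars[i] / minibatches_per_batch, gv[1]) for i, gv in enumerate(gvs)])
```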
————————————————
Copyright notice: this article is an original post by the CSDN blogger "dekiang", licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/weixin_41560402/article/details/106930463
Below is example code implementing gradient descent with TensorFlow 2.0:

```
import tensorflow as tf

# Define the training data
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [2.0, 4.0, 6.0, 8.0]

# Define the model parameters
W = tf.Variable(1.0)
b = tf.Variable(1.0)

# Define the model
def model(x):
    return W * x + b

# Define the loss function
def loss(predicted_y, desired_y):
    return tf.reduce_mean(tf.square(predicted_y - desired_y))

# Define the training function
def train(x, y, learning_rate):
    with tf.GradientTape() as t:
        current_loss = loss(model(x), y)
    dW, db = t.gradient(current_loss, [W, b])
    W.assign_sub(learning_rate * dW)
    b.assign_sub(learning_rate * db)

# Train the model
for epoch in range(100):
    for x, y in zip(x_train, y_train):
        train(x, y, learning_rate=0.01)
    current_loss = loss(model(x_train), y_train)
    print(f"Epoch {epoch}: Loss: {current_loss.numpy()}")

# Test the model
x_test = [5.0, 6.0, 7.0, 8.0]
y_test = [10.0, 12.0, 14.0, 16.0]
predicted_y = model(x_test)
print(f"Predicted Y: {predicted_y.numpy()}")
print(f"Desired Y: {y_test}")
```

In this example, we use TensorFlow 2.0 to implement a simple linear regression model and train it with gradient descent. We first define the training data, then the model parameters W and b. Next we define the model function, which maps the input x to the output y, and the loss function, which compares the model's predictions with the true outputs and computes the mean squared difference between them. Finally, we define a training function that uses a gradient tape to automatically compute the gradients of W and b and updates them with the learning rate. During training, we repeatedly feed the training data to the training function until the specified number of epochs is reached. At the end, we use the trained model to predict on the test data and compare the predictions with the ground truth.
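The gradient-accumulation idea from the main example can also be written directly in TensorFlow 2.x with tf.GradientTape. The sketch below reuses W, b, model, loss, x_train, and y_train from the snippet above; accum_steps and the SGD optimizer are illustrative assumptions, not part of the original post:

```
# Sketch: accumulate gradients over several mini-batches, then apply one update.
# Assumes W, b, model, loss, x_train and y_train are defined as in the example above.
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
accum_steps = 2  # hypothetical number of mini-batches to accumulate

# One zero-initialized accumulator per trainable variable
variables = [W, b]
accum_grads = [tf.Variable(tf.zeros_like(v), trainable=False) for v in variables]

for step, (x, y) in enumerate(zip(x_train, y_train)):
    with tf.GradientTape() as tape:
        current_loss = loss(model(x), y)
    grads = tape.gradient(current_loss, variables)
    for acc, g in zip(accum_grads, grads):
        acc.assign_add(g)  # accumulate instead of applying immediately
    if (step + 1) % accum_steps == 0:
        # Apply the accumulated gradients in a single update, then reset
        optimizer.apply_gradients(
            [(acc.read_value(), v) for acc, v in zip(accum_grads, variables)])
        for acc in accum_grads:
            acc.assign(tf.zeros_like(acc))
```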