References:
1. TensorFlow notes from Andrew Ng's course (Tensorflow-吴恩达老师课程笔记)
2. Implementing gradient descent in TensorFlow with tf.gradients (利用 tf.gradients 在 TensorFlow 中实现梯度下降)
Contents:
1. Theory
2. Regression on a dataset with TensorFlow's built-in optimizer
3. Implementing gradient descent in TensorFlow with tf.gradients
1. Theory
Momentum gradient descent
Reference: TensorFlow notes from Andrew Ng's course (reference 1 above)
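The referenced notes carry the full derivation; in brief, momentum gradient descent keeps an exponentially weighted moving average of past gradients, v = beta * v + (1 - beta) * dW, and updates the parameters along that average, W = W - alpha * v, which damps oscillations and speeds up convergence. A minimal NumPy sketch of one such update (the function and variable names here are illustrative, not taken from the notes):

import numpy as np

def momentum_step(W, v, dW, alpha=0.01, beta=0.9):
    # Exponentially weighted average of past gradients (beta ~ 0.9 is typical).
    v = beta * v + (1 - beta) * dW
    # Step along the smoothed gradient rather than the raw one.
    return W - alpha * v, v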
2. Regression on a dataset with TensorFlow's built-in optimizer
Original program from the Chinese edition of the official TensorFlow documentation (TensorFlow 官方文档中文版)
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np

# Generate phony data with NumPy: 100 points in total.
x_data = np.float32(np.random.rand(2, 100))  # random input
y_data = np.dot([0.100, 0.200], x_data) + 0.300

# Build a linear model.
b = tf.Variable(tf.zeros([1]))
W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
y = tf.matmul(W, x_data) + b

# Minimize the mean squared error.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Initialize the variables.
init = tf.global_variables_initializer()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the plane.
for step in range(0, 201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Best fit: W: [[0.100 0.200]], b: [0.300]
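For reference, optimizer.minimize(loss) above is shorthand for computing the gradients and applying them in a single op; with the graph already built, the equivalent two-step form is:

# Equivalent to train = optimizer.minimize(loss):
grads_and_vars = optimizer.compute_gradients(loss)  # list of (gradient, variable) pairs
train = optimizer.apply_gradients(grads_and_vars)   # applies var -= 0.5 * gradient

This is exactly the split that section 3 performs by hand with tf.gradients and assign.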
3. Implementing gradient descent in TensorFlow with tf.gradients
Reference: Implementing gradient descent in TensorFlow with tf.gradients (reference 2 above)
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np

# Generate phony data with NumPy: 100 points in total.
x_data = np.float32(np.random.rand(2, 100))  # random input
y_data = np.dot([0.100, 0.200], x_data) + 0.300

# Build a linear model.
b = tf.Variable(tf.zeros([1]))
W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
y = tf.matmul(W, x_data) + b

# Minimize the mean squared error.
loss = tf.reduce_mean(tf.square(y - y_data))
# This time, instead of the built-in optimizer...
# optimizer = tf.train.GradientDescentOptimizer(0.5)
# train = optimizer.minimize(loss)
# ...compute the gradients of the loss with respect to W and b directly.
grad_W, grad_b = tf.gradients(loss, [W, b])
learning_rate = 0.5

# Gradient step.
new_W = W.assign(W - learning_rate * grad_W)  # equivalent to W -= learning_rate * grad_W
new_b = b.assign(b - learning_rate * grad_b)  # equivalent to b -= learning_rate * grad_b

# Initialize the variables.
init = tf.global_variables_initializer()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the plane.
for step in range(0, 201):
    # Running the two assign ops applies one gradient step.
    _, _ = sess.run([new_W, new_b])
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Best fit: W: [[0.100 0.200]], b: [0.300]
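Note that new_W and new_b are two separate assign ops, so both must be fetched in sess.run for a complete update; if preferred, they can also be bundled into a single training op with tf.group (a minimal variation on the graph above):

train_step = tf.group(new_W, new_b)  # running train_step fires both assigns
sess.run(train_step)                 # one full gradient step for W and b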
tf.gradients
Official definition:
tf.gradients(ys, xs, grad_ys=None, name='gradients', stop_gradients=None)
Constructs symbolic derivatives of sum of ys w.r.t. x in xs.

ys and xs are each a Tensor or a list of tensors. grad_ys is a list of Tensor, holding the gradients received by the ys. The list must be the same length as ys.

gradients() adds ops to the graph to output the derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs) where each tensor is the sum(dy/dx) for y in ys.

grad_ys is a list of tensors of the same length as ys that holds the initial gradients for each y in ys. When grad_ys is None, we fill in a tensor of '1's of the shape of y for each y in ys. A user can provide their own initial grad_ys to compute the derivatives using a different initial gradient for each y (e.g., if one wanted to weight the gradient differently for each value in each y).

stop_gradients is a Tensor or a list of tensors to be considered constant with respect to all xs. These tensors will not be backpropagated through, as though they had been explicitly disconnected using stop_gradient. Among other things, this allows computation of partial derivatives as opposed to total derivatives.
Translation:
1. xs and ys can each be a single tensor or a list of tensors. tf.gradients(ys, xs) computes the derivative of ys (if ys is a list, the sum of all its elements) with respect to xs (if xs is a list, the derivative with respect to each element separately), and returns a list of the same length as xs.
For example, with ys=[y1,y2,y3] and xs=[x1,x2,x3,x4], tf.gradients(ys,xs)=[d(y1+y2+y3)/dx1, d(y1+y2+y3)/dx2, d(y1+y2+y3)/dx3, d(y1+y2+y3)/dx4]. For a concrete example, see lines 16-17 of the code below.
2. grad_ys is a list of weights for ys, of the same length as ys. With grad_ys=[g1,g2,g3], tf.gradients(ys,xs,grad_ys)=[d(g1*y1+g2*y2+g3*y3)/dx1, d(g1*y1+g2*y2+g3*y3)/dx2, d(g1*y1+g2*y2+g3*y3)/dx3, d(g1*y1+g2*y2+g3*y3)/dx4]. For a concrete example, see lines 19-21 of the code below.
3. stop_gradients makes the specified tensors be treated as constants, i.e., they are excluded from differentiation; the official documentation gives a concrete example, and a small sketch of it follows the code block below.
 1 import tensorflow as tf
 2 w1 = tf.Variable([[1,2]])
 3 w2 = tf.Variable([[3,4]])
 4 res = tf.matmul(w1, [[2],[1]])
 5
 6 # ys must depend on xs, otherwise an error is raised:
 7 # grads = tf.gradients(res,[w1,w2])
 8 # TypeError: Fetch argument None has invalid type <class 'NoneType'>
 9
10 # grads = tf.gradients(res,[w1])
11 # Result: [array([[2, 1]])]
12
13 res2a = tf.matmul(w1, [[2],[1]]) + tf.matmul(w2, [[3],[5]])
14 res2b = tf.matmul(w1, [[2],[4]]) + tf.matmul(w2, [[8],[6]])
15
16 # grads = tf.gradients([res2a,res2b],[w1,w2])
17 # Result: [array([[4, 5]]), array([[11, 11]])]
18
19 grad_ys = [tf.Variable([[1]]), tf.Variable([[2]])]
20 grads = tf.gradients([res2a,res2b], [w1,w2], grad_ys=grad_ys)
21 # Result: [array([[6, 9]]), array([[19, 17]])]
22
23 with tf.Session() as sess:
24     tf.global_variables_initializer().run()
25     re = sess.run(grads)
26     print(re)
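Finally, the stop_gradients sketch promised in point 3 above, adapted from the example in the official tf.gradients documentation:

import tensorflow as tf

a = tf.constant(0.)
b = 2 * a
# With stop_gradients, a and b are treated as constants:
# partial derivatives d(a+b)/da = 1, d(a+b)/db = 1.
g_partial = tf.gradients(a + b, [a, b], stop_gradients=[a, b])
# Without it, backprop also flows through b = 2 * a:
# total derivatives d(a+b)/da = 3, d(a+b)/db = 1.
g_total = tf.gradients(a + b, [a, b])

with tf.Session() as sess:
    print(sess.run(g_partial))  # [1.0, 1.0]
    print(sess.run(g_total))    # [3.0, 1.0]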