TensorFlow 2.x: Using tf.GradientTape() Together with Optimizers


一只工程狮


Category: TensorFlow

Article link: https://blog.csdn.net/qq_40913465/article/details/104627573



1. tf.GradientTape()
First, the parameters of tf.GradientTape():

tf.GradientTape(
persistent=False, watch_accessed_variables=True
)

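persistent: Boolean. Controls whether the newly created tape is persistent. The default is False, which means gradient() can be called only once on the tape. If set to True, gradient() can be called multiple times, and you should release the tape's resources yourself (for example with del tape) once you are done.
watch_accessed_variables: Boolean. Controls whether the tape automatically watches every trainable variable accessed inside its context. The default is True. If set to False, you must explicitly specify the variables you want to track with tape.watch().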

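By default, GradientTape only watches variables created with tf.Variable whose trainable attribute is True (the default). If the value you want to differentiate with respect to is a constant (tf.constant), call tape.watch(x) inside the tape context before computing the gradient.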
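The effect of watch_accessed_variables can be sketched as follows. This is a minimal illustration, assuming eager execution in TensorFlow 2.x; the function and the variable names x1 and x2 are chosen here only for the example.

```python
import tensorflow as tf

x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)

# With automatic watching disabled, only explicitly watched tensors are tracked.
with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(x1)  # x2 is deliberately left unwatched
    z = (x1 + 5) * (x2 ** 2)

grad_x1, grad_x2 = tape.gradient(z, [x1, x2])
print(grad_x1)  # gradient with respect to the watched variable x1
print(grad_x2)  # None, because x2 was never watched
```

Disabling automatic watching is mainly useful when only a few of the variables touched inside the context need gradients.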

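2. Optimizers

Optimizers will be familiar to most readers. In TensorFlow 2.x they live under tf.keras.optimizers. All of these optimizers share a few common methods for applying gradients; the most frequently used one is: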

apply_gradients(grads_and_vars,name=None)

Purpose: applies the computed gradients to the variables.
Parameters:
- grads_and_vars: a list of (gradient, variable) pairs.
- name: name of the operation.
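When several variables are updated at once, the (gradient, variable) pairs are commonly built with zip. The sketch below assumes plain SGD with a learning rate of 0.1 and a function chosen only for illustration; a single step then moves each variable by 0.1 times its gradient.

```python
import tensorflow as tf

x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)
variables = [x1, x2]
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    z = (x1 + 5) * (x2 ** 2)

# gradient() returns one gradient per variable, in the same order.
grads = tape.gradient(z, variables)

# Pair each gradient with its variable and apply one SGD step.
optimizer.apply_gradients(zip(grads, variables))
print(x1.numpy(), x2.numpy())
```

With gradients 9 and 42, the step should leave x1 near 2 - 0.9 = 1.1 and x2 near 3 - 4.2 = -1.2.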

Code demonstration:

import tensorflow as tf
from tensorflow import keras

# Define the function to differentiate
def g(x1, x2):
    return (x1 + 5) * (x2 ** 2)

x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)

# Record the forward computation on the tape
with tf.GradientTape() as tape:
    z = g(x1, x2)  # the value to be differentiated

dz_x1 = tape.gradient(z, x1)
print(dz_x1)
print("\n")

# By default a GradientTape can be used only once;
# calling gradient() a second time raises a RuntimeError
try:
    dz_x2 = tape.gradient(z, x2)
except RuntimeError as ex:
    print(ex)

tf.Tensor(9.0, shape=(), dtype=float32)

GradientTape.gradient can only be called once on non-persistent tapes.

# With persistent=True, gradient() can be called multiple times,
# but you must release the tape yourself at the end

with tf.GradientTape(persistent=True) as tape:
    z = g(x1, x2)

dz_x1 = tape.gradient(z, x1)
dz_x2 = tape.gradient(z, x2)
print(dz_x1, dz_x2)

del tape  # release the tape's resources

tf.Tensor(9.0, shape=(), dtype=float32) tf.Tensor(42.0, shape=(), dtype=float32)
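These two values can be checked by hand. For g(x1, x2) = (x1 + 5) * x2^2 evaluated at x1 = 2, x2 = 3:

```latex
\frac{\partial z}{\partial x_1} = x_2^2 = 3^2 = 9,
\qquad
\frac{\partial z}{\partial x_2} = 2\,(x_1 + 5)\,x_2 = 2 \cdot 7 \cdot 3 = 42
```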

# Compute the partial derivatives with respect to x1 and x2 in one call

with tf.GradientTape(persistent=True) as tape:
    z = g(x1, x2)

dz_x1x2 = tape.gradient(z, [x1, x2])

print(dz_x1x2)

[<tf.Tensor: id=61, shape=(), dtype=float32, numpy=9.0>, <tf.Tensor: id=67, shape=(), dtype=float32, numpy=42.0>]

'''
By default, GradientTape only watches variables created with tf.Variable
whose trainable attribute is True (the default).
If x is a constant, you must call tape.watch(x) to compute its gradient.
'''

# Without watch
x1 = tf.constant(2.0)
x2 = tf.constant(3.0)
with tf.GradientTape() as tape:
    z = g(x1, x2)

dz_x1x2 = tape.gradient(z, [x1, x2])

print(dz_x1x2)
print("\n")

# With watch
x1 = tf.constant(2.0)
x2 = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x1)
    tape.watch(x2)
    z = g(x1, x2)

dz_x1x2 = tape.gradient(z, [x1, x2])

print(dz_x1x2)

del tape

[None, None]

[<tf.Tensor: id=83, shape=(), dtype=float32, numpy=9.0>, <tf.Tensor: id=89, shape=(), dtype=float32, numpy=42.0>]

# Differentiate two functions at once
x = tf.Variable(5.0)

with tf.GradientTape() as tape:
    z1 = 3 * x
    z2 = x ** 2

tape.gradient([z1, z2], x)  # the result is the sum of the two derivatives

<tf.Tensor: id=112, shape=(), dtype=float32, numpy=13.0>
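The value 13 is the sum 3 + 2x = 3 + 10 at x = 5. If the individual derivatives are needed instead of their sum, one option is a persistent tape queried once per target. A minimal sketch, with the names dz1_dx, dz2_dx, and total introduced here only for illustration:

```python
import tensorflow as tf

x = tf.Variable(5.0)

# A persistent tape allows several gradient() calls on the same recording.
with tf.GradientTape(persistent=True) as tape:
    z1 = 3 * x
    z2 = x ** 2

dz1_dx = tape.gradient(z1, x)       # derivative of 3x
dz2_dx = tape.gradient(z2, x)       # derivative of x^2
total = tape.gradient([z1, z2], x)  # sum of the two derivatives
print(dz1_dx, dz2_dx, total)

del tape  # release the tape's resources
```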

# Second-order derivatives: nest two GradientTapes
x1 = tf.Variable(2.0)
x2 = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as out_tape:
    with tf.GradientTape(persistent=True) as inner_tape:
        z = g(x1, x2)
    inner_grads = inner_tape.gradient(z, [x1, x2])

# This computes each second derivative separately:
# dz_x1x1, dz_x1x2, dz_x2x1, dz_x2x2
out_grads_1 = [out_tape.gradient(inner_grad, [x1, x2]) for inner_grad in inner_grads]

# This differentiates the sum of the first derivatives, giving
# dz_x1x1 + dz_x2x1 and dz_x1x2 + dz_x2x2
out_grads_2 = out_tape.gradient(inner_grads, [x1, x2])

print(out_grads_1)
print("\n")
print(out_grads_2)

del out_tape
del inner_tape

[[None, <tf.Tensor: id=149, shape=(), dtype=float32, numpy=6.0>], [<tf.Tensor: id=160, shape=(), dtype=float32, numpy=6.0>, <tf.Tensor: id=158, shape=(), dtype=float32, numpy=14.0>]]

[<tf.Tensor: id=179, shape=(), dtype=float32, numpy=6.0>, <tf.Tensor: id=180, shape=(), dtype=float32, numpy=20.0>]
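The printed values agree with the analytical second derivatives of g(x1, x2) = (x1 + 5) * x2^2 at x1 = 2, x2 = 3. The None entry corresponds to the second derivative with respect to x1 twice: the first derivative x2^2 does not depend on x1, so the tape reports None where the mathematical value is 0.

```latex
\frac{\partial^2 z}{\partial x_1^2} = 0,
\qquad
\frac{\partial^2 z}{\partial x_1 \partial x_2}
  = \frac{\partial^2 z}{\partial x_2 \partial x_1} = 2 x_2 = 6,
\qquad
\frac{\partial^2 z}{\partial x_2^2} = 2\,(x_1 + 5) = 14

\frac{\partial}{\partial x_1}\!\left(\frac{\partial z}{\partial x_1}
  + \frac{\partial z}{\partial x_2}\right) = 0 + 6 = 6,
\qquad
\frac{\partial}{\partial x_2}\!\left(\frac{\partial z}{\partial x_1}
  + \frac{\partial z}{\partial x_2}\right) = 6 + 14 = 20
```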

# Gradient descent by hand: x = x - learning_rate * dz_x
x = tf.Variable(0.0)
learning_rate = 0.1

for _ in range(100):
    with tf.GradientTape() as tape:
        z = 3. * x ** 2 + 2. * x - 1
    dz_x = tape.gradient(z, x)
    x.assign_sub(learning_rate * dz_x)  # in-place update of x
print(x)

<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=-0.3333333>
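The result matches the analytical minimum of z = 3x^2 + 2x - 1. With a learning rate of 0.1, each update shrinks the distance to the minimum by a factor of 0.4, which is why 100 iterations are more than enough:

```latex
\frac{dz}{dx} = 6x + 2 = 0 \;\Longrightarrow\; x^{*} = -\tfrac{1}{3}

x_{k+1} = x_k - 0.1\,(6 x_k + 2) = 0.4\,x_k - 0.2
\;\Longrightarrow\;
x_{k+1} + \tfrac{1}{3} = 0.4\left(x_k + \tfrac{1}{3}\right)
```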

# Gradient descent using an optimizer

x = tf.Variable(0.0)
learning_rate = 0.1

# Choose the optimizer (the lr argument is deprecated; use learning_rate)
optimizer = keras.optimizers.SGD(learning_rate=learning_rate)

for _ in range(100):
    with tf.GradientTape() as tape:
        z = 3. * x ** 2 + 2. * x - 1
    dz_x = tape.gradient(z, x)
    optimizer.apply_gradients([(dz_x, x)])  # note: a list of (gradient, variable) tuples

print(x)

<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=-0.3333333>
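The same pattern extends to several parameters. As a rough sketch, the loop below fits a line to synthetic data; the data, the names w and b, the learning rate, and the step count are all assumptions made for this illustration and are not part of the original example.

```python
import tensorflow as tf

# Synthetic data generated from y = 2x + 1 (an assumption for this sketch).
xs = tf.linspace(-1.0, 1.0, 21)
ys = 2.0 * xs + 1.0

w = tf.Variable(0.0)
b = tf.Variable(0.0)
params = [w, b]
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

for _ in range(200):
    with tf.GradientTape() as tape:
        # Mean squared error between the prediction and the target.
        loss = tf.reduce_mean((w * xs + b - ys) ** 2)
    grads = tape.gradient(loss, params)
    optimizer.apply_gradients(zip(grads, params))

print(w.numpy(), b.numpy())
```

After training, w and b should be close to the slope 2 and intercept 1 used to generate the data.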
