Automatic Differentiation

奋斗在阿尔卑斯的皮卡丘

Category: Deep Learning. Tags: deep learning, python, tensorflow

Article link: https://blog.csdn.net/qq_37289115/article/details/109206099

Automatic differentiation: GradientTape

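TensorFlow provides a class dedicated to differentiation, GradientTape. You can picture it as a tape that records the operations needed to compute gradients. With it you can watch variables and differentiate with respect to them automatically.

GradientTape implements the context manager protocol. It records the operations executed inside the with block, together with the trainable variables they access, onto the tape.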

with tf.GradientTape() as tape:
    function expression
grad = tape.gradient(function, variable)

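GradientTape() is the constructor of the GradientTape class.

First use it to create a tape object, tape, which also serves as the context manager. Then write the function expression or computation inside the with block so that the variables to differentiate are watched. Finally, call the tape's gradient() method to obtain the derivative.

The first argument of gradient() is the function to differentiate (the target). The second argument is the variable to differentiate with respect to (the source).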
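To make the "tape" metaphor concrete, here is a minimal sketch in plain Python of how a tape could record operations and replay them backwards with the chain rule. The names Value and ToyTape are hypothetical and exist only for this illustration; this is not how TensorFlow implements GradientTape internally.

```python
class Value:
    """A plain number wrapped in an object so the tape can track it by identity."""
    def __init__(self, data):
        self.data = float(data)


class ToyTape:
    """Toy stand-in for a gradient tape (hypothetical, not TensorFlow code)."""
    def __init__(self):
        # Each record: (output, [(input, local derivative d(output)/d(input)), ...])
        self.records = []

    def square(self, a):
        out = Value(a.data ** 2)
        self.records.append((out, [(a, 2.0 * a.data)]))
        return out

    def add(self, a, b):
        out = Value(a.data + b.data)
        self.records.append((out, [(a, 1.0), (b, 1.0)]))
        return out

    def gradient(self, target, source):
        # Replay the records backwards, applying the chain rule at each step.
        grads = {id(target): 1.0}
        for out, inputs in reversed(self.records):
            upstream = grads.get(id(out))
            if upstream is None:
                continue  # this operation does not influence the target
            for inp, local in inputs:
                grads[id(inp)] = grads.get(id(inp), 0.0) + upstream * local
        return grads.get(id(source))  # None if source never influenced target


x = Value(3.0)
tape = ToyTape()
y = tape.square(x)               # recorded on the tape
toy_dy_dx = tape.gradient(y, x)  # replayed backwards
unrelated = tape.gradient(y, Value(1.0))
print(y.data, toy_dy_dx, unrelated)
```

The last call returns None because the second Value never took part in the recorded computation, which mirrors how a real tape reports a gradient it cannot trace.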

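The function or computation is something we write ourselves. For example, suppose we want the derivative of $y=x^2$ at $x = 3$.

First create x as a tf.Variable, then create the tape object and put the expression $y=x^2$ inside its with block. The values of x and y and the computation between them are then recorded automatically. Calling the tape's gradient() method gives the derivative of y with respect to x, and finally we print y and the derivative. At $x = 3$ the value of $y=x^2$ is 9 and its derivative is 6.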

import tensorflow as tf

x = tf.Variable(3.)  # the variable to differentiate with respect to

with tf.GradientTape() as tape:
    y = tf.square(x)
dy_dx = tape.gradient(y, x)

print(y)
# tf.Tensor(9.0, shape=(), dtype=float32)
print(dy_dx)
# tf.Tensor(6.0, shape=(), dtype=float32)

The output matches the value we computed by hand.
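A derivative value like this can also be sanity-checked numerically without TensorFlow. The sketch below uses a central finite difference; the helper name central_difference and the step size h are choices made for this illustration only.

```python
def f(x):
    return x ** 2


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) from two nearby function values."""
    return (func(x + h) - func(x - h)) / (2 * h)


approx = central_difference(f, 3.0)
print(approx)  # a value very close to 6
```

A finite difference needs two function evaluations per variable and is only approximate, whereas the tape produces the derivative from the recorded operations.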

The GradientTape constructor takes two parameters.

GradientTape(persistent, watch_accessed_variables)

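The first parameter, persistent, defaults to False, meaning the tape can be used for only one gradient() call; the resources it holds are released after that call. If it is set to True, gradient() can be called multiple times.
For example, here we add a second function, $z=x^3$: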

x = tf.Variable(3.)

with tf.GradientTape() as tape:
    y = tf.square(x)
    z = pow(x, 3)
dy_dx = tape.gradient(y, x)
dz_dx = tape.gradient(z, x)  # second call on a non-persistent tape fails

print(y)
print(dy_dx)
print(z)
print(dz_dx)

Running this raises an error: a non-persistent tape can only be used to compute one set of gradients.
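The failure can be observed without crashing the script by catching the exception. This is a small sketch assuming TensorFlow 2.x in eager mode, where the second call raises a RuntimeError; the flag name second_call_failed is introduced only for this illustration.

```python
import tensorflow as tf

x = tf.Variable(3.)

with tf.GradientTape() as tape:  # persistent defaults to False
    y = tf.square(x)
    z = tf.pow(x, 3)

dy_dx = tape.gradient(y, x)  # first call succeeds

try:
    tape.gradient(z, x)      # second call on the same non-persistent tape
    second_call_failed = False
except RuntimeError as err:
    second_call_failed = True
    print("second gradient() call failed:", err)
```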
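To keep using the tape object, set persistent to True, which keeps the recorded operations available after the first gradient() call. Note that in this case you should delete the tape with del once you are finished with it so that its resources can be released.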

x = tf.Variable(3.)

with tf.GradientTape(persistent=True) as tape:
    y = tf.square(x)
    z = pow(x, 3)
dy_dx = tape.gradient(y, x)
dz_dx = tape.gradient(z, x)  # allowed because the tape is persistent

print(y)
print(dy_dx)
print(z)
print(dz_dx)

del tape

This is the output:

tf.Tensor(9.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)
tf.Tensor(27.0, shape=(), dtype=float32)
tf.Tensor(27.0, shape=(), dtype=float32)

The output includes not only y and the derivative of y with respect to x, but also z and the derivative of z with respect to x.

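The second parameter, watch_accessed_variables, controls whether the tape automatically watches every trainable variable (that is, every trainable Variable object) accessed inside the with block. It is a Boolean and defaults to True. If it is set to False, the variable x is no longer watched automatically.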

x = tf.Variable(3.)

with tf.GradientTape(watch_accessed_variables=False) as tape:
    y = tf.square(x)
dy_dx = tape.gradient(y, x)

print(y)
print(dy_dx)

tf.Tensor(9.0, shape=(), dtype=float32)
None

As you can see, the derivative is None.

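Adding a watch: watch()

In this situation you can call watch() to watch a variable manually. For example, here automatic watching is turned off and watch() is used to specify x as the variable to watch.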

x = tf.Variable(3.)
with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(x)
    y = tf.square(x)
dy_dx = tape.gradient(y, x)

print(y)
print(dy_dx)

This gives the same result as before:

tf.Tensor(9.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)
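Turning off automatic watching is mainly useful when only some variables should receive gradients. The sketch below adds a second variable, w, which is an assumption made for this illustration and not part of the example above. Only x is watched, so the gradient with respect to w comes back as None.

```python
import tensorflow as tf

x = tf.Variable(3.)
w = tf.Variable(2.)  # hypothetical extra variable, deliberately left unwatched

with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(x)             # watch x only
    y = w * tf.square(x)

dy_dx, dy_dw = tape.gradient(y, [x, w])

print(dy_dx)  # 2 * w * x
print(dy_dw)  # None, because w was never watched
```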

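Watching non-trainable tensors

By default GradientTape automatically watches all trainable variables. With watch() it can also watch objects that are not trainable variables. For example, here x is created with tf.constant(), so it is a Tensor object rather than a trainable variable.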

x = tf.constant(3.)

with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(x)
    y = tf.square(x)
    
dy_dx = tape.gradient(y, x)

print(y)
print(dy_dx)

Adding the line tape.watch(x) makes the tape watch x, and we get the same result:

tf.Tensor(9.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)
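For comparison, a tape with default settings does not watch a constant on its own, so the call to watch() is what makes the difference. A short sketch, assuming TensorFlow 2.x in eager mode:

```python
import tensorflow as tf

x = tf.constant(3.)

# Default tape: trainable variables are watched automatically, constants are not.
with tf.GradientTape() as tape:
    y = tf.square(x)
grad_unwatched = tape.gradient(y, x)

# Same computation, but x is watched explicitly.
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.square(x)
grad_watched = tape.gradient(y, x)

print(grad_unwatched)  # None
print(grad_watched)
```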

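Partial derivatives of multivariate functions

Machine learning often requires partial derivatives of multivariate functions.

tape.gradient(function, variables)

The second argument of gradient() specifies the variables to differentiate with respect to. It can be a single variable or several. To take partial derivatives with respect to several variables, put them all in one list.
For example, consider the two-variable function $f(x, y)=x^2+2y^2+1$, with variables x and y.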
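Worked out by hand, treating the other variable as a constant in each case, the partial derivatives are:

$$\frac{\partial f}{\partial x} = 2x, \qquad \frac{\partial f}{\partial y} = 4y$$

At the point $(x, y) = (3, 4)$ this gives $f = 9 + 32 + 1 = 42$, $\partial f/\partial x = 6$ and $\partial f/\partial y = 16$.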
The code below implements this example.

x = tf.Variable(3.)
y = tf.Variable(4.)

with tf.GradientTape() as tape:
    f = tf.square(x) + 2 * tf.square(y) + 1

df_dx, df_dy = tape.gradient(f, [x, y])

print(f)
print(df_dx)
print(df_dy)

This is the output:

tf.Tensor(42.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)
tf.Tensor(16.0, shape=(), dtype=float32)

You can also receive both partial derivatives in a single variable. The return value is then a list with two elements, each of which is a tensor.

x = tf.Variable(3.)
y = tf.Variable(4.)

with tf.GradientTape() as tape:
    f = tf.square(x) + 2 * tf.square(y) + 1

first_grads = tape.gradient(f, [x, y])

print(f)
print(first_grads)

[<tf.Tensor: id=27, shape=(), dtype=float32, numpy=6.0>, <tf.Tensor: id=32, shape=(), dtype=float32, numpy=16.0>]
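The sources do not have to be a list. gradient() accepts nested structures and returns the gradients in the same structure, so a dictionary keeps each result labeled. A sketch, where the keys "x" and "y" are labels chosen for this illustration:

```python
import tensorflow as tf

x = tf.Variable(3.)
y = tf.Variable(4.)

with tf.GradientTape() as tape:
    f = tf.square(x) + 2 * tf.square(y) + 1

# The result is a dict with the same keys as the sources.
grads = tape.gradient(f, {"x": x, "y": y})

print(grads["x"])
print(grads["y"])
```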

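Second-order derivatives

A gradient tape can also be used to compute higher-order derivatives.

The following example computes second-order derivatives, which requires nested with statements.

The inner with statement creates the tape object tape1, which is used to compute the first-order derivatives first_grads.
The outer with statement creates the tape object tape2, which treats the first-order derivatives as the function to differentiate and differentiates them once more to obtain the second-order derivatives.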

x = tf.Variable(3.)
y = tf.Variable(4.)

with tf.GradientTape(persistent=True) as tape2:
    with tf.GradientTape(persistent=True) as tape1:
        f = tf.square(x) + 2 * tf.square(y) + 1

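    first_grads = tape1.gradient(f, [x, y])  # inside tape2's block, so tape2 records it
# A list of targets is summed before differentiating, so this is the gradient of
# (df/dx + df/dy). It equals the pure second partials here because the mixed partials are zero.
second_grads = [tape2.gradient(first_grads, [x, y])]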

print(f)
print(first_grads)
print(second_grads)

del tape1
del tape2

In the output, the first-order partial derivatives are 6 and 16, and the second-order partial derivatives are 2 and 4.
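When the mixed partial derivatives matter, each first-order partial can be differentiated separately instead of passing the whole list as one target. The sketch below is one possible way to do that, assuming TensorFlow 2.x; the tape names outer and inner and the use of tf.UnconnectedGradients.ZERO (so that a missing dependency yields 0 rather than None) are choices made for this illustration.

```python
import tensorflow as tf

x = tf.Variable(3.)
y = tf.Variable(4.)

with tf.GradientTape(persistent=True) as outer:
    with tf.GradientTape() as inner:
        f = tf.square(x) + 2 * tf.square(y) + 1
    # Computed inside the outer block so the outer tape records it.
    df_dx, df_dy = inner.gradient(f, [x, y])

zero = tf.UnconnectedGradients.ZERO
# One gradient() call per first-order partial gives one row of second partials.
d2f_dx2, d2f_dxdy = outer.gradient(df_dx, [x, y], unconnected_gradients=zero)
d2f_dydx, d2f_dy2 = outer.gradient(df_dy, [x, y], unconnected_gradients=zero)
del outer  # release the persistent tape

print(d2f_dx2.numpy(), d2f_dxdy.numpy())
print(d2f_dydx.numpy(), d2f_dy2.numpy())
```

For this function the two rows are 2, 0 and 0, 4, which agrees with the values 2 and 4 on the diagonal.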

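Derivatives with respect to vectors

Machine learning often requires derivatives with respect to vectors or matrices. In this example x and y are both one-dimensional tensors of length 3. The procedure is unchanged, and each result is also a one-dimensional tensor of length 3.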

x = tf.Variable([1., 2., 3.])
y = tf.Variable([4., 5., 6.])

with tf.GradientTape() as tape:
    f = tf.square(x) + 2 * tf.square(y) + 1

df_dx, df_dy = tape.gradient(f, [x, y])

print(f)
print(df_dx)
print(df_dy)

tf.Tensor([34. 55. 82.], shape=(3,), dtype=float32)
tf.Tensor([2. 4. 6.], shape=(3,), dtype=float32)
tf.Tensor([16. 20. 24.], shape=(3,), dtype=float32)
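One detail worth knowing: when the target is not a scalar, gradient() returns the gradient of the sum of the target's elements. The sketch below checks this on a simpler target, $f = x^2$, chosen for this illustration, and uses the tape's jacobian() method to show the full matrix of element-wise derivatives for comparison.

```python
import tensorflow as tf

x = tf.Variable([1., 2., 3.])

with tf.GradientTape(persistent=True) as tape:
    f = tf.square(x)          # vector-valued target
    total = tf.reduce_sum(f)  # scalar: sum of the elements of f

grad_of_f = tape.gradient(f, x)        # gradient of the vector target
grad_of_sum = tape.gradient(total, x)  # gradient of the explicit sum
jac = tape.jacobian(f, x)              # jac[i, j] = d f[i] / d x[j]
del tape

print(grad_of_f.numpy())
print(grad_of_sum.numpy())
print(jac.numpy())
```

The two gradients agree, and each equals the column sums of the Jacobian. Here the Jacobian is diagonal because each element of f depends only on the matching element of x.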

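As these examples show, with TensorFlow's automatic differentiation we no longer need to write out derivative formulas ourselves when programming. We only tell the tape object which function to differentiate and with respect to which variables, and it computes the required derivatives for us.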
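As an illustration of why this matters in machine learning, the gradients from a tape can drive a simple gradient descent loop. The sketch below minimizes the two-variable function used earlier; the learning rate and the number of steps are arbitrary values assumed for this illustration.

```python
import tensorflow as tf

x = tf.Variable(3.)
y = tf.Variable(4.)
learning_rate = 0.1  # assumed value for this sketch

for step in range(50):
    # A fresh tape per step records the forward computation.
    with tf.GradientTape() as tape:
        f = tf.square(x) + 2 * tf.square(y) + 1
    df_dx, df_dy = tape.gradient(f, [x, y])
    # Move each variable a small step against its gradient.
    x.assign_sub(learning_rate * df_dx)
    y.assign_sub(learning_rate * df_dy)

print(x.numpy(), y.numpy(), f.numpy())
```

Both variables move toward 0, where the function reaches its minimum value of 1, and no derivative formula appears anywhere in the loop.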
