TensorFlow 2.0 Prerequisites, Part 3: Common Functions (2), with a detailed look at tf.GradientTape()

Article link: https://blog.csdn.net/anwarrior_qi/article/details/126388557

Contents

I. Common TensorFlow 2 Functions
Conclusion

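I. Common TensorFlow 2 Functions

1. tf.GradientTape() in detail

Description: "Tape" here means a recording tape. GradientTape records the operations executed in eager mode so that
the gradients of a model can be computed from them. Eager mode is the default in TensorFlow 2, so tf.GradientTape is
the officially recommended way to compute gradients. I introduce its usage from two angles: the general form and code.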

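(1) General form (the with block records the computation; gradient() differentiates a tensor):
	with tf.GradientTape() as tape:
	 	several computation steps
	grad = tape.gradient(function y, variable x to differentiate with respect to)
	# x should be created with tf.Variable(), because by default GradientTape only watches trainable variables created that way
	# If x = tf.constant(3.), call tape.watch(x) inside the with block so that x is tracked

(2) The code examples below make this concrete.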

Code 1:
	import tensorflow as tf
	x = tf.Variable(3.)
	with tf.GradientTape() as tape:
	    y = tf.pow(x, 4)		# y = x^4
	dy_dx = tape.gradient(y, x)	# differentiate y with respect to x: dy/dx = 4*x^3
	print(dy_dx)				# at x = 3.0 the gradient is 108
	del tape


Code 2:
	import tensorflow as tf
	x = tf.constant(3.)  # a tensor created with constant, not a Variable
	with tf.GradientTape() as tape:
	    tape.watch(x)     # a constant must be tracked explicitly with watch()
	    y = tf.pow(x, 4)		# y = x^4
	dy_dx = tape.gradient(y, x)	# differentiate y with respect to x: dy/dx = 4*x^3
	print(dy_dx)				# at x = 3.0 the gradient is 108
	del tape


#Output in both cases:
	tf.Tensor(108.0, shape=(), dtype=float32)
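
To see why tape.watch() matters, the following minimal sketch compares an unwatched constant with a watched one. The names unwatched and watched are chosen only for this illustration; the behaviour relied on is that tape.gradient() returns None for a source the tape did not track.

```python
import tensorflow as tf

x = tf.constant(3.0)

# Without watch(): the tape does not track a constant.
with tf.GradientTape() as tape:
    y = tf.pow(x, 4)
unwatched = tape.gradient(y, x)

# With watch(): the constant is tracked like a trainable variable.
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.pow(x, 4)
watched = tape.gradient(y, x)

print(unwatched)  # no gradient is available for the unwatched constant
print(watched)    # gradient of x^4 at x = 3
```

A None gradient in your own code is therefore usually a hint that the source tensor was never watched.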

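PS: By default, the resources held by a GradientTape are released as soon as gradient() is called, so calling it a
second time on the same tape fails. To compute several gradients from one tape, create it with persistent=True. For example: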

x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    y = tf.pow(x, 4)
    z = y * y
dz_dx = tape.gradient(z, x)  # z = y^2 = x^8, dz/dx = 8*x^7 = 8*3^7
dy_dx = tape.gradient(y, x)  # dy/dx = 4*x^3
del tape                     # release the persistent tape once it is no longer needed
print(dz_dx)
print(dy_dx)

#Output: tf.Tensor(17496.0, shape=(), dtype=float32)
#        tf.Tensor(108.0, shape=(), dtype=float32)

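2. enumerate()

 Description: enumerate() is a Python built-in function. It walks through an iterable such as a list, tuple, or array
 and yields (index, element) pairs; it is most often used in for loops. Usage: enumerate(list_name)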

seq = ['one','two','three']
for i,ele in enumerate(seq):
	print(i,ele)
	
#Output:
	0 one
	1 two
	2 three
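
Two details are easy to miss and can be sketched as follows: enumerate() accepts an optional start argument, and enumerating a dictionary yields its keys rather than its values. The dictionary scores and its contents are made up for this illustration.

```python
seq = ['one', 'two', 'three']

# Start counting from 1 instead of the default 0.
numbered = list(enumerate(seq, start=1))
print(numbered)

# Iterating over a dict yields its keys, so enumerate pairs an index with each key.
scores = {'a': 90, 'b': 80}  # hypothetical data
pairs = list(enumerate(scores))
print(pairs)
```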

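3. tf.one_hot()

 Description: one-hot encoding. In classification problems, labels are often one-hot encoded to mark the class: 1 means "yes", 0 means "no"
 Usage: tf.one_hot(data to convert, depth=number of classes)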

classes = 3
labels = tf.constant([1,0,2])
output = tf.one_hot(labels,depth=classes)
print(output)


#Output:
tf.Tensor(
[[0. 1. 0.]
[1. 0. 0.]
[0. 0. 1.]], shape=(3, 3), dtype=float32)

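4. tf.nn.softmax()

 Description: softmax is a very common and important function, used especially in multi-class classification. It maps
 its inputs to real numbers between 0 and 1 and normalizes them so that they sum to 1, which means the predicted class
 probabilities also sum to exactly 1. Once the n outputs of an n-class model pass through softmax(), they form a probability distribution.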

Code example:

y =  tf.constant([1.01,2.01,-0.66])
output = tf.nn.softmax(y)
print(output)


#Output:
	tf.Tensor([0.25598174 0.69583046 0.04818781], shape=(3,), dtype=float32)
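
As a sanity check, softmax can be reproduced by hand from its usual definition, softmax(y_i) = exp(y_i) / sum_j exp(y_j). The sketch below uses the same input as above; the names exp_y, manual and builtin are introduced only for this illustration.

```python
import tensorflow as tf

y = tf.constant([1.01, 2.01, -0.66])

# Exponentiate every element, then divide by the sum of the exponentials.
exp_y = tf.exp(y)
manual = exp_y / tf.reduce_sum(exp_y)

builtin = tf.nn.softmax(y)

print(manual)
print(builtin)
print(tf.reduce_sum(builtin))  # the probabilities add up to 1 (up to float rounding)
```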

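5. assign_sub() and assign_add()

 Description: assign_sub() subtracts a value from a variable in place, and assign_add() adds a value to it in place. Before calling assign_sub(), define w with tf.Variable() (such variables are trainable by default)
 w.assign_sub(amount to subtract from w)
 w.assign_add(amount to add to w)

Code example: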

w = tf.Variable(4)
w.assign_sub(1)
print(w)
w.assign_add(2)
print(w)

#Output:
	<tf.Variable 'Variable:0' shape=() dtype=int32, numpy=3>
	<tf.Variable 'Variable:0' shape=() dtype=int32, numpy=5>
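
tf.GradientTape() and assign_sub() are typically combined into a gradient-descent update. The sketch below is only an illustration under assumed values: the loss (w + 1)^2, the learning rate 0.2, the start value 5.0 and the 40 steps are all made up for this example. Note that w is a float here, unlike tf.Variable(4) above, because GradientTape works with floating-point values.

```python
import tensorflow as tf

w = tf.Variable(5.0)  # float variable; start value chosen for illustration
lr = 0.2              # assumed learning rate

for step in range(40):
    with tf.GradientTape() as tape:
        loss = tf.square(w + 1)    # loss = (w + 1)^2, minimised at w = -1
    grad = tape.gradient(loss, w)  # d(loss)/dw = 2 * (w + 1)
    w.assign_sub(lr * grad)        # w <- w - lr * grad

print(w.numpy())  # close to -1 after the loop
```

A fresh tape is created in every iteration, so persistent=True is not needed here.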

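6. tf.argmax()

 Description: returns the index of the largest value along the given axis of a tensor
 Usage: tf.argmax(tensor, axis=axis)  # axis=0 scans down each column, axis=1 scans across each row

Code example: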

import numpy as np
test = np.array([[1,2,3],[3,4,5],[6,7,8],[23,12,10]])
print(test)
print(tf.argmax(test,axis=0))# index of the maximum in each column
print(tf.argmax(test,axis=1))# index of the maximum in each row

#Output:
	[[ 1  2  3]
	 [ 3  4  5]
	 [ 6  7  8]
	 [23 12 10]]
	tf.Tensor([3 3 3], shape=(3,), dtype=int64)
	tf.Tensor([2 2 2 0], shape=(4,), dtype=int64)
#PS: indices are counted from 0
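
tf.argmax() is also the natural inverse of tf.one_hot() from section 3: taking the argmax along axis 1 of a one-hot matrix gives back the class labels. A minimal sketch, with the names encoded and recovered introduced only for this illustration:

```python
import tensorflow as tf

labels = tf.constant([1, 0, 2])

# Encode each label as a row with a single 1 at the label's position.
encoded = tf.one_hot(labels, depth=3)

# The position of the 1 in each row is the original label.
recovered = tf.argmax(encoded, axis=1)

print(encoded)
print(recovered)  # note: argmax returns int64 by default
```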

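7. tf.where(), tf.greater()

 Description:
 	tf.where(condition, A, B): element-wise, takes the element from A where the condition is true and from B where it is false
 	tf.greater(A, B): compares A and B element-wise and returns a bool tensor (True where A > B)

Code example: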

a = tf.constant([1,2,3,1,1])
b = tf.constant([0,1,3,4,5])
condition = tf.greater(a,b) # element-wise comparison; returns a bool tensor

c = tf.where(condition,a,b)# where a > b take the element of a, otherwise the element of b
print(condition)
print(c)

#Output:
#tf.Tensor([ True  True False False False], shape=(5,), dtype=bool)
#tf.Tensor([1 2 3 4 5], shape=(5,), dtype=int32)
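
Picking the larger of two elements is exactly what tf.maximum() does, so the where/greater combination above can be cross-checked against it. This is just a sketch to confirm the semantics; the names picked and largest are illustrative.

```python
import tensorflow as tf

a = tf.constant([1, 2, 3, 1, 1])
b = tf.constant([0, 1, 3, 4, 5])

# Element-wise selection driven by a boolean mask.
picked = tf.where(tf.greater(a, b), a, b)

# Built-in element-wise maximum for comparison.
largest = tf.maximum(a, b)

print(picked)
print(largest)
```

tf.where() remains the more general tool, since the condition and the two branches can be arbitrary tensors of compatible shape.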

8. np.random.RandomState.rand()

 Description:
 		Returns random numbers drawn uniformly from [0, 1).
 		np.random.RandomState.rand(dimensions): with no dimensions given, returns a scalar

Code example:

rdm = np.random.RandomState() 
a = rdm.rand()  # returns a random scalar
b = rdm.rand(2,3) # returns a 2-row, 3-column matrix of random numbers
print('a:',a)
print('b:',b)

#Output (values differ on every run, since no seed is set):
#a: 0.706104789102278
#b: [[0.19172613 0.10164097 0.0448493 ]
#   [0.29016672 0.83066028 0.68336689]]
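
When results need to be reproducible, RandomState can be given a seed. The sketch below assumes a seed of 1, chosen arbitrarily; two generators built from the same seed produce the same sequence.

```python
import numpy as np

# Two independent generators initialised with the same (arbitrary) seed.
rdm1 = np.random.RandomState(seed=1)
rdm2 = np.random.RandomState(seed=1)

a1 = rdm1.rand(2, 3)
a2 = rdm2.rand(2, 3)

print(np.array_equal(a1, a2))  # identical draws from identical seeds
print(a1.shape)
```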

9. np.vstack()

 Description: stacks two arrays vertically (row-wise)    np.vstack((array1, array2))

a = np.array([1,2,3])
b = np.array([3,4,5])
c = np.vstack((a,b))
print(c)


#Output:
#[[1 2 3]
# [3 4 5]]

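10. np.mgrid[]

 Description: np.mgrid[arg1 (start:stop:step), arg2 (start:stop:step), ...]
 		Each range is half-open: [start, stop)
 		The description is hard to follow in words; the code below makes it much clearer

Code example: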

x,y = np.mgrid[1:4:1,2:5:1]
print(x)
print('--------------------')
print(y)


#Output:
[[1 1 1]
 [2 2 2]
 [3 3 3]]
--------------------
[[2 3 4]
 [2 3 4]
 [2 3 4]]

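Explanation:
	1. The values in x change from row to row and come from the first argument: 1, 2, 3. The number of columns (not the
	   element values) comes from the second argument; each row's value is repeated across all columns, much as NumPy
	   broadcasting would fill it (think of the first column being copied across the whole matrix)
	2. The values in y change from column to column and come from the second argument: 2, 3, 4. The number of rows (not
	   the element values) comes from the first argument; each column's value is repeated down all rows in the same way
	(It is a bit of a tongue-twister; keep at it!)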
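
Another way to read the result: x[i, j] depends only on the row index i and y[i, j] only on the column index j. The sketch below checks this against np.meshgrid with indexing='ij', which should build the same grids from explicit 1-D ranges; xs, ys, mx and my are names introduced only for this illustration.

```python
import numpy as np

x, y = np.mgrid[1:4:1, 2:5:1]

# The same two ranges written as explicit 1-D arrays.
xs = np.arange(1, 4, 1)  # [1 2 3]
ys = np.arange(2, 5, 1)  # [2 3 4]

# 'ij' indexing puts the first range along rows and the second along columns.
mx, my = np.meshgrid(xs, ys, indexing='ij')

print(np.array_equal(x, mx), np.array_equal(y, my))
print(x[2, 0], y[2, 0])  # row 2 gives xs[2], column 0 gives ys[0]
```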

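11. x.ravel(), np.c_[]

 Description:
 	x.ravel() flattens x into a one-dimensional array: it "straightens out" the variable before the dot
 	np.c_[array1, array2, ...]: pairs up the elements at matching positions, one column per input array

Code example: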

x, y = np.mgrid[1:4:1, 3:6:1]  # builds the x and y shown in the comments below
#x = [[1 1 1]
#	[2 2 2]
#	[3 3 3]]

#y = [[3 4 5]
#	 [3 4 5]
#	 [3 4 5]]

a = x.ravel()
b = y.ravel()
print(a,b)

#Output:
#   [1 1 1 2 2 2 3 3 3] [3 4 5 3 4 5 3 4 5]

grid = np.c_[a,b]
print(grid)

#Output:
[[1 3]
 [1 4]
 [1 5]
 [2 3]
 [2 4]
 [2 5]
 [3 3]
 [3 4]
 [3 5]]
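
Put together, mgrid, ravel() and np.c_ turn two ranges into a list of (x, y) coordinate pairs, one row per grid point. The sketch below repeats that pipeline and compares it with np.column_stack, which should give the same result for 1-D inputs; the name same is introduced only for this illustration.

```python
import numpy as np

x, y = np.mgrid[1:4:1, 3:6:1]

# Flatten both grids and pair them up column-wise: one (x, y) point per row.
grid = np.c_[x.ravel(), y.ravel()]

# column_stack builds the same 2-column array from the flattened grids.
same = np.column_stack((x.ravel(), y.ravel()))

print(grid.shape)  # 3 * 3 grid points, 2 coordinates each
print(np.array_equal(grid, same))
```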

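Conclusion

This is the third article in my notes on learning TensorFlow 2; it mainly records a range of commonly used functions.
(Keep going; to be continued!)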