TensorFlow 2 Basics

1. Official Documentation

https://www.tensorflow.org/versions

2. Creating a tf.Tensor

In TensorFlow, a tensor is a multi-dimensional array; you can think of it as a data structure. It can represent many kinds of data, such as scalars, vectors, and matrices. All data in TensorFlow is passed around and processed in the form of tensors.

A tensor has the following important attributes:

1. Rank (or dim): the number of dimensions, i.e. how many axes the tensor has. For example, a scalar has rank 0, a vector rank 1, and a matrix rank 2.

2. Shape: the size of each axis, expressed as a tuple. For example, a tensor of shape (3, 4) is a matrix with 3 rows and 4 columns.

3. Data type: the type of the data the tensor holds, such as float32 or int32.

a = tf.constant([3, 4])
type(a)
# tensorflow.python.framework.ops.EagerTensor

a.device
# '/job:localhost/replica:0/task:0/device:CPU:0'


a.dtype
# tf.int32

a.numpy()
# array([3, 4])

tf.is_tensor(a)
# True

a.ndim
# 1
tf.rank(a)
# <tf.Tensor: shape=(), dtype=int32, numpy=1>

a.shape
# TensorShape([2])
tf.shape(a)
# <tf.Tensor: shape=(1,), dtype=int32, numpy=array([2])>



TensorFlow supports eager execution and graph execution. In eager execution, operations are evaluated immediately. In graph execution, a computational graph is constructed for later evaluation. TensorFlow defaults to eager execution.

import tensorflow as tf
import numpy as np

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'


a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
c = tf.matmul(a, b)
print(type(c))
# <class 'tensorflow.python.framework.ops.EagerTensor'>
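
As a minimal sketch (not part of the original notes), the same computation can be run in graph mode by wrapping it in tf.function; the wrapper traces the Python function into a computational graph on the first call and reuses that graph on later calls.

@tf.function
def matmul_fn(x, y):
    # traced into a graph on the first call, then executed as a graph
    return tf.matmul(x, y)

c_graph = matmul_fn(a, b)
print(c_graph.numpy())
# [[1. 3.]
#  [3. 7.]]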

2.1 tf.constant

tf.constant(1)
# <tf.Tensor: shape=(), dtype=int32, numpy=1>

tf.constant(1.)
# <tf.Tensor: shape=(), dtype=float32, numpy=1.0>

tf.constant(2.2, dtype=tf.double)
# <tf.Tensor: shape=(), dtype=float64, numpy=2.2>

tf.constant(True)
# <tf.Tensor: shape=(), dtype=bool, numpy=True>

tf.constant('hello world')
# <tf.Tensor: shape=(), dtype=string, numpy=b'hello world'>

tf.constant([[1, 2], [3, 4]])
# <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
# array([[1, 2],
#        [3, 4]])>

2.2 tf.convert_to_tensor: create a Tensor from a NumPy array or a list

tf.convert_to_tensor(np.ones([2, 3]))

tf.convert_to_tensor(np.zeros([2, 3]))

tf.convert_to_tensor([1, 2])

tf.convert_to_tensor([1, 2.])

tf.convert_to_tensor([1, 2], dtype=tf.float32)

2.3 tf.cast: convert a tensor's data type
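
A minimal sketch of tf.cast (my own example, not from the original notes):

a = tf.constant([1, 2], dtype=tf.int32)
tf.cast(a, dtype=tf.float32)
# <tf.Tensor: shape=(2,), dtype=float32, numpy=array([1., 2.], dtype=float32)>

tf.cast(tf.constant([0, 1, 2]), dtype=tf.bool)
# <tf.Tensor: shape=(3,), dtype=bool, numpy=array([False,  True,  True])>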

2.4 tf.zeros

tf.zeros([])  # scalar (a single value)

tf.zeros([1])

tf.zeros([2, 2])

tf.zeros([2, 3, 3])

2.5 tf.zeros_like

a = tf.zeros([3, 4])
tf.zeros_like(a)

tf.zeros(a.shape)

2.6 tf.ones and tf.ones_like

tf.ones([])

tf.ones([1])

tf.ones([2, 3])

a = tf.zeros([3, 4])
tf.ones_like(a)

2.7 tf.fill

tf.fill([2, 3], 1)

tf.fill([2, 3], 1.)

tf.fill([2, 3], 9)

2.8 tf.random.normal and tf.random.truncated_normal

Both tf.random.normal and tf.random.truncated_normal are TensorFlow functions for generating normally distributed random numbers.

tf.random.normal([2, 2], mean=10, stddev=1)

tf.random.normal([2, 2])

tf.random.truncated_normal([2, 2], mean=1, stddev=1)

tf.random.truncated_normal differs from tf.random.normal in that the generated values are truncated to within two standard deviations of the mean; the samples therefore never stray far from the mean and contain no extreme values.

tf.random.truncated_normal: Outputs random values from a truncated normal distribution. The values are drawn from a normal distribution with specified mean and standard deviation, discarding and re-drawing any samples that are more than two standard deviations from the mean.
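
A quick sanity check (my own sketch) that the truncation really holds:

x = tf.random.truncated_normal([10000], mean=0., stddev=1.)
tf.reduce_max(tf.abs(x))  # never exceeds 2.0, since samples beyond 2 stddev are re-drawn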

2.9 tf.random.uniform

tf.random.uniform([2, 2], minval=0, maxval=10)

a = tf.random.normal([10, 28])
b = tf.random.uniform([10], maxval=10, dtype=tf.int32)
# b
# <tf.Tensor: shape=(10,), dtype=int32, numpy=array([9, 0, 7, 4, 3, 1, 9, 3, 7, 8])>

idx = tf.range(10)
idx = tf.random.shuffle(idx)
# idx
# <tf.Tensor: shape=(10,), dtype=int32, numpy=array([7, 1, 3, 4, 8, 5, 2, 9, 0, 6])>

a = tf.gather(a, idx)
b = tf.gather(b, idx)
# b
# <tf.Tensor: shape=(10,), dtype=int32, numpy=array([3, 0, 4, 3, 7, 1, 7, 8, 9, 9])>

2.10 Typical Dim Data (summary of use cases)

Dim            Example
0 (scalar)     loss or accuracy: []
1 (vector)     bias: [d]
2 (matrix)     weight: [input_dim, output_dim]
3              sentence: [b, seq_len, word_dim]
4              image: [b, h, w, c]
5              meta-learning: [task_b, b, h, w, c]

3. Indexing and Slicing

3.1 [idx][idx][idx]

a = tf.constant(range(24), shape=[2, 3, 4])
a

a[0]

a[0][1]

a[0][0][3]

3.2 [idx, idx, idx, ...]

a = tf.constant(range(24), shape=[2, 3, 4])
a

a[0, 1]

a[0, 0, 3]

3.3 Single colon & double colon slicing

start:end

a = tf.range(10)
a
# <tf.Tensor: shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])>

a[:4]
# <tf.Tensor: shape=(4,), dtype=int32, numpy=array([0, 1, 2, 3])>

a[4:8]
# <tf.Tensor: shape=(4,), dtype=int32, numpy=array([4, 5, 6, 7])>

a[4:]
# <tf.Tensor: shape=(6,), dtype=int32, numpy=array([4, 5, 6, 7, 8, 9])>

a[-2:]
# <tf.Tensor: shape=(2,), dtype=int32, numpy=array([8, 9])>

a[:-4]
# <tf.Tensor: shape=(6,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5])>
a = tf.ones([4, 28, 28, 3])

a[:, :14, 14:, :].shape
# TensorShape([4, 14, 14, 3])

a[:, 14:, 14:, :].shape
# TensorShape([4, 14, 14, 3])


a = tf.constant(range(240), shape=[4, 10, 3, 2])

a[0, 1, :, :].shape
# TensorShape([3, 2])

a[:, :, :, 1].shape
# TensorShape([4, 10, 3])

a[:, 3, :, 1].shape
# TensorShape([4, 3])

a[0, 3:10, :, :].shape
# TensorShape([7, 3, 2])

start:end:step

a = tf.ones([4, 28, 28, 3])

a[:, 0:28:2, 0:28:2, :].shape
# TensorShape([4, 14, 14, 3])

a[:, ::2, ::2, :].shape
# TensorShape([4, 14, 14, 3])

Reverse order: ::-1

a = tf.range(15)
a
# <tf.Tensor: shape=(15,), dtype=int32, numpy=array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])>

a[::-2]
# <tf.Tensor: shape=(8,), dtype=int32, numpy=array([14, 12, 10,  8,  6,  4,  2,  0])>

a[8:2:-2]
# <tf.Tensor: shape=(3,), dtype=int32, numpy=array([8, 6, 4])>

...

a = tf.ones([2, 4, 28, 28, 3])

a[0, ...].shape
# TensorShape([4, 28, 28, 3])

a[..., 1].shape
# TensorShape([2, 4, 28, 28])

a[0, ..., 1].shape
# TensorShape([4, 28, 28])

3.4 selective indexing

tf.gather

tf.gather collects, along a given dimension of the input tensor, the elements selected by the given indices and returns them as a new tensor. Its main arguments are:

  • params: the source tensor
  • axis: the dimension to gather along
  • indices: the indices of the elements to collect
a = tf.random.truncated_normal([4, 35, 8], mean=80, stddev=10)

tf.gather(a, axis=0, indices=[2, 3]).shape
# TensorShape([2, 35, 8])

tf.gather(a, axis=0, indices=[3, 0, 2, 1]).shape
# TensorShape([4, 35, 8])

tf.gather(a, axis=1, indices=[3, 5, 2, 6, 7, 0]).shape
# TensorShape([4, 6, 8])

tf.gather(a, axis=2, indices=[2, 3, 5]).shape
# TensorShape([4, 35, 3])


aa = tf.gather(a, axis=1, indices=[2, 4, 6])
aaa = tf.gather(aa, axis=2, indices=[1])
aaa.shape
# TensorShape([4, 3, 1])

tf.gather_nd

tf.gather_nd collects elements (or slices) from the input tensor using multi-dimensional indices and returns them as a new tensor. Its main arguments are:

  • params: the source tensor
  • indices: the (possibly multi-dimensional) indices of the elements to collect
a = tf.random.truncated_normal([4, 35, 8], mean=80, stddev=10)

tf.gather_nd(a, [[0, 1, 1], [0, 2, 2], [0, 3, 3]]).shape
# TensorShape([3])

tf.gather_nd(a, [[0, 0], [1, 1], [2, 2]]).shape
# TensorShape([3, 8])

tf.boolean_mask

a = tf.random.truncated_normal([4, 2, 3], mean=80, stddev=10)

tf.boolean_mask(a, mask=[True, True, False, False]).shape
# TensorShape([2, 2, 3])

tf.boolean_mask(a, mask=[True, False], axis=1).shape
# TensorShape([4, 1, 3])
a = tf.constant(range(24), shape=[4, 2, 3])
a
tf.boolean_mask(a, mask=[[True, True], [True, False], [False, False], [False, True]])

4. Dimension Transformations

4.1 tf.reshape (View)

a = tf.random.normal([4, 28, 28, 3])

a.shape  # TensorShape([4, 28, 28, 3])
a.ndim  # 4

tf.reshape(a, [4, 784, 3]).shape
# TensorShape([4, 784, 3])

tf.reshape(a, [4, -1, 3]).shape
# TensorShape([4, 784, 3])

tf.reshape(a, [4, 2, 14, 28, 3]).shape
# TensorShape([4, 2, 14, 28, 3])

tf.reshape(a, [4, -1]).shape
# TensorShape([4, 2352])

tf.reshape(tf.reshape(a, [4, -1]), [4, 14, 56, 3]).shape
# TensorShape([4, 14, 56, 3])

4.2 tf.transpose (Content)

a = tf.random.normal([1, 2, 3, 4])

a.shape
# TensorShape([1, 2, 3, 4])

tf.transpose(a).shape
# TensorShape([4, 3, 2, 1])

tf.transpose(a, perm=[0, 2, 1, 3]).shape
# TensorShape([1, 3, 2, 4])
# b h w c
a = tf.random.normal([4, 28, 28, 3])
a.shape
# TensorShape([4, 28, 28, 3])

# b c h w
tf.transpose(a, [0, 3, 1, 2]).shape
# TensorShape([4, 3, 28, 28])

4.3 expand/squeeze dims

tf.expand_dims

When axis is non-negative, the new dimension is inserted in front of that position: axis=0 adds a dimension before axis 0, and for the rank-3 tensor below axis=3 inserts the new dimension after axis 2 (making it the new last axis). When axis is negative, the position is counted from the back, so axis=-1 appends a dimension at the end.

a = tf.random.normal([4, 35, 8])

tf.expand_dims(a, axis=0).shape
# TensorShape([1, 4, 35, 8])

tf.expand_dims(a, axis=1).shape
# TensorShape([4, 1, 35, 8])

tf.expand_dims(a, axis=3).shape
# TensorShape([4, 35, 8, 1])

tf.expand_dims(a, axis=-1).shape
# TensorShape([4, 35, 8, 1])

tf.expand_dims(a, axis=-4).shape
# TensorShape([1, 4, 35, 8])

Note: squeeze can only remove dimensions of size 1.

tf.squeeze

a = tf.zeros([1, 2, 1, 3])

tf.squeeze(a).shape
# TensorShape([2, 3])

tf.squeeze(a, axis=0).shape
# TensorShape([2, 1, 3])

tf.squeeze(a, axis=2).shape
# TensorShape([1, 2, 3])

tf.squeeze(a, axis=-2).shape
# TensorShape([1, 2, 3])

tf.squeeze(a, axis=-4).shape
# TensorShape([2, 1, 3])

4.4 broadcast

Broadcasting in TensorFlow automatically expands tensor shapes, without explicitly copying data, so that element-wise operations can be applied to tensors of different shapes. Conceptually it works in three steps:

1. Align ranks: if the two tensors have different numbers of dimensions, new leading dimensions are inserted into the lower-rank tensor until the ranks match (the effect of tf.expand_dims).

2. Expand size-1 dimensions: wherever the shapes still differ, a dimension of size 1 is (logically) replicated to the size of the other tensor (the effect of tf.broadcast_to).

3. Compute: the element-wise operation is applied to the aligned, expanded tensors.

Summary: align dimensions from the right -> when the ranks differ, insert new leading dimensions of size 1 -> expand every size-1 dimension to the target size.

Broadcasting is an optimization: when two tensors in an operation have different but compatible shapes, TensorFlow expands them to a common shape automatically so the operation can proceed, without materializing the copies.

x = tf.random.normal([4, 32, 32, 3])

(x + tf.random.normal([3])).shape
# TensorShape([4, 32, 32, 3])

(x + tf.random.normal([32, 1])).shape
# TensorShape([4, 32, 32, 3])

(x + tf.random.normal([4, 1, 1, 1])).shape
# TensorShape([4, 32, 32, 3])

# (x + tf.random.normal([1, 4, 1, 1])).shape
# InvalidArgumentError: Incompatible shapes: [4,32,32,3] vs. [1,4,1,1] [Op:AddV2]

Explicitly calling tf.broadcast_to

x = tf.random.normal([4, 32, 32, 3])
b = tf.broadcast_to(tf.random.normal([4, 1, 1, 1]), x.shape)
b.shape
# TensorShape([4, 32, 32, 3])

(x+b).shape
# TensorShape([4, 32, 32, 3])

tf.tile (allocates real memory)

a = tf.ones([3, 4])
# [3, 4] -> [2, 3, 4]

# way1 tf.broadcast_to
a1 = tf.broadcast_to(a, [2, 3, 4])
a1.shape
# TensorShape([2, 3, 4])

# way2 tf.expand_dims + tf.tile
a2 = tf.expand_dims(a, axis=0)
a2 = tf.tile(a2, [2, 1, 1])
a2.shape
# TensorShape([2, 3, 4])

5. Advanced Operations

5.1 Math operations

+ - * /                                            element-wise
// %                                               element-wise
**, tf.pow, tf.square, tf.sqrt                     element-wise
tf.exp, tf.math.log                                element-wise
@, tf.matmul (matrix multiplication)               matrix-wise
reduce_mean, reduce_max, reduce_min, reduce_sum    dim-wise

+-*/ % //

a = tf.ones([2, 2])
b = tf.fill([2, 2], 2.)

a + b, a - b, a * b, a / b

a // b, a % b

tf.math.log(a)

tf.exp(a)

tf.exp, tf.math.log

# log base 2 of 8
tf.math.log(8.) / tf.math.log(2.)
# <tf.Tensor: shape=(), dtype=float32, numpy=3.0>

# log base 10 of 100
tf.math.log(100.) / tf.math.log(10.)
# <tf.Tensor: shape=(), dtype=float32, numpy=2.0>

**, tf.pow, tf.square, tf.sqrt

b = tf.fill([2, 2], 2.)
tf.pow(b, 3)

tf.square(b)

b ** 3

tf.sqrt(b)

@, matmul

a = tf.ones([2, 2])
b = tf.fill([2, 2], 2.)

a@b

tf.matmul(a, b)

a = tf.ones([4, 2, 3])
b = tf.fill([4, 3, 5], 2.)

(a@b).shape
# TensorShape([4, 2, 5])

tf.matmul(a, b).shape
# TensorShape([4, 2, 5])

c = tf.fill([3, 5], 2.)
(a@c).shape
# TensorShape([4, 2, 5])

5.2 Statistics

tf.reduce_min, tf.reduce_max, tf.reduce_mean, tf.reduce_sum

a = tf.random.uniform([3, 5], minval=1, maxval=10, dtype=tf.int32)
a

tf.reduce_min(a), tf.reduce_max(a), tf.reduce_mean(a), tf.reduce_sum(a)

tf.reduce_min(a, axis=0)

tf.reduce_min(a, axis=1)

tf.norm (the ord argument defaults to 2, the L2 norm; ord=1 gives the L1 norm)

a = tf.random.normal([3,5])
# L2 norm of the whole matrix (Frobenius norm)
tf.norm(a)
tf.sqrt(tf.reduce_sum(tf.square(a)))

# vector L2 norm along axis 0 (per column)
tf.norm(a, ord=2, axis=0)

# L1 norm
tf.norm(a, ord=1)
tf.norm(a, ord=1, axis=0)
tf.norm(a, ord=1, axis=1)

# a slightly more complex (higher-rank) tensor
b = tf.random.normal([4, 35, 8])
tf.norm(b)
tf.sqrt(tf.reduce_sum(tf.square(b)))

tf.argmax, tf.argmin

a = tf.random.uniform([4, 10], minval=1, maxval=10, dtype=tf.int32)
a

tf.argmax(a)

tf.argmin(a)

tf.argmin(a, axis=1)

tf.equal

a = tf.constant([1, 2, 3, 1, 5], dtype=tf.float32)
b = tf.ones(a.shape)
b.dtype

tf.equal(a, b)
# count the number of equal entries, i.e. the number of True values
tf.reduce_sum(tf.cast(tf.equal(a, b), dtype=tf.int32))

Application: computing prediction accuracy in a classification problem
out = tf.constant([
    [0.1, 0.2, 0.7], 
    [0.9, 0.05, 0.05],
    [0.8, 0.1, 0.1],
    [0.3, 0.6, 0.1]
])
pred = tf.cast(tf.argmax(out, axis=1), dtype=tf.int32)

y = tf.constant([2, 0, 1, 1], dtype=tf.int32)
bool_res = tf.equal(y, pred)
correct = tf.reduce_sum(tf.cast(bool_res, dtype=tf.int32))
accuracy = correct / y.shape[0]

print(accuracy.numpy())
# 0.75

tf.unique

a = tf.constant([1, 2, 3, 1, 5])

tf.unique(a)

tf.unique(a)[0]
tf.unique(a)[1]
tf.gather(*tf.unique(a))

5.3 Padding and Tiling

tf.pad

a = tf.reshape(tf.range(9), [3,3])
tf.pad(a, [[0, 0], [1, 0]])

image padding
a = tf.random.normal([4, 28, 28, 3])
b = tf.pad(a, [[0, 0], [2, 2], [2, 2], [0, 0]])
b.shape
# TensorShape([4, 32, 32, 3])

tf.tile (repeat n times)

a = tf.reshape(tf.range(9), [3, 3])
tf.tile(a, [1, 2])

tile vs. broadcast_to
a = tf.reshape(tf.range(9), [3, 3])
aa = tf.expand_dims(a, axis=0)
aa = tf.tile(aa, [2, 1, 1])
aa

bb = tf.broadcast_to(a, [2, 3, 3])
bb

5.4 Concatenation and Splitting

tf.concat

a1 = tf.ones([4, 35, 8])
b1 = tf.ones([2, 35, 8])
c1 = tf.concat([a1, b1], axis=0)
c1.shape
# TensorShape([6, 35, 8])

a2 = tf.ones([4, 35, 8])
b2 = tf.ones([4, 3, 8])
c2 = tf.concat([a2, b2], axis=1)
c2.shape
# TensorShape([4, 38, 8])

tf.concat requires that every dimension other than the concat axis match exactly; otherwise it raises an error.
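
For example, a hedged sketch (error message paraphrased): concatenating along axis 0 still requires the remaining dimensions to agree.

a1 = tf.ones([4, 35, 8])
b1 = tf.ones([2, 3, 8])
# tf.concat([a1, b1], axis=0)
# InvalidArgumentError: ConcatOp : Dimensions of inputs should match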

tf.stack (creates a new dim). Note: stack requires all input tensors to have exactly the same shape.

a1 = tf.ones([4, 35, 8])
b1 = tf.ones([4, 35, 8])
c1 = tf.stack([a1, b1], axis=0)
c1.shape
# TensorShape([2, 4, 35, 8])

a2 = tf.ones([4, 35, 8])
b2 = tf.ones([4, 35, 8])
c2 = tf.stack([a2, b2], axis=3)
c2.shape
# TensorShape([4, 35, 8, 2])

stack raises an error if the input shapes do not all match.
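
For example, a hedged sketch (error message paraphrased):

a1 = tf.ones([4, 35, 8])
b1 = tf.ones([4, 35, 4])
# tf.stack([a1, b1], axis=0)
# InvalidArgumentError: Shapes of all inputs must match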

tf.unstack (splits into as many tensors as the size of the chosen axis; that axis disappears)

a = tf.ones([4, 35, 8])
b = tf.ones([4, 35, 8])
c = tf.stack([a, b], axis=0)
c.shape
# TensorShape([2, 4, 35, 8])

res = tf.unstack(c, axis=0)
print(type(res), len(res))
# <class 'list'> 2

res = tf.unstack(c, axis=3)
print(type(res), len(res))
# <class 'list'> 8
res[0].shape
# TensorShape([2, 4, 35])
res[6].shape
# TensorShape([2, 4, 35])

tf.split

a = tf.ones([4, 35, 8])

res = tf.split(a, axis=2, num_or_size_splits=2)
print([item.shape for item in res])
# [TensorShape([4, 35, 4]), TensorShape([4, 35, 4])]
 
res = tf.split(a, axis=2, num_or_size_splits=[2, 2, 4])
print([item.shape for item in res])
# [TensorShape([4, 35, 2]), TensorShape([4, 35, 2]), TensorShape([4, 35, 4])]

5.5 Sorting

tf.sort, tf.argsort

1-D vector
a = tf.random.shuffle(tf.range(5))
a

tf.sort(a, direction='ASCENDING')

tf.sort(a, direction='DESCENDING')

idx = tf.argsort(a, direction='DESCENDING')
idx

tf.gather(a, idx)

2-D matrix
a = tf.random.uniform([3, 3], minval=1, maxval=10, dtype=tf.int32)
a

tf.sort(a)

tf.argsort(a)

tf.sort(a, direction='DESCENDING')

tf.math.top_k

a = tf.random.uniform([5, 5], minval=1, maxval=10, dtype=tf.int32)
a

tf.math.top_k(a, 2)

TopK Accuracy
out = tf.constant([
    [0.8, 0.1, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.5, 0.3],
])

pred = tf.math.top_k(out, 2).indices  # [3, 2]

y = tf.constant([0, 0, 2])  # [3]
# y and pred do not have the same last-dimension shape, so transpose pred first
pred = tf.transpose(pred, [1, 0])
y = tf.broadcast_to(y, pred.shape)

bool_res = tf.equal(pred, y)
tf.reshape(bool_res, [-1])
tf.reduce_sum(tf.cast(tf.reshape(bool_res, [-1]), dtype=tf.int32), axis=0)

def accuracy(output, target, topk=(1,)):
    maxk = max(topk)
    n = target.shape[0]

    pred = tf.math.top_k(output, maxk).indices
    pred = tf.transpose(pred, [1, 0])
    target_ = tf.broadcast_to(target, pred.shape)
    correct = tf.equal(pred, target_)

    res = []
    for k in topk:
        correct_k = tf.cast(tf.reshape(correct[:k], [-1]), dtype=tf.int32)
        acc = tf.reduce_sum(correct_k) / n
        res.append(acc)
    return res

output = tf.constant([
    [0.7, 0.06, 0.01, 0.15, 0.08],
    [0.05, 0.5, 0.1, 0.2, 0.15],
    [0.3, 0.4, 0.2, 0.05, 0.05],
])
# [3, 5]

target = tf.constant([1, 3, 2])  
# [3]

accuracy(output, target, topk=(1, 2, 3, 4, 5))
# [<tf.Tensor: shape=(), dtype=float64, numpy=0.0>,
#  <tf.Tensor: shape=(), dtype=float64, numpy=0.3333333333333333>,
#  <tf.Tensor: shape=(), dtype=float64, numpy=0.6666666666666666>,
#  <tf.Tensor: shape=(), dtype=float64, numpy=1.0>,
#  <tf.Tensor: shape=(), dtype=float64, numpy=1.0>]

5.6 Clipping

tf.clip_by_value

a = tf.range(9)

tf.maximum(a, 2)
# <tf.Tensor: shape=(9,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 6, 7, 8])>

tf.minimum(a, 5)
# <tf.Tensor: shape=(9,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 5, 5, 5])>

tf.minimum(tf.maximum(a, 2), 5)
# <tf.Tensor: shape=(9,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 5, 5, 5])>

tf.clip_by_value(a, 2, 5)
# <tf.Tensor: shape=(9,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 5, 5, 5])>

tf.clip_by_norm

a = tf.random.normal([2, 5], mean=10)
tf.norm(a)

aa = tf.clip_by_norm(a, 15)
tf.norm(aa)

tf.clip_by_global_norm

See Section 6 (Application) below for a full training example.
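
As a minimal standalone sketch (not from the original notes): tf.clip_by_global_norm takes a list of tensors and a clip_norm, rescales the whole list so that their combined L2 norm does not exceed clip_norm, and returns the clipped list together with the original global norm.

grads = [tf.random.normal([3, 3]), tf.random.normal([3])]
clipped, global_norm = tf.clip_by_global_norm(grads, 5.0)

global_norm                      # combined L2 norm of grads before clipping
tf.linalg.global_norm(clipped)   # at most 5.0 after clipping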

tf.nn.relu

a = tf.range(9)-5
a
# <tf.Tensor: shape=(9,), dtype=int32, numpy=array([-5, -4, -3, -2, -1,  0,  1,  2,  3])>

tf.nn.relu(a)
# <tf.Tensor: shape=(9,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3])>

tf.maximum(a, 0)
# <tf.Tensor: shape=(9,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3])>

5.7 Miscellaneous

tf.where

tf.where(mask)

Returns the indices of the positions where mask is True.

a = tf.random.normal([3, 3])

mask = a > 0
mask
tf.where(mask)

print('=====')
tf.boolean_mask(a, mask)

tf.gather_nd(a, tf.where(mask))

tf.where(cond, x, y)

cond is a boolean tensor, and x and y are two tensors of the same shape: where cond is True the element is taken from x, otherwise from y.

mask = tf.random.normal([3, 3]) > 0
mask

a = tf.ones([3, 3])
b = tf.zeros([3, 3])
tf.where(mask, a, b)

tf.scatter_nd

tf.scatter_nd(indices, updates, shape)

tf.scatter_nd scatters the given values into a new (zero-initialized) tensor at the specified indices.

indices is an N x M tensor giving the indices of the elements to update, where N is the number of updates and M is the depth of each index. updates is an N x D tensor with the values to write, where D is the size of each update. shape is a K-element tuple giving the shape of the output tensor.

indices = tf.constant([[4], [2], [1], [7]])
updates = tf.constant([-1, -2, -3, -4])
shape = tf.constant([8])

tf.scatter_nd(indices, updates, shape)
# <tf.Tensor: shape=(8,), dtype=int32, numpy=array([ 0, -3, -2,  0, -1,  0,  0, -4])>
indices = tf.constant([[0], [2]])
updates = tf.constant([[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]], 
                       [[2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2]]])
shape = tf.constant([4, 4, 4])

tf.scatter_nd(indices, updates, shape)

tf.meshgrid

x = tf.linspace(-2, 2, 5)
y = tf.linspace(-2, 2, 5)
points_x, points_y = tf.meshgrid(x, y)

points_x.shape
# TensorShape([5, 5])

tf.stack([points_x, points_y], axis=2)

Application of tf.meshgrid
import matplotlib.pyplot as plt
import tensorflow as tf
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'


def func(points):
    z = tf.math.sin(points[..., 0]) + tf.math.sin(points[..., 1])
    return z


x = tf.linspace(0., 3.14*2, 100)
y = tf.linspace(0., 3.14*2, 100)
points_x, points_y = tf.meshgrid(x, y)
points = tf.stack([points_x, points_y], axis=2)
z = func(points)
print(z.shape)

plt.contour(points_x, points_y, z)
plt.colorbar()
plt.show()

plt.imshow(z, origin='lower', interpolation='none')
plt.colorbar()
plt.show()

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(points_x, points_y, z, cmap='rainbow')
plt.show()

6. Application

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'


(x, y), _ = datasets.mnist.load_data()
x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(y, dtype=tf.int32)
print(x.shape, y.shape)
print(tf.reduce_min(x), tf.reduce_max(x))
print(tf.reduce_min(y), tf.reduce_max(y))


train_db = tf.data.Dataset.from_tensor_slices((x, y)).batch(128)
train_iter = iter(train_db)
sample = next(train_iter)
print(sample[0].shape, sample[1].shape)

# [b, 784] -> [b, 256] -> [b, 128] -> [b, 10]
# w: [dim_in, dim_out] b: [dim_out]
w1 = tf.Variable(tf.random.truncated_normal([784, 256]))
# w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
b1 = tf.Variable(tf.zeros([256]))

w2 = tf.Variable(tf.random.truncated_normal([256, 128]))
# w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
b2 = tf.Variable(tf.zeros([128]))

w3 = tf.Variable(tf.random.truncated_normal([128, 10]))
# w3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))
lr = 1e-3
for epoch in range(10):
    # iterate for db
    for step, (x, y) in enumerate(train_db):
        # iterate batch
        x = tf.reshape(x, [-1, 28*28])

        with tf.GradientTape() as tape:
            # h1: [b, 784]@[784, 256] + [256] => [b, 256]
            h1 = tf.nn.relu(x@w1 + b1)
            # h2: [b, 256] => [b, 128]
            h2 = tf.nn.relu(h1@w2 + b2)
            # h3: [b, 128] => [b, 10]
            out = h2@w3 + b3

            # compute loss
            # out [b, 10]
            # y [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)
            loss = tf.reduce_mean(tf.square(y_onehot - out))

        # compute gradients
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])

        # print('===== before =====')
        # for grad in grads:
        #     print(tf.norm(grad))
        grads, _ = tf.clip_by_global_norm(grads, 15)
        # print('===== after =====')
        # for grad in grads:
        #     print(tf.norm(grad))

        # update gradients
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
        w2.assign_sub(lr * grads[2])
        b2.assign_sub(lr * grads[3])
        w3.assign_sub(lr * grads[4])
        b3.assign_sub(lr * grads[5])

        if step % 100 == 0:
            print(epoch, step, 'loss:', float(loss))

Run output: (omitted)

If you comment out the tf.clip_by_global_norm line and re-run, many of the printed loss values become nan: the training suffers from gradient exploding.
