TensorFlow (4): Common TensorFlow learning-rate, activation, and loss function APIs with code implementations

堇禤

Published 2022-10-18 13:47:16

Article link: https://blog.csdn.net/CRW__DREAM/article/details/127384625


import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Import the required modules
import tensorflow as tf
from sklearn import datasets
from matplotlib import pyplot as plt
import numpy as np

1. Learning-rate schedules

tf.keras.optimizers.schedules.ExponentialDecay

tf.keras.optimizers.schedules.ExponentialDecay(
	initial_learning_rate, decay_steps, decay_rate, staircase=False, name=None
)

Purpose: exponential-decay learning-rate schedule.
Equivalent API: tf.optimizers.schedules.ExponentialDecay

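Parameters:

initial_learning_rate: the initial learning rate.
decay_steps: the number of steps over which the rate is multiplied by decay_rate once. It is always used, whatever the value of staircase: the rate at a given step is initial_learning_rate * decay_rate ^ (step / decay_steps).
decay_rate: the decay rate.
staircase: a bool. If True, step / decay_steps is an integer division, so the learning rate drops in discrete stair steps.

Returns: a schedule object. Calling it with a step, e.g. lr_schedule(step), returns the learning rate computed for that step.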

Example:

N = 400
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
                0.5,
                decay_steps=10,
                decay_rate=0.9,
                staircase=False)
y = []
for global_step in range(N):
    lr = lr_schedule(global_step)
    y.append(lr)
x = range(N)
plt.figure(figsize=(8,6))
plt.plot(x, y, 'r-')
plt.ylim([0,max(plt.ylim())])
plt.xlabel('Step')
plt.ylabel('Learning Rate')
plt.title('ExponentialDecay')
plt.show()
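To make the roles of decay_steps and staircase concrete, here is a minimal pure-Python sketch of the documented formula. The helper name exponential_decay is hypothetical (it is not a TensorFlow function); it only mirrors what the schedule is described as computing.

```python
import math

def exponential_decay(step, initial_learning_rate, decay_steps, decay_rate,
                      staircase=False):
    """Hypothetical helper mirroring the documented exponential-decay formula."""
    exponent = step / decay_steps
    if staircase:
        # Integer division: the rate only changes once every `decay_steps` steps.
        exponent = math.floor(exponent)
    return initial_learning_rate * decay_rate ** exponent

# Same settings as the example above: 0.5 initial rate, decay_steps=10, decay_rate=0.9.
for step in (0, 5, 10, 25):
    smooth = exponential_decay(step, 0.5, 10, 0.9)
    stepped = exponential_decay(step, 0.5, 10, 0.9, staircase=True)
    print(step, round(smooth, 4), round(stepped, 4))
```

With staircase=False the rate shrinks a little at every step; with staircase=True it stays flat within each block of decay_steps steps.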


tf.keras.optimizers.schedules.PiecewiseConstantDecay

tf.keras.optimizers.schedules.PiecewiseConstantDecay(
	boundaries, values, name=None
)

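Purpose: piecewise-constant learning-rate schedule.
Equivalent API: tf.optimizers.schedules.PiecewiseConstantDecay

Parameters:

boundaries: [step_1, step_2, …, step_n], the steps at which the learning rate changes.
values: [val_0, val_1, val_2, …, val_n], the initial learning rate followed by the value used after each boundary. It has one more element than boundaries.

Returns: a schedule object. Calling it with a step, e.g. lr_schedule(step), returns the learning rate computed for that step.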

N = 400
lr_schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
                boundaries=[100, 200, 300],
                values=[0.1, 0.05, 0.025, 0.001])
y = []
for global_step in range(N):
    lr = lr_schedule(global_step)
    y.append(lr)
x = range(N)
plt.figure(figsize=(8,6))
plt.plot(x, y, 'r-')
plt.ylim([0,max(plt.ylim())])
plt.xlabel('Step')
plt.ylabel('Learning Rate')
plt.title('PiecewiseConstantDecay')
plt.show()
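The lookup that maps a step to one of the values can be sketched in pure Python. The helper piecewise_constant below is hypothetical, and it assumes the boundary convention described in the TensorFlow documentation as I understand it: a step equal to a boundary still uses the earlier value, and the next value applies from the following step.

```python
from bisect import bisect_left

def piecewise_constant(step, boundaries, values):
    """Hypothetical helper: pick the value for the interval containing `step`."""
    if len(values) != len(boundaries) + 1:
        raise ValueError("values must have exactly one more element than boundaries")
    # bisect_left counts the boundaries strictly smaller than `step`,
    # so step == boundary still selects the earlier value.
    return values[bisect_left(boundaries, step)]

boundaries = [100, 200, 300]
values = [0.1, 0.05, 0.025, 0.001]
for step in (0, 100, 101, 200, 300, 301, 399):
    print(step, piecewise_constant(step, boundaries, values))
```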


2. Activation functions

tf.math.sigmoid

tf.math.sigmoid(
	x, name=None
)

Purpose: computes the sigmoid of each element of x.
Equivalent APIs: tf.nn.sigmoid, tf.sigmoid

Parameters:

x: a tensor.

Returns: a tensor with the same shape as x.
Example:

x = tf.constant([1., 2., 3.])
print(tf.math.sigmoid(x))

# Equivalent implementation
print(1/(1+tf.math.exp(-x)))

tf.math.tanh

tf.math.tanh(
x, name=None
)

Purpose: computes the hyperbolic tangent of each element of x.
Equivalent APIs: tf.nn.tanh, tf.tanh

Parameters:

x: a tensor.

Returns: a tensor with the same shape as x.

x = tf.constant([-float("inf"), -5, -0.5, 1, 1.2, 2, 3, float("inf")])
print(tf.math.tanh(x))

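# Equivalent implementation (for the ±inf entries this formula evaluates inf/inf and yields nan, unlike tf.math.tanh)
print((tf.math.exp(x)-tf.math.exp(-x))/(tf.math.exp(x)+tf.math.exp(-x)))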
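Sigmoid and tanh are closely related: tanh(x) = 2 * sigmoid(2x) - 1. The sketch below checks that identity with the standard library only; the helper names sigmoid and tanh_via_sigmoid are made up for illustration.

```python
import math

def sigmoid(x):
    """Logistic sigmoid for a single float."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x):
    """tanh expressed through sigmoid: tanh(x) = 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

for x in (-5.0, -0.5, 1.0, 1.2, 3.0):
    # The two columns should agree up to floating-point rounding.
    print(x, math.tanh(x), tanh_via_sigmoid(x))
```

In other words, tanh is a rescaled sigmoid whose outputs lie in (-1, 1) and are centered at 0.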

tf.nn.relu

tf.nn.relu(
features, name=None
)

Purpose: computes the rectified linear unit: max(features, 0).

Parameters:

features: a tensor.

Returns: a tensor with the same shape as features.

print(tf.nn.relu([-2., 0., -0., 3.]))

tf.nn.leaky_relu

tf.nn.leaky_relu(
features, alpha=0.2, name=None
)

Purpose: computes the Leaky ReLU activation.

Parameters:

features: a tensor.
alpha: the slope for x < 0.

Returns: a tensor with the same shape as features.
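A short example in the style of the other sections, with a hand-written equivalent. The input values are arbitrary and chosen only for illustration.

```python
import tensorflow as tf

features = tf.constant([-2.0, -0.5, 0.0, 3.0])
alpha = 0.2  # slope applied to negative inputs (the API default)

leaky = tf.nn.leaky_relu(features, alpha=alpha)
print(leaky)

# Equivalent implementation: keep positive inputs, scale the rest by alpha
manual = tf.where(features > 0, features, alpha * features)
print(manual)
```

Unlike tf.nn.relu, negative inputs are scaled by alpha rather than set to zero.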

tf.nn.softmax

tf.nn.softmax(
logits, axis=None, name=None
)

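Purpose: computes the softmax activation.
Equivalent API: tf.math.softmax

Parameters:

logits: a tensor.
axis: the dimension along which softmax is computed. Defaults to -1, the last dimension.

Returns: a tensor with the same shape as logits.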

logits = tf.constant([4., 5., 1.])
print(tf.nn.softmax(logits))

# Equivalent implementation
print(tf.exp(logits) / tf.reduce_sum(tf.exp(logits)))
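The hand-written formula above is fine for small logits, but tf.exp overflows in float32 for large ones. The sketch below applies softmax row by row to a batch and compares the naive formula with the common max-subtraction variant. The example logits are made up for illustration.

```python
import tensorflow as tf

# Two rows of logits; the second row is large enough to overflow tf.exp in float32.
logits = tf.constant([[4.0, 5.0, 1.0],
                      [1000.0, 1001.0, 1002.0]])

# Softmax along the last axis: each row is normalized independently.
probs = tf.nn.softmax(logits, axis=-1)

# Naive formula: exp(1000.) is inf, so the second row becomes inf/inf = nan.
naive = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis=-1, keepdims=True)

# Subtracting the row maximum leaves the result unchanged mathematically
# and keeps every exponent <= 0.
shifted = logits - tf.reduce_max(logits, axis=-1, keepdims=True)
stable = tf.exp(shifted) / tf.reduce_sum(tf.exp(shifted), axis=-1, keepdims=True)

print(probs)
print(naive)
print(stable)
```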

3. Loss functions

tf.keras.losses.MSE

tf.keras.losses.MSE(
	y_true, y_pred
)

Purpose: computes the mean squared error between y_true and y_pred, averaged over the last axis.

y_true = tf.constant([0.5, 0.8])
y_pred = tf.constant([1.0, 1.0])
print(tf.keras.losses.MSE(y_true, y_pred))

# Equivalent implementation
print(tf.reduce_mean(tf.square(y_true - y_pred)))
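For batched inputs, tf.keras.losses.MSE averages only over the last axis, so it returns one value per sample rather than a single scalar. A minimal sketch with made-up values:

```python
import tensorflow as tf

# A batch of two samples with two outputs each (values chosen for illustration).
y_true = tf.constant([[0.5, 0.8],
                      [0.0, 1.0]])
y_pred = tf.constant([[1.0, 1.0],
                      [0.0, 0.0]])

# One loss value per sample: the mean is taken over the last axis only.
per_sample = tf.keras.losses.MSE(y_true, y_pred)
print(per_sample.shape, per_sample)

# Average once more to get a single scalar for the whole batch.
batch_loss = tf.reduce_mean(per_sample)
print(batch_loss)

# Equivalent implementation of the per-sample values
print(tf.reduce_mean(tf.square(y_true - y_pred), axis=-1))
```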

tf.keras.losses.categorical_crossentropy

tf.keras.losses.categorical_crossentropy(
	y_true, y_pred, from_logits=False, label_smoothing=0
)

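Purpose: computes the categorical cross-entropy.
Equivalent API: tf.losses.categorical_crossentropy

Parameters:

y_true: the ground-truth values, given as one-hot (or probability) vectors.
y_pred: the predicted values.
from_logits: whether y_pred is a logits tensor. When False, y_pred is treated as probabilities.
label_smoothing: a float in [0, 1]. Values above 0 smooth the labels toward a uniform distribution.

Returns: the cross-entropy loss.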

y_true = [1, 0, 0]
y_pred1 = [0.5, 0.4, 0.1]
y_pred2 = [0.8, 0.1, 0.1]
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred1))
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred2))

# Equivalent implementation
print(-tf.reduce_sum(y_true * tf.math.log(y_pred1)))
print(-tf.reduce_sum(y_true * tf.math.log(y_pred2)))
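The from_logits and label_smoothing parameters are not exercised in the example above. The sketch below shows both, assuming the smoothing rule y_true * (1 - label_smoothing) + label_smoothing / num_classes; the logits are made up for illustration.

```python
import tensorflow as tf

cce = tf.keras.losses.categorical_crossentropy

y_true = tf.constant([[1.0, 0.0, 0.0]])
logits = tf.constant([[4.0, 2.0, 1.0]])  # raw scores, not probabilities

# from_logits=True applies softmax internally...
loss_from_logits = cce(y_true, logits, from_logits=True)
# ...so it should match passing softmax probabilities explicitly.
loss_from_probs = cce(y_true, tf.nn.softmax(logits))
print(loss_from_logits, loss_from_probs)

# Label smoothing moves a little probability mass from the true class
# to every class before the loss is computed.
smoothing = 0.1
num_classes = 3
smoothed_labels = y_true * (1.0 - smoothing) + smoothing / num_classes
loss_smoothed = cce(y_true, logits, from_logits=True, label_smoothing=smoothing)
loss_manual = cce(smoothed_labels, logits, from_logits=True)
print(loss_smoothed, loss_manual)
```

Passing logits with from_logits=True is usually preferable to applying softmax yourself, since the library can then combine the two steps.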

tf.nn.softmax_cross_entropy_with_logits

tf.nn.softmax_cross_entropy_with_logits(
	labels, logits, axis=-1, name=None
)

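Purpose: applies softmax to logits, then computes the cross-entropy against labels.

In machine learning, for multi-class classification, the vector of values that has not yet been normalized by softmax is called the logits. After passing through a softmax layer, the logits become a vector that forms a probability distribution.

Parameters:

labels: along the class dimension, each vector should be a valid probability distribution. For example, when labels has shape [batch_size, num_classes], each labels[i] should be a probability distribution.
logits: the per-class activations, typically the output of a linear layer. Do not apply softmax beforehand: the function applies it internally.
axis: the class dimension. Defaults to -1, the last dimension.

Returns: the softmax cross-entropy loss.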

labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
print(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# Equivalent implementation
print(-tf.reduce_sum(labels * tf.math.log(tf.nn.softmax(logits)), axis=1))

tf.nn.sparse_softmax_cross_entropy_with_logits

tf.nn.sparse_softmax_cross_entropy_with_logits(
	labels, logits, name=None
)

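Purpose: conceptually, labels are one-hot encoded, logits go through softmax, and the cross-entropy of the two is computed. Typically labels has shape [batch_size] and logits has shape [batch_size, num_classes]. "sparse" means the labels are supplied as integer class indices rather than as dense one-hot vectors.

Parameters:

labels: the class indices of the labels.
logits: the per-class activations, typically the output of a linear layer. Do not apply softmax beforehand: the function applies it internally.

Returns: the softmax cross-entropy loss.

(In the example below, labels is first one-hot encoded as [[1,0,0], [0,1,0]], logits becomes [[0.844, 0.114, 0.042], [0.007, 0.976, 0.018]] after softmax, and the cross-entropy of the two is then computed.)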

labels = [0, 1]
logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
print(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# Equivalent implementation
print(-tf.reduce_sum(tf.one_hot(labels, tf.shape(logits)[1]) *
                     tf.math.log(tf.nn.softmax(logits)), axis=1))
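One reason softmax and cross-entropy are fused into a single op is that the gradient with respect to the logits takes a simple form: softmax(logits) - labels, when each row of labels sums to 1. The sketch below checks this with tf.GradientTape on the same numbers as the dense example above; it is an illustration, not a training recipe.

```python
import tensorflow as tf

labels = tf.constant([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
logits = tf.Variable([[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]])

with tf.GradientTape() as tape:
    per_example = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
    total = tf.reduce_sum(per_example)  # sum so each row's gradient stays unscaled

grad = tape.gradient(total, logits)
expected = tf.nn.softmax(logits) - labels

print(grad)
print(expected)
```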