Loss function (loss)
The loss function measures the gap between the network's prediction (y) and the known answer (y_). A common choice is the mean squared error (MSE):

MSE(y_, y) = (1/n) * Σ (y_i - y_i')²

where y_i is the ground-truth value of the i-th sample in a batch and y_i' is the NN's prediction.
Usage example:
import tensorflow as tf

y_true = tf.constant([0.5, 0.8])
y_pred = tf.constant([1.0, 1.0])
print(tf.keras.losses.MSE(y_true, y_pred))
Output:
>>> tf.Tensor(0.145, shape=(), dtype=float32)
Equivalent implementation:
print(tf.reduce_mean(tf.square(y_true - y_pred)))
Output:
>>> tf.Tensor(0.145, shape=(), dtype=float32)
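Spelled out: (0.5 - 1.0)² = 0.25 and (0.8 - 1.0)² = 0.04, and their mean is (0.25 + 0.04) / 2 = 0.145, matching the tensor above.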
Example: predicting daily yogurt sales
Collected data: daily features x1, x2 and sales y_ (y_ = x1 + x2), with noise in [-0.05, 0.05); the goal is to fit a function that predicts sales.
Feed this dataset into a single-layer neural network to predict daily yogurt sales.
import tensorflow as tf
import numpy as np

SEED = 23455  # random seed for reproducibility
rdm = np.random.RandomState(seed=SEED)  # random number generator producing values in [0, 1)
x = rdm.rand(32, 2)  # 32 rows, 2 columns: 32 samples of input features x1, x2, each in [0, 1)
y_ = [[x1 + x2 + (rdm.rand() / 10.0 - 0.05)] for (x1, x2) in x]
# noise: [0, 1) / 10 = [0, 0.1); [0, 0.1) - 0.05 = [-0.05, 0.05)
x = tf.cast(x, dtype=tf.float32)
w1 = tf.Variable(tf.random.normal([2, 1], stddev=1, seed=1))
# randomly initialize the parameter w1 with shape (2, 1)
Training:
epochs = 15000  # number of passes over the dataset
lr = 0.002  # learning rate

for epoch in range(epochs):
    # compute the forward-pass result y inside a with tf.GradientTape() block
    with tf.GradientTape() as tape:
        y = tf.matmul(x, w1)
        loss_mse = tf.reduce_mean(tf.square(y_ - y))  # mean squared error loss
    grads = tape.gradient(loss_mse, w1)  # gradient of the loss w.r.t. the trainable parameter w1
    w1.assign_sub(lr * grads)  # update w1

    # print w1 every 500 iterations
    if epoch % 500 == 0:
        print("After %d training steps,w1 is " % (epoch))
        print(w1.numpy(), "\n")

print("Final w1 is: ", w1.numpy())
Output:
After 0 training steps,w1 is
[[-0.8096241]
[ 1.4855157]]
After 500 training steps,w1 is
[[-0.21934733]
[ 1.6984866 ]]
After 1000 training steps,w1 is
[[0.0893971]
[1.673225 ]]
After 1500 training steps,w1 is
[[0.28368822]
[1.5853055 ]]
After 2000 training steps,w1 is
[[0.423243 ]
[1.4906037]]
After 2500 training steps,w1 is
[[0.531055 ]
[1.4053345]]
After 3000 training steps,w1 is
[[0.61725086]
[1.332841 ]]
After 3500 training steps,w1 is
[[0.687201 ]
[1.2725208]]
After 4000 training steps,w1 is
[[0.7443262]
[1.2227542]]
After 4500 training steps,w1 is
[[0.7910986]
[1.1818361]]
After 5000 training steps,w1 is
[[0.82943517]
[1.1482395 ]]
After 5500 training steps,w1 is
[[0.860872 ]
[1.1206709]]
After 6000 training steps,w1 is
[[0.88665503]
[1.098054 ]]
After 6500 training steps,w1 is
[[0.90780276]
[1.0795006 ]]
After 7000 training steps,w1 is
[[0.92514884]
[1.0642821 ]]
After 7500 training steps,w1 is
[[0.93937725]
[1.0517985 ]]
After 8000 training steps,w1 is
[[0.951048]
[1.041559]]
After 8500 training steps,w1 is
[[0.96062106]
[1.0331597 ]]
After 9000 training steps,w1 is
[[0.9684733]
[1.0262702]]
After 9500 training steps,w1 is
[[0.97491425]
[1.0206193 ]]
After 10000 training steps,w1 is
[[0.9801975]
[1.0159837]]
After 10500 training steps,w1 is
[[0.9845312]
[1.0121814]]
After 11000 training steps,w1 is
[[0.9880858]
[1.0090628]]
After 11500 training steps,w1 is
[[0.99100184]
[1.0065047 ]]
After 12000 training steps,w1 is
[[0.9933934]
[1.0044063]]
After 12500 training steps,w1 is
[[0.9953551]
[1.0026854]]
After 13000 training steps,w1 is
[[0.99696386]
[1.0012728 ]]
After 13500 training steps,w1 is
[[0.9982835]
[1.0001147]]
After 14000 training steps,w1 is
[[0.9993659]
[0.999166 ]]
After 14500 training steps,w1 is
[[1.0002553 ]
[0.99838644]]
Final w1 is: [[1.0009792]
[0.9977485]]
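With the fitted weights, predicting sales for a new day is just the same forward pass. A minimal sketch (assuming the training code above has run, so tf and w1 are in scope; the sample values are made up for illustration):

x_new = tf.constant([[0.3, 0.6]], dtype=tf.float32)  # a hypothetical new sample (x1, x2)
y_new = tf.matmul(x_new, w1)  # forward pass with the trained weights
print(y_new.numpy())  # since the trained w1 ≈ [[1], [1]], this should be close to x1 + x2 = 0.9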
Cross-entropy loss CE (Cross Entropy)
Cross entropy (Cross Entropy) measures the distance between two probability distributions: the larger the cross entropy, the farther apart the two distributions; the smaller it is, the closer they are. It is one of the most widely used loss functions for classification problems:

H(y_, y) = -Σ y_i * ln(y_i')

where y_i is the ground-truth value and y_i' is the neural network's prediction. For multi-class problems, the network's output is generally not a probability distribution, so a softmax layer is introduced to turn the output into one.
tf.keras.losses.categorical_crossentropy
Purpose: compute the cross entropy.
Equivalent API: tf.losses.categorical_crossentropy
Example:
y_true = [1, 0, 0]
y_pred1 = [0.5, 0.4, 0.1]
y_pred2 = [0.8, 0.1, 0.1]
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred1))
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred2))
Output:
>>> tf.Tensor(0.6931472, shape=(), dtype=float32)
tf.Tensor(0.22314353, shape=(), dtype=float32)
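Here 0.6931 ≈ -ln(0.5) and 0.2231 ≈ -ln(0.8): y_pred2 assigns a higher probability to the true class, so its cross entropy is smaller and its distribution is closer to y_true.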
Equivalent implementation:
print(-tf.reduce_sum(y_true * tf.math.log(y_pred1)))
print(-tf.reduce_sum(y_true * tf.math.log(y_pred2)))
Output:
>>> tf.Tensor(0.6931472, shape=(), dtype=float32)
tf.Tensor(0.22314353, shape=(), dtype=float32)
Computing softmax and cross entropy together
tf.nn.softmax_cross_entropy_with_logits(labels, logits, axis=-1, name=None)
Purpose: apply softmax to logits, then compute the cross entropy against labels.
In machine learning, for multi-class problems, the vector of unnormalized scores (i.e., before softmax) is called the logits. After passing through a softmax layer, the logits become a vector that forms a probability distribution.
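As a quick illustration of this normalization (a minimal sketch; the logits values are arbitrary):

logits_demo = tf.constant([4.0, 2.0, 1.0])  # unnormalized scores
print(tf.nn.softmax(logits_demo))  # ≈ [0.844, 0.114, 0.042], which sums to 1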
Parameters:
labels: along the class dimension, each vector should be a valid probability distribution. For example, if labels has shape
[batch_size, num_classes], each labels[i] should be a probability distribution.
logits: the unnormalized activation for each class, typically the output of a linear layer; the function applies the softmax normalization internally, so do not pass probabilities.
axis: the dimension the classes live on; defaults to -1, i.e., the last dimension.
Returns: the softmax cross-entropy loss values.
Example:
labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
print(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))
>>> tf.Tensor([0.16984604 0.02474492], shape=(2,), dtype=float32)
Equivalent implementation:
print(-tf.reduce_sum(labels * tf.math.log(tf.nn.softmax(logits)), axis=1))
>>> tf.Tensor([0.16984606 0.02474495], shape=(2,), dtype=float32)
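For reference, the Keras loss can compute the same thing in one call via its from_logits flag (from_logits is a standard parameter of tf.keras.losses.categorical_crossentropy; when True, softmax is applied internally):

print(tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True))

This should print the same two loss values as above.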