Loss Functions [TensorFlow Notes - CH2.4]

Loss function (loss)

The loss function (loss) measures the gap between the predicted value (y) and the known answer (y_). A commonly used loss is the mean squared error (MSE):

MSE(y_, y) = (1/n) Σᵢ (yᵢ - yᵢ')²

where yᵢ is the true value of the i-th sample in a batch and yᵢ' is the NN's prediction.
Usage example:

y_true = tf.constant([0.5, 0.8])
y_pred = tf.constant([1.0, 1.0])
print(tf.keras.losses.MSE(y_true, y_pred))

Output:

>>> tf.Tensor(0.145, shape=(), dtype=float32)

Equivalent implementation:

print(tf.reduce_mean(tf.square(y_true - y_pred)))

Output:

>>> tf.Tensor(0.145, shape=(), dtype=float32)
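
The same loss is also available in class form as tf.keras.losses.MeanSquaredError, which is the form usually passed to model.compile. A minimal sketch reusing the tensors above (it should print the same 0.145):

mse = tf.keras.losses.MeanSquaredError()  # class form of the MSE loss
print(mse(y_true, y_pred))                # same result as tf.keras.losses.MSE above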

Example: predicting daily yogurt sales
Data collection: daily features x1, x2 and sales y_ (y_ = x1 + x2), with noise in the range -0.05 to 0.05; fit a function that can predict sales.
Feed this dataset into a single-layer neural network to predict daily yogurt sales.

import tensorflow as tf
import numpy as np

SEED = 23455  # random seed

rdm = np.random.RandomState(seed=SEED)  # random-number generator for values in [0, 1)
x = rdm.rand(32, 2)  # input features x: 32 rows x 2 columns of random numbers x1, x2 in [0, 1)
y_ = [[x1 + x2 + (rdm.rand() / 10.0 - 0.05)] for (x1, x2) in x]
# noise: [0, 1) / 10 = [0, 0.1); [0, 0.1) - 0.05 = [-0.05, 0.05)
x = tf.cast(x, dtype=tf.float32)

w1 = tf.Variable(tf.random.normal([2, 1], stddev=1, seed=1))
# randomly initialize the parameter w1 as a 2x1 matrix

Training part:

epoch = 15000  # number of passes over the dataset
lr = 0.002     # learning rate

# in the for loop, use a with structure to compute the forward-pass result y
for epoch in range(epoch):
    with tf.GradientTape() as tape:
        y = tf.matmul(x, w1)
        loss_mse = tf.reduce_mean(tf.square(y_ - y))  # mean squared error loss loss_mse
    grads = tape.gradient(loss_mse, w1)  # gradient of the loss w.r.t. the trainable parameter w1
    w1.assign_sub(lr * grads)  # update w1

    # print w1 every 500 epochs
    if epoch % 500 == 0:
        print("After %d training steps,w1 is " % (epoch))
        print(w1.numpy(), "\n")
print("Final w1 is: ", w1.numpy())

Output:

After 0 training steps,w1 is 
[[-0.8096241]
 [ 1.4855157]] 

After 500 training steps,w1 is 
[[-0.21934733]
 [ 1.6984866 ]] 

After 1000 training steps,w1 is 
[[0.0893971]
 [1.673225 ]] 

After 1500 training steps,w1 is 
[[0.28368822]
 [1.5853055 ]] 

After 2000 training steps,w1 is 
[[0.423243 ]
 [1.4906037]] 

After 2500 training steps,w1 is 
[[0.531055 ]
 [1.4053345]] 

After 3000 training steps,w1 is 
[[0.61725086]
 [1.332841  ]] 

After 3500 training steps,w1 is 
[[0.687201 ]
 [1.2725208]] 

After 4000 training steps,w1 is 
[[0.7443262]
 [1.2227542]] 

After 4500 training steps,w1 is 
[[0.7910986]
 [1.1818361]] 

After 5000 training steps,w1 is 
[[0.82943517]
 [1.1482395 ]] 

After 5500 training steps,w1 is 
[[0.860872 ]
 [1.1206709]] 

After 6000 training steps,w1 is 
[[0.88665503]
 [1.098054  ]] 

After 6500 training steps,w1 is 
[[0.90780276]
 [1.0795006 ]] 

After 7000 training steps,w1 is 
[[0.92514884]
 [1.0642821 ]] 

After 7500 training steps,w1 is 
[[0.93937725]
 [1.0517985 ]] 

After 8000 training steps,w1 is 
[[0.951048]
 [1.041559]] 

After 8500 training steps,w1 is 
[[0.96062106]
 [1.0331597 ]] 

After 9000 training steps,w1 is 
[[0.9684733]
 [1.0262702]] 

After 9500 training steps,w1 is 
[[0.97491425]
 [1.0206193 ]] 

After 10000 training steps,w1 is 
[[0.9801975]
 [1.0159837]] 

After 10500 training steps,w1 is 
[[0.9845312]
 [1.0121814]] 

After 11000 training steps,w1 is 
[[0.9880858]
 [1.0090628]] 

After 11500 training steps,w1 is 
[[0.99100184]
 [1.0065047 ]] 

After 12000 training steps,w1 is 
[[0.9933934]
 [1.0044063]] 

After 12500 training steps,w1 is 
[[0.9953551]
 [1.0026854]] 

After 13000 training steps,w1 is 
[[0.99696386]
 [1.0012728 ]] 

After 13500 training steps,w1 is 
[[0.9982835]
 [1.0001147]] 

After 14000 training steps,w1 is 
[[0.9993659]
 [0.999166 ]] 

After 14500 training steps,w1 is 
[[1.0002553 ]
 [0.99838644]] 

Final w1 is:  [[1.0009792]
 [0.9977485]]
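
Both components of w1 converge to roughly 1, so the network has essentially recovered the underlying rule y_ = x1 + x2. As a quick sanity check, a hypothetical new sample can be pushed through the trained weights (the input values here are made up for illustration):

x_new = tf.constant([[0.3, 0.6]], dtype=tf.float32)  # hypothetical sample with x1 + x2 = 0.9
print(tf.matmul(x_new, w1).numpy())  # should be close to 0.9, since w1 ≈ [[1.], [1.]]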

Cross-entropy loss function CE (Cross Entropy)

Cross entropy (Cross Entropy) measures the distance between two probability distributions: the larger the cross entropy, the farther apart the two distributions are; the smaller it is, the closer they are. It is a widely used loss function for classification problems.

H(y_, y) = -Σᵢ y_ᵢ · ln(yᵢ)

where y_ᵢ is the true value of the data and yᵢ is the neural network's prediction. For multi-class problems, the network's output is generally not a probability distribution, so a softmax layer is introduced to make the output follow one.
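
For example, tf.nn.softmax can be used to turn raw network outputs into a valid distribution. A small sketch (the logit values are chosen arbitrarily for illustration):

raw_out = tf.constant([4.0, 2.0, 1.0])  # raw network outputs (logits), not yet a distribution
prob = tf.nn.softmax(raw_out)           # normalize into a probability distribution
print(prob)                             # entries are positive
print(tf.reduce_sum(prob))              # and sum to 1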

tf.keras.losses.categorical_crossentropy
Purpose: computes the cross entropy.
Equivalent API: tf.losses.categorical_crossentropy
Example:

y_true = [1, 0, 0]
y_pred1 = [0.5, 0.4, 0.1]
y_pred2 = [0.8, 0.1, 0.1]
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred1))
print(tf.keras.losses.categorical_crossentropy(y_true, y_pred2))

Output:

>>> tf.Tensor(0.6931472, shape=(), dtype=float32)
tf.Tensor(0.22314353, shape=(), dtype=float32)

Equivalent implementation:

print(-tf.reduce_sum(y_true * tf.math.log(y_pred1)))
print(-tf.reduce_sum(y_true * tf.math.log(y_pred2)))

Output:

>>> tf.Tensor(0.6931472, shape=(), dtype=float32)
tf.Tensor(0.22314353, shape=(), dtype=float32)

Computing softmax and cross entropy together

tf.nn.softmax_cross_entropy_with_logits(labels, logits, axis=-1, name=None)

Purpose: applies softmax to the logits, then computes the cross entropy against the labels.
In machine learning, for multi-class problems, the vector of raw values that has not been normalized by softmax is called the logits. After the logits pass through a softmax layer, the output is a vector that follows a probability distribution.
Parameters:
labels: along the class dimension, each vector should be a valid probability distribution. For example, if labels has shape [batch_size, num_classes], then labels[i] should be a probability distribution.
logits: the activation value for each class, usually the output of a linear layer. The activations are normalized by softmax.
axis: the dimension of the classes; defaults to -1, i.e. the last dimension.
Returns: the softmax cross-entropy loss.
Example:

labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
logits = [[4.0, 2.0, 1.0], [0.0, 5.0, 1.0]]
print(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))
>>> tf.Tensor([0.16984604 0.02474492], shape=(2,), dtype=float32)

Equivalent implementation:

print(-tf.reduce_sum(labels * tf.math.log(tf.nn.softmax(logits)), axis=1))
>>> tf.Tensor([0.16984606 0.02474495], shape=(2,), dtype=float32)
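
The same values can also be obtained with the categorical_crossentropy API introduced earlier by passing from_logits=True, which applies the softmax internally and is generally more numerically stable than taking the log of a manually computed softmax. A sketch, assuming TF 2.x:

print(tf.keras.losses.categorical_crossentropy(labels, logits, from_logits=True))
# should print values matching the two results above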