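1. What is Loss
The loss (loss function) measures the gap between the network's actual output and the target, and provides the basis for updating the network so that its output moves closer to the target (via backpropagation).
Backpropagation essentially works backward from the loss: it computes a gradient for every learnable parameter, such as the convolution kernel weights, and those gradients tell the optimizer how to adjust each parameter.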
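To make the role of the gradient concrete, here is a minimal sketch. The layer size, the random input and the all-zero target are arbitrary assumptions chosen only for illustration; they do not come from these notes.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random numbers reproducible

# Hypothetical toy layer: 1 input channel, 1 output channel, 3x3 kernel
conv = nn.Conv2d(1, 1, kernel_size=3)

x = torch.randn(1, 1, 5, 5)       # one fake 5x5 single-channel image
target = torch.zeros(1, 1, 3, 3)  # arbitrary target matching the conv output shape

loss_fn = nn.L1Loss()
loss = loss_fn(conv(x), target)

grad_before = conv.weight.grad    # None: nothing has been backpropagated yet
print(grad_before)

loss.backward()                   # backpropagation fills in .grad for each parameter
print(conv.weight.grad.shape)     # same shape as the kernel itself
```

After backward(), the kernel's .grad has the same shape as the kernel, so there is one gradient value per weight.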
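1. L1Loss
L1Loss: the sum of the absolute differences between the actual output and the target, divided by the number of elements (the mean absolute error). Both tensors must be floating point. Passing reduction='sum' to the constructor returns the sum instead of the mean.
For example:
output = [1, 2, 3]
target = [1, 3, 5]
L1Loss = (0 + 1 + 2) / 3 = 1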
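import torch
from torch.nn import L1Loss

output = torch.tensor([1, 2, 3], dtype=torch.float32)
target = torch.tensor([1, 3, 5], dtype=torch.float32)

# Default reduction='mean': (0 + 1 + 2) / 3
loss = L1Loss()
result = loss(output, target)
print(result)  # tensor(1.)

# reduction='sum': 0 + 1 + 2
loss = L1Loss(reduction='sum')
result = loss(output, target)
print(result)  # tensor(3.)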
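As a cross-check of the arithmetic, the same numbers can be computed by hand with plain tensor operations. This is only a sketch to show what the two reduction modes do; the variable names are made up for the example.

```python
import torch
from torch.nn import L1Loss

output = torch.tensor([1, 2, 3], dtype=torch.float32)
target = torch.tensor([1, 3, 5], dtype=torch.float32)

abs_diff = (output - target).abs()  # element-wise |output - target|

manual_mean = abs_diff.mean()       # (0 + 1 + 2) / 3
manual_sum = abs_diff.sum()         # 0 + 1 + 2

builtin_mean = L1Loss()(output, target)
builtin_sum = L1Loss(reduction='sum')(output, target)

print(manual_mean, builtin_mean)
print(manual_sum, builtin_sum)
```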
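2. MSELoss
MSELoss: mean squared error, the mean of the squared differences between the output and the target. With the same output and target as above:
MSELoss = (0^2 + 1^2 + 2^2) / 3 = 5 / 3 ≈ 1.667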
from torch.nn import MSELoss

# Reuses output and target from the L1Loss example
loss_mse = MSELoss()
result_mse = loss_mse(output, target)
print(result_mse)  # tensor(1.6667)
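The hand calculation can be reproduced step by step. This is a small self-contained sketch; squared_diff and manual_mse are names introduced here for illustration.

```python
import torch
from torch.nn import MSELoss

output = torch.tensor([1, 2, 3], dtype=torch.float32)
target = torch.tensor([1, 3, 5], dtype=torch.float32)

squared_diff = (output - target) ** 2  # element-wise squared error: 0, 1, 4
manual_mse = squared_diff.mean()       # (0 + 1 + 4) / 3

builtin_mse = MSELoss()(output, target)
print(manual_mse, builtin_mse)
```

Because the differences are squared, the error of 2 contributes four times as much as the error of 1, so MSELoss penalizes large deviations more heavily than L1Loss.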
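3. CrossEntropyLoss
CrossEntropyLoss: cross entropy, mainly used for classification problems, especially with multiple classes. It has softmax built in, so it takes the raw network outputs directly. The formula is:
loss(x, class) = -x[class] + log(sum_j exp(x[j]))
Here exp(x) is e to the power x, and log is the natural logarithm, i.e. ln in mathematics.
For example, suppose there are three labels, person, dog and cat, with indices 0, 1, 2. A photo goes through the network and produces three scores [0.1, 0.7, 0.3], called the array x. These are raw scores, not probabilities (they do not sum to 1). The target is the index of the correct label, here 1 (dog), called class.
For this case, CrossEntropyLoss = -0.7 + log(exp(0.1) + exp(0.7) + exp(0.3)) = ln(e^0.1 + e^0.7 + e^0.3) - 0.7 ≈ 0.797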
import torch
from torch import nn

loss_cross = nn.CrossEntropyLoss()
output = torch.tensor([0.1, 0.7, 0.3])
print(output)
target = torch.tensor([1])
# Reshape to (batch size, number of classes): 1 sample, 3 class scores
output = torch.reshape(output, (1, 3))
print(output)
result_cross = loss_cross(output, target)
print(result_cross)  # tensor(0.7971)
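To see that the built-in loss really is the formula above, the sketch below computes the value three ways: directly from the formula, as log-softmax followed by negative log-likelihood, and with nn.CrossEntropyLoss. The variable names are illustrative only.

```python
import torch
from torch import nn
import torch.nn.functional as F

x = torch.tensor([[0.1, 0.7, 0.3]])  # shape (1, 3): one sample, three class scores
target = torch.tensor([1])           # index of the correct class (dog)

# Formula from the text: -x[class] + log(sum_j exp(x[j]))
manual = -x[0, 1] + torch.log(torch.exp(x[0]).sum())

# Same computation in two steps: log-softmax, then negative log-likelihood
two_step = F.nll_loss(F.log_softmax(x, dim=1), target)

builtin = nn.CrossEntropyLoss()(x, target)
print(manual, two_step, builtin)
```

All three values agree, which is why the network itself should not apply softmax before this loss.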
2. Adding cross entropy to the CIFAR10 network
import torch
from torch import nn
from torch.nn import Sequential, Conv2d, MaxPool2d, Linear, Flatten
import torchvision
from torch.utils.data import DataLoader


class Zilliax(nn.Module):
    def __init__(self):
        super(Zilliax, self).__init__()
        self.model = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10),
        )

    def forward(self, x):
        x = self.model(x)
        return x


dataset = torchvision.datasets.CIFAR10('E:\\PyCharm_Project\\Pytorch_2.3.1\\PytorchVision\\dataset', train=False,
                                       transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=1)
loss = nn.CrossEntropyLoss()
z = Zilliax()
for data in dataloader:
    imgs, targets = data
    outputs = z(imgs)
    # Check what kind of task this is: the outputs are raw linear scores, not probabilities
    print(outputs)
    print(targets)
    print("____________________")
    result = loss(outputs, targets)  # CrossEntropyLoss applies softmax internally
    print(result)
    print("____________________")
    # Set a breakpoint on the next line and run in debug mode; execution stops just before it.
    # Step over it, then in the debugger open z -> model -> Protected Attributes -> _modules -> weight -> grad
    # to watch the gradient change. These gradients are what gradient descent will use later.
    result.backward()
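The results are as follows. The cross-entropy loss is mostly around 2.4, close to ln(10) ≈ 2.30, the value a uniform guess over ten classes would give:
Locating the gradient with breakpoint debugging: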
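The same gradients can also be read without a debugger. The sketch below rebuilds the same layer stack so that it runs on its own, and makes two assumptions that are not in the notes: a random tensor stands in for one CIFAR10 image, and the class index 3 is an arbitrary target.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Same layer stack as Zilliax, rebuilt here so the snippet is self-contained
model = nn.Sequential(
    nn.Conv2d(3, 32, 5, padding=2), nn.MaxPool2d(2),
    nn.Conv2d(32, 32, 5, padding=2), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 5, padding=2), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(1024, 64),
    nn.Linear(64, 10),
)

imgs = torch.randn(1, 3, 32, 32)  # assumption: random stand-in for one CIFAR10 image
targets = torch.tensor([3])       # assumption: arbitrary class index in 0..9

loss = nn.CrossEntropyLoss()(model(imgs), targets)
loss.backward()

# Every learnable parameter now carries a gradient with the same shape as itself
for name, param in model.named_parameters():
    print(name, tuple(param.shape), param.grad.abs().mean().item())

first_conv_grad = model[0].weight.grad  # gradient of the first convolution kernel
```

Here model[0].weight.grad is the programmatic counterpart of the weight -> grad entry found in the debugger.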