yolov5 loss.py (v5.0): detailed walkthrough of the latest code, updated 2021-07-01

菊头蝙蝠

Last modified 2025-01-03 22:09:18

Categories: Computer Vision, yolov5 Source Code Analysis. Tags: pytorch, object detection

First published 2021-07-01 00:06:47

Article link: https://blog.csdn.net/qq_21539375/article/details/118345636

yolov5--loss.py

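I. Overview
- How training targets are generated
- Code flow
II. Code
III. Supplementary notes:
- 1. Why does multiplying a box's normalized image coordinates by the feature-map size give its coordinates on the feature map?
- 2. In YOLOv5, each target (outside border cells) is regressed during training from its own cell plus two adjacent cells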

I. Overview

Navigation: yolov5 v5.0 (latest) code walkthrough series

github ultralytics/yolov5
The yolov5 code used here is v5.0, as of June 23, 2021.

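This post is a set of study notes. I put considerable effort into it and tried to explain every key point.
If you spot problems or mistakes, feel free to discuss them.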

How training targets are generated

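Some of the underlying principles are covered in the supplementary notes at the end of the article. They are not placed first because they refer to details of the code that may be hard to follow on their own; after one pass through the code, the explanations should make more sense.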

Code flow

In train.py:

1. Before training, instantiate the loss class

 compute_loss = ComputeLoss(model)  # init loss class

2. During training, call the instance (its __call__ method) to get the loss

loss, loss_items = compute_loss(pred, targets.to(device))
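
The two steps above follow the usual "configure once, call per batch" pattern. As a minimal sketch of that pattern only (ToyLoss and its weight parameter are hypothetical stand-ins, not part of yolov5, and the squared error is just a placeholder for the real loss terms):

```python
class ToyLoss:
    """Hypothetical stand-in for ComputeLoss: set up once, then called per batch."""

    def __init__(self, weight):
        # One-time setup, done before the training loop starts.
        self.weight = weight

    def __call__(self, pred, target):
        # Return (total loss, individual components), the same shape of
        # result as the (loss, loss_items) pair used in train.py.
        err = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
        loss = self.weight * err
        return loss, (err,)


compute_loss = ToyLoss(weight=2.0)                       # step 1: instantiate
loss, loss_items = compute_loss([1.0, 2.0], [1.0, 4.0])  # step 2: call the instance
```

Calling the instance like a function is what dispatches to __call__, which is why train.py never names the method explicitly.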

II. Code

The __init__ part of ComputeLoss

class ComputeLoss:
    # Compute losses
    def __init__(self, model, autobalance=False):
        super(ComputeLoss, self).__init__()
        device = next(model.parameters()).device  # get model device
        h = model.hyp  # hyperparameters

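        '''
        The objectness (obj) loss naturally uses BCE loss (binary cross-entropy).
        YOLOv5 also uses BCE loss for classification: a sigmoid (not a softmax) is applied to each class,
        so every class is treated as its own binary classification task.
        nn.BCEWithLogitsLoss applies the sigmoid internally.
        '''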
        # Define criteria
        BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))
        BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['obj_pw']], device=device))

        '''
        Label smoothing. eps=0 means no smoothing, which gives the defaults cp=1, cn=0.
        Later on, positive classes are assigned cp and negative classes cn.
        '''
        # class label smoothing
        self.cp, self.cn = smooth_BCE(eps=h.get('label_smoothing', 0.0))  # positive, negative BCE targets
		
        '''
        If g is 0, focal loss is not used.
        '''
        # Focal loss
        g = h['fl_gamma']  # focal loss gamma
        if g > 0:
            BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)
		
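        '''
        balance sets the loss coefficient (applied to the objectness loss) for each of the three output feature maps.
        From left to right the values go from the largest feature map (small objects) to the smallest
        feature map (large objects), so small objects get the larger loss weight.
        The model does not have to use three feature maps; with four or five, the corresponding
        loss weights can be set in the same way.
        '''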
        det = model.module.model[-1] if is_parallel(model) else model.model[-1]  # Detect() module
        self.balance = {3: [4.0, 1.0, 0.4]}.get(det.nl, [4.0, 1.0, 0.25, 0.06, .02])  # P3-P7
        self.ssi = list(det.stride).index(16) if autobalance else 0  # stride 16 index
        self.BCEcls, self.BCEobj, self.gr, self.hyp, self.autobalance = BCEcls, BCEobj, model.gr, h, autobalance
        for k in 'na', 'nc', 'nl', 'anchors':
            setattr(self, k, getattr(det, k))
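
To make the criteria set up in __init__ more concrete, here is a small pure-Python sketch of the three ideas: label smoothing, per-class sigmoid BCE, and the focal-loss wrapper. The function names smooth_bce, bce_with_logits and focal_bce are illustrative, not the yolov5 API, and the gamma=1.5 and alpha=0.25 defaults are assumptions chosen for the example. It works on single numbers instead of tensors, so it only mirrors the arithmetic.

```python
import math


def smooth_bce(eps=0.1):
    # Smoothed BCE targets: (positive target cp, negative target cn).
    # eps=0 gives (1.0, 0.0), i.e. no smoothing.
    return 1.0 - 0.5 * eps, 0.5 * eps


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_with_logits(logit, target, pos_weight=1.0):
    # Binary cross-entropy on a raw logit; the sigmoid is applied inside,
    # which is the behaviour the text attributes to nn.BCEWithLogitsLoss.
    p = sigmoid(logit)
    return -(pos_weight * target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def focal_bce(logit, target, gamma=1.5, alpha=0.25):
    # Focal loss as a wrapper around BCE: well-classified examples
    # (p_t close to 1) are down-weighted by (1 - p_t) ** gamma.
    p = sigmoid(logit)
    p_t = target * p + (1.0 - target) * (1.0 - p)
    alpha_factor = target * alpha + (1.0 - target) * (1.0 - alpha)
    return bce_with_logits(logit, target) * alpha_factor * (1.0 - p_t) ** gamma


# One predicted box with 3 class logits; class 0 is the true class.
cp, cn = smooth_bce(eps=0.0)
logits = [2.0, -1.0, 0.5]
class_targets = [cp if c == 0 else cn for c in range(len(logits))]

# Each class is scored as an independent binary problem, then averaged.
cls_loss = sum(bce_with_logits(z, t) for z, t in zip(logits, class_targets)) / len(logits)
```

Because every class gets its own sigmoid, the per-class probabilities do not have to sum to 1, which is the practical difference from a softmax classifier.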

build_targets in ComputeLoss
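
Before reading the function, it may help to see its first steps on plain Python lists. The sketch below mirrors the tensor operations (repeat, cat, multiply by gain) under stated assumptions: two made-up targets, na = 3 anchors, and a hypothetical 80x80 feature map. Filling the xywh slots of gain with the grid size is my reading of what gain is used for later in the function; it is not shown in the part of the code quoted here.

```python
na = 3  # anchors per detection layer (assumed)

# Each row: (image index, class, x, y, w, h), with xywh normalized to [0, 1].
targets = [
    [0, 1, 0.50, 0.50, 0.20, 0.10],
    [1, 0, 0.25, 0.75, 0.40, 0.30],
]
nt = len(targets)

# ai[a][t] == a: the anchor index for every (anchor, target) pair, shape (na, nt).
ai = [[float(a)] * nt for a in range(na)]

# Replicate the targets once per anchor and append the anchor index:
# shape (na, nt, 7), columns i c x y w h ai.
rep = [[row + [ai[a][t]] for t, row in enumerate(targets)] for a in range(na)]

# gain starts as seven ones; the xywh slots are set to the grid size so that
# normalized coordinates become grid-cell coordinates on this feature map.
grid_w, grid_h = 80, 80  # hypothetical feature-map size
gain = [1, 1, grid_w, grid_h, grid_w, grid_h, 1]
scaled = [[[v * g for v, g in zip(row, gain)] for row in layer] for layer in rep]
```

With these numbers, a box centred at (0.5, 0.5) in the image lands at grid cell coordinates (40, 40) on the 80x80 map, and the same target appears three times, once per anchor index.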

    def build_targets(self, p, targets):
        # Build targets for compute_loss(), input targets(image,class,x,y,w,h)
        '''
        p:        List[torch.Tensor] with 3 entries, p[i].shape = (b, 3, h, w, nc+5)
        targets:  targets.shape = (nt, 6), 6 = i c x y w h; i is the index of the image in the batch,
                  c is the class, xywh are the box coordinates
        '''
        
        na, nt = self.na, targets.shape[0]  # number of anchors, targets
        tcls, tbox, indices, anch = [], [], [], []
        
        '''
        gain is used later to convert the normalized xywh in targets (na,nt,7) into grid coordinates on the feature map.
        The 7 columns are: i c x y w h ai
        '''
        gain = torch.ones(7, device=targets.device)  # normalized to gridspace gain    gain:(7)

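        '''
        Every one of the na anchors is trained, so the labels have to be replicated na times.
        ai holds the anchor index of each copy; by default its values are 0, 1, 2.
        '''
        ai = torch.arange(na, device=targets.device).float().view(na, 1).repeat(1, nt)  # same as .repeat_interleave(nt)   ai:(na,nt)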
        '''
        targets.repeat(na, 1, 1):  (na,nt,6)
        ai[:, :, None]:  (na,nt) --add a trailing dimension--> (na,nt,1)
        targets after cat:  (na,nt,7)    7: i c x y w h ai
        '''
        targets = torch.cat((targets.repeat(na, 1, 1), ai[:, :, None]), 2)  # append anchor indices

        g = 0.5  # bias
        off = torch.tensor([[0, 0],
                            [1, 0], [0, 1