Preface
Recently I needed to modify YOLOv5's inference outputs, which in turn requires a solid understanding of how the detection head computes its losses. For box regression in particular, several questions matter a great deal: is the box regression loss computed at the original image scale or at the feature-map scale, and how exactly are the regression offsets used in the loss? Answering these questions requires a thorough grasp of both the underlying idea and the source code. This post therefore records a walkthrough of YOLOv5's ComputeLoss source, covering how the offsets are applied, how the target and predicted boxes are used in the regression loss, and how positive/negative samples are selected.
I. Overview of the ComputeLoss class source code
YOLOv5's ComputeLoss class consists of two parts: an initialization function that sets up the parameters and loss criteria, and a __call__ function that carries out the whole loss computation. The self.build_targets function called inside __call__ is especially important and will be explained in detail later. An abridged version of the ComputeLoss class is shown below:
class ComputeLoss:
    # Compute losses
    def __init__(self, model, autobalance=False):
        self.sort_obj_iou = False
        device = next(model.parameters()).device  # get model device
        ...

    def __call__(self, p, targets):  # predictions, targets, model
        device = targets.device
        lcls, lbox, lobj = torch.zeros(1, device=device), torch.zeros(1, device=device), torch.zeros(1, device=device)
        tcls, tbox, indices, anchors = self.build_targets(p, targets)  # targets

        # Losses
        for i, pi in enumerate(p):  # layer index, layer predictions
            b, a, gj, gi = indices[i]  # image, anchor, gridy, gridx
            tobj = torch.zeros_like(pi[..., 0], device=device)  # target obj
            ...

        return (lbox + lobj + lcls) * bs, torch.cat((lbox, lobj, lcls)).detach()
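To make the calling pattern concrete, here is a minimal usage sketch. It assumes a YOLOv5 detection model named model, training-mode predictions preds (the list of per-layer output tensors) and a targets tensor of shape (n, 6) holding (image_id, class, x, y, w, h); the names preds and targets are mine, chosen only for illustration.

compute_loss = ComputeLoss(model)                # build the criteria once, after the model is created
loss, loss_items = compute_loss(preds, targets)  # total loss scaled by batch size, plus detached (lbox, lobj, lcls)
loss.backward()                                  # backpropagate inside a normal training step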
II. Reading the ComputeLoss initialization parameters
I will not dwell on the initialization function, since it is mostly simple assignments. What deserves attention are the two loss criteria self.BCEcls and self.BCEobj; the box loss, by contrast, is computed inside the __call__ function. The full initialization code is as follows:
def __init__(self, model, autobalance=False):
    self.sort_obj_iou = False
    device = next(model.parameters()).device  # get model device
    h = model.hyp  # hyperparameters

    # Define criteria
    BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))
    BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['obj_pw']], device=device))

    # Class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3
    self.cp, self.cn = smooth_BCE(eps=h.get('label_smoothing', 0.0))  # positive, negative BCE targets

    # Focal loss
    g = h['fl_gamma']  # focal loss gamma
    if g > 0:
        BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)

    det = de_parallel(model).model[-1]  # Detect() module
    self.balance = {3: [4.0, 1.0, 0.4]}.get(det.nl, [4.0, 1.0, 0.25, 0.06, 0.02])  # P3-P7
    self.ssi = list(det.stride).index(16) if autobalance else 0  # stride 16 index
    self.BCEcls, self.BCEobj, self.gr, self.hyp, self.autobalance = BCEcls, BCEobj, 1.0, h, autobalance
    for k in 'na', 'nc', 'nl', 'anchors':
        setattr(self, k, getattr(det, k))
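As a side note on the label-smoothing line above: smooth_BCE only converts the smoothing epsilon into the positive/negative BCE targets stored in self.cp and self.cn. A minimal sketch of that computation (shown here just to clarify the values; the real helper lives in YOLOv5's utils):

def smooth_BCE(eps=0.1):
    # return positive, negative label smoothing BCE targets
    # eps = 0 gives the usual hard targets (1.0, 0.0)
    return 1.0 - 0.5 * eps, 0.5 * eps

# e.g. eps = 0.1  ->  cp = 0.95, cn = 0.05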
III. Reading the self.build_targets function source code
The __call__ function receives the model's training-mode outputs p (see my earlier blog post if this is unclear) and the ground-truth targets. Keep in mind that targets comes straight from the label files: in particular, the box values are YOLOv5 label values, i.e. coordinates and sizes normalized by the original image width and height. With that established, let us look at build_targets itself.
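To make the normalization concrete, here is a small illustrative computation (the numbers are hypothetical): converting a pixel-space box into the normalized (x_center, y_center, w, h) values that appear in a YOLOv5 label file.

img_w, img_h = 640, 480              # hypothetical original image size
x1, y1, bw, bh = 100, 120, 200, 160  # hypothetical box: top-left corner and size in pixels
x_c, y_c = (x1 + bw / 2) / img_w, (y1 + bh / 2) / img_h  # center, divided by image width/height
w, h = bw / img_w, bh / img_h                            # size, divided by image width/height
print(x_c, y_c, w, h)                # 0.3125 0.4166... 0.3125 0.3333...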
1. The full annotated source
This part gives the complete function, with comments added in many places; skim it for now. The key pieces of the code are examined in detail afterwards.
def build_targets(self, p, targets):
    # Build targets for compute_loss(), input targets(image,class,x,y,w,h)
    na, nt = self.na, targets.shape[0]  # number of anchors per grid cell (3), number of labels in this batch
    tcls, tbox, indices, anch = [], [], [], []  # tcls: classes, tbox: boxes (x,y,w,h), indices: image/anchor/grid indices, anch: selected anchors
    gain = torch.ones(7, device=targets.device).long()  # normalized to gridspace gain
    ai = torch.arange(na, device=targets.device).float().view(na, 1).repeat(1, nt)  # same as .repeat_interleave(nt)
    targets = torch.cat((targets.repeat(na, 1, 1), ai[:, :, None]), 2)  # append anchor indices
    # targets is now [image_id, class, x, y, w, h, anchor_id]
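    # For example (illustrative shapes, not from the source): with na = 3 anchors and nt = 2 labels,
    # targets.repeat(na, 1, 1) has shape (3, 2, 6) and ai[:, :, None] has shape (3, 2, 1),
    # so the concatenated tensor has shape (3, 2, 7): every label is duplicated once per
    # anchor, with that anchor's index appended as the 7th column.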
    g = 0.5  # bias
    off = torch.tensor([[0, 0],
                        [1, 0], [0, 1], [-1, 0], [0, -1],  # j,k,l,m
                        # [1, 1], [1, -1], [-1, 1], [-1, -1],  # jk,jm,lk,lm
                        ], device=targets.device).float() * g  # offsets

    for i in range(self.nl):  # loop over the 3 detection layers
        anchors = self.anchors[i]
        gain[2:6] = torch.tensor(p[i].shape)[[3, 2, 3, 2]]  # xyxy gain: positions 3 and 2 of the shape are the feature-map w and h
        # Match targets to anchors
        t = targets * gain  # shape(3,n,7)