Losses used in practice for object detection box regression
● Smooth L1 Loss:
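For reference, a commonly used form of Smooth L1, applied to the difference x between a predicted coordinate and its target, can be written as below. The transition point at 1 is an assumption about which variant is meant; some frameworks expose it as a tunable parameter.

```latex
\mathrm{smooth}_{L_1}(x) =
\begin{cases}
0.5\,x^{2}, & |x| < 1 \\
|x| - 0.5, & \text{otherwise}
\end{cases}
```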
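● Drawbacks of L1, L2 and Smooth L1 as box-regression losses in object detection:
1) The coordinates are computed separately: x, y, w and h are each regressed on their own, as if they were four unrelated targets. The four components of a bbox should be treated as a whole, but they are handled independently.
2) Different predicted bboxes can have the same loss: because x, y, w and h are treated independently, different per-component losses yield different predicted boxes, yet when the total loss over the four components is the same, the loss gives no way to choose between those boxes.
These problems motivate the IoU-based losses below.
(If you are not familiar with L1, L2 and Smooth L1 Loss, see my article on them.)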
1. IoU Loss
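L1, L2 and Smooth L1 Loss compute a loss for each of the four bbox coordinates separately and then add them up, ignoring the correlation between the coordinates, whereas the evaluation metric, IoU, does capture that correlation. In the first row of the figure above, every example has the same L1 loss, yet the IoU of the third is clearly larger than that of the first, and the third detection also looks better than the first. The second row is similar: every example has the same L1 loss, but the IoU values differ.
For this reason, IoU Loss regresses the bbox formed by the four coordinates as a single unit.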
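The point that equal L1 losses can hide different IoU values can be checked with a small sketch. The helper names `box_iou` and `l1_loss` and the example boxes are made up for illustration, and boxes are assumed to be (x1, y1, x2, y2) corner tuples.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def l1_loss(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


gt = (2.0, 2.0, 6.0, 6.0)
pred_a = (3.0, 3.0, 6.0, 6.0)  # top-left corner shifted by 1 in x and y
pred_b = (2.0, 2.0, 8.0, 6.0)  # right edge shifted by 2

for name, pred in (("pred_a", pred_a), ("pred_b", pred_b)):
    print(name, "L1 =", l1_loss(pred, gt), "IoU =", round(box_iou(pred, gt), 4))
```

Both predictions have an L1 loss of 2, but their IoU with the ground truth differs (0.5625 versus about 0.6667), so the L1 loss alone cannot rank them.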
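● The computation is shown in the figure below:
● The algorithm proceeds as follows:
● Advantages of IoU Loss:
1) It reflects how well the predicted box matches the ground-truth box.
2) It is scale invariant, i.e. insensitive to the scale of the boxes. As a distance, 1 - IoU also satisfies non-negativity, identity of indiscernibles, symmetry and the triangle inequality.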
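Scale invariance can be illustrated with a short sketch: multiplying every coordinate of both boxes by the same factor leaves the IoU unchanged. The helpers `box_iou` and `scale_box` and the boxes are hypothetical, and boxes are assumed to be (x1, y1, x2, y2) tuples.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def scale_box(box, k):
    """Multiply every coordinate of the box by k."""
    return tuple(k * v for v in box)


gt = (2.0, 2.0, 6.0, 6.0)
pred = (3.0, 3.0, 7.0, 7.0)

iou_small = box_iou(pred, gt)
iou_large = box_iou(scale_box(pred, 10.0), scale_box(gt, 10.0))
print(iou_small, iou_large)  # the two values agree
```

An L1 or L2 loss on the same pair would grow with the scale factor, which is why large boxes tend to dominate those losses.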
● Problems with IoU Loss:
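One commonly cited problem is that IoU is 0 for any pair of boxes that do not overlap, however far apart they are, so the loss 1 - IoU stays at 1 and gives no signal about which way to move the prediction. A minimal sketch, with a hypothetical `box_iou` helper and made-up (x1, y1, x2, y2) boxes:

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


gt = (0.0, 0.0, 2.0, 2.0)
near = (3.0, 0.0, 5.0, 2.0)    # disjoint, 1 unit away from gt
far = (30.0, 0.0, 32.0, 2.0)   # disjoint, 28 units away from gt

loss_near = 1.0 - box_iou(near, gt)
loss_far = 1.0 - box_iou(far, gt)
print(loss_near, loss_far)  # identical losses for very different predictions
```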
Code implementation:
# IoU Loss
import torch

def iou(bboxes1, bboxes2):
    # bboxes1, bboxes2: (N, 4) tensors in (x1, y1, x2, y2) corner format,
    # paired row by row (row i of bboxes1 is compared with row i of bboxes2).
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # No boxes: return a zero loss so the return type is always a scalar.
        return torch.zeros((), dtype=bboxes1.dtype, device=bboxes1.device)
    area1 = (bboxes1[:, 2] - bboxes1[:, 0]) * (bboxes1[:, 3] - bboxes1[:, 1])
    area2 = (bboxes2[:, 2] - bboxes2[:, 0]) * (bboxes2[:, 3] - bboxes2[:, 1])
    # Bottom-right and top-left corners of the intersection rectangle.
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    # Clamp at 0 so that disjoint boxes get an empty intersection.
    inter = torch.clamp(inter_max_xy - inter_min_xy, min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    union = area1 + area2 - inter_area
    ious = inter_area / union
    ious = torch.clamp(ious, min=0, max=1.0)
    # IoU loss, summed over all box pairs.
    return torch.sum(1 - ious)
2. GIoU Loss (G: Generalized)
● GIoU formula:
● GIoU Loss formula:
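The two formulas can be written out as follows, matching the `giou` code in this section. The symbols are notation chosen here: A and B are the two boxes, C is the smallest box enclosing both, and |·| denotes area.

```latex
\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}, \qquad
\mathrm{GIoU} = \mathrm{IoU} - \frac{|C| - |A \cup B|}{|C|}, \qquad
\mathcal{L}_{\mathrm{GIoU}} = 1 - \mathrm{GIoU}
```

Since IoU lies in [0, 1] and the penalty term lies in [0, 1), GIoU lies in (-1, 1] and the loss in [0, 2).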
● Illustration and algorithm:
where:
Explanation of the algorithm:
Code implementation:
# GIoU Loss
import torch

def giou(bboxes1, bboxes2):
    # bboxes1, bboxes2: (N, 4) tensors in (x1, y1, x2, y2) corner format,
    # paired row by row.
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # No boxes: return a zero loss so the return type is always a scalar.
        return torch.zeros((), dtype=bboxes1.dtype, device=bboxes1.device)
    area1 = (bboxes1[:, 2] - bboxes1[:, 0]) * (bboxes1[:, 3] - bboxes1[:, 1])
    area2 = (bboxes2[:, 2] - bboxes2[:, 0]) * (bboxes2[:, 3] - bboxes2[:, 1])
    # Corners of the intersection rectangle.
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    # Corners of the smallest box enclosing both boxes.
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])
    inter = torch.clamp(inter_max_xy - inter_min_xy, min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    outer = torch.clamp(out_max_xy - out_min_xy, min=0)
    outer_area = outer[:, 0] * outer[:, 1]
    union = area1 + area2 - inter_area
    closure = outer_area
    # GIoU = IoU - (C - U) / C, with C the enclosing-box area and U the union area.
    gious = inter_area / union - (closure - union) / closure
    gious = torch.clamp(gious, min=-1.0, max=1.0)
    # GIoU loss, summed over all box pairs.
    return torch.sum(1 - gious)
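To see what the enclosing-box term buys, here is a small pure-Python sketch comparing two disjoint predictions, one near the ground truth and one far away. Their IoU is 0 in both cases, but GIoU tells them apart. The helper `box_giou` and the boxes are illustrative assumptions, with boxes as (x1, y1, x2, y2) tuples.

```python
def box_giou(a, b):
    """GIoU of two boxes given as (x1, y1, x2, y2) tuples."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Area of the smallest box enclosing both a and b.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (enclose - union) / enclose


gt = (0.0, 0.0, 2.0, 2.0)
near = (3.0, 0.0, 5.0, 2.0)
far = (30.0, 0.0, 32.0, 2.0)

giou_near = box_giou(near, gt)
giou_far = box_giou(far, gt)
print(1 - giou_near, 1 - giou_far)  # the far box gets the larger loss
```

Here the near box has GIoU -0.2 and the far box -0.875, so the loss grows as the prediction drifts away even though the boxes never overlap.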
3. DIoU Loss (D: Distance)
● DIoU formula:
● DIoU Loss formula:
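Written out in the notation used in this section (b and b^{gt} for the two box centers, ρ for the Euclidean distance, c for the diagonal of the smallest enclosing box), the two formulas take the following form. This is a reconstruction from those definitions and from the `diou` code, where `u = inter_diag / c_diag` is the distance penalty.

```latex
\mathrm{DIoU} = \mathrm{IoU} - \frac{\rho^{2}\!\left(b,\, b^{gt}\right)}{c^{2}}, \qquad
\mathcal{L}_{\mathrm{DIoU}} = 1 - \mathrm{DIoU}
```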
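Explanation of the formula:
Here b and b_gt denote the center points of the predicted box and the ground-truth box, ρ is the Euclidean distance between the two center points, and c is the diagonal length of the smallest enclosing box that contains both the predicted box and the ground-truth box.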
Code implementation:
# DIoU Loss
import torch

def diou(bboxes1, bboxes2):
    # This is from the official repository:
    # https://github.com/Zzh-tju/CIoU/blob/master/layers/modules/multibox_loss.py
    # bboxes1, bboxes2: (N, 4) tensors of raw (cx, cy, log w, log h) values,
    # paired row by row.
    bboxes1 = torch.sigmoid(bboxes1)  # make sure the input belongs to [0, 1]
    bboxes2 = torch.sigmoid(bboxes2)
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # No boxes: return a zero loss so the return type is always a scalar.
        return torch.zeros((), dtype=bboxes1.dtype, device=bboxes1.device)
    w1 = torch.exp(bboxes1[:, 2])  # width and height are assumed to be log-encoded;
    h1 = torch.exp(bboxes1[:, 3])  # skip the exp if your bboxes are not encoded
    w2 = torch.exp(bboxes2[:, 2])
    h2 = torch.exp(bboxes2[:, 3])
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = bboxes1[:, 0]
    center_y1 = bboxes1[:, 1]
    center_x2 = bboxes2[:, 0]
    center_y2 = bboxes2[:, 1]
    # Edges of the intersection rectangle.
    inter_l = torch.max(center_x1 - w1 / 2, center_x2 - w2 / 2)
    inter_r = torch.min(center_x1 + w1 / 2, center_x2 + w2 / 2)
    inter_t = torch.max(center_y1 - h1 / 2, center_y2 - h2 / 2)
    inter_b = torch.min(center_y1 + h1 / 2, center_y2 + h2 / 2)
    inter_area = torch.clamp(inter_r - inter_l, min=0) * torch.clamp(inter_b - inter_t, min=0)
    # Edges of the smallest box enclosing both boxes.
    c_l = torch.min(center_x1 - w1 / 2, center_x2 - w2 / 2)
    c_r = torch.max(center_x1 + w1 / 2, center_x2 + w2 / 2)
    c_t = torch.min(center_y1 - h1 / 2, center_y2 - h2 / 2)
    c_b = torch.max(center_y1 + h1 / 2, center_y2 + h2 / 2)
    # Squared distance between the centers and squared enclosing-box diagonal.
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    c_diag = torch.clamp(c_r - c_l, min=0) ** 2 + torch.clamp(c_b - c_t, min=0) ** 2
    union = area1 + area2 - inter_area
    u = inter_diag / c_diag
    iou = inter_area / union
    # DIoU = IoU - (squared center distance) / (squared enclosing diagonal).
    dious = iou - u
    dious = torch.clamp(dious, min=-1.0, max=1.0)
    # DIoU loss, summed over all box pairs.
    return torch.sum(1 - dious)
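The effect of the center-distance term shows up when the predicted box lies entirely inside the ground truth. IoU depends only on the area ratio there, while DIoU still prefers the prediction whose center is closer. A pure-Python sketch, using plain (x1, y1, x2, y2) tuples rather than the encoded input of `diou`; the helper `box_iou_diou` and the boxes are made up for illustration.

```python
def box_iou_diou(a, b):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2) tuples."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    # Squared distance between the two box centers.
    rho2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
         + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou, iou - rho2 / c2


gt = (0.0, 0.0, 4.0, 4.0)
centered = (1.0, 1.0, 3.0, 3.0)  # 2x2 box in the middle of gt
corner = (0.0, 0.0, 2.0, 2.0)    # 2x2 box in the top-left corner of gt

iou_c, diou_c = box_iou_diou(centered, gt)
iou_k, diou_k = box_iou_diou(corner, gt)
print(iou_c, diou_c)
print(iou_k, diou_k)
```

Both predictions have IoU 0.25, but DIoU is 0.25 for the centered box and 0.1875 for the corner box, so the DIoU loss pushes the prediction toward the ground-truth center.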
4. CIoU Loss
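CIoU (C: Complete) is usually defined as DIoU with an extra penalty on the aspect-ratio mismatch. The form below is the commonly cited definition, given here as a reference sketch; w, h and w^{gt}, h^{gt} denote the width and height of the predicted and ground-truth boxes, and the remaining symbols are as in the DIoU section.

```latex
\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^{2}\!\left(b,\, b^{gt}\right)}{c^{2}} - \alpha v, \qquad
\mathcal{L}_{\mathrm{CIoU}} = 1 - \mathrm{CIoU}

v = \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2}, \qquad
\alpha = \frac{v}{(1 - \mathrm{IoU}) + v}
```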
Code implementation:
# CIoU Loss
import math
import torch

def ciou(bboxes1, bboxes2):
    # Based on the official repository:
    # https://github.com/Zzh-tju/CIoU/blob/master/layers/modules/multibox_loss.py
    # bboxes1, bboxes2: (N, 4) tensors of raw (cx, cy, log w, log h) values,
    # paired row by row.
    bboxes1 = torch.sigmoid(bboxes1)  # make sure the input belongs to [0, 1]
    bboxes2 = torch.sigmoid(bboxes2)
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # No boxes: return a zero loss so the return type is always a scalar.
        return torch.zeros((), dtype=bboxes1.dtype, device=bboxes1.device)
    w1 = torch.exp(bboxes1[:, 2])  # width and height are assumed to be log-encoded;
    h1 = torch.exp(bboxes1[:, 3])  # skip the exp if your bboxes are not encoded
    w2 = torch.exp(bboxes2[:, 2])
    h2 = torch.exp(bboxes2[:, 3])
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = bboxes1[:, 0]
    center_y1 = bboxes1[:, 1]
    center_x2 = bboxes2[:, 0]
    center_y2 = bboxes2[:, 1]
    # Edges of the intersection rectangle.
    inter_l = torch.max(center_x1 - w1 / 2, center_x2 - w2 / 2)
    inter_r = torch.min(center_x1 + w1 / 2, center_x2 + w2 / 2)
    inter_t = torch.max(center_y1 - h1 / 2, center_y2 - h2 / 2)
    inter_b = torch.min(center_y1 + h1 / 2, center_y2 + h2 / 2)
    inter_area = torch.clamp(inter_r - inter_l, min=0) * torch.clamp(inter_b - inter_t, min=0)
    # Edges of the smallest box enclosing both boxes.
    c_l = torch.min(center_x1 - w1 / 2, center_x2 - w2 / 2)
    c_r = torch.max(center_x1 + w1 / 2, center_x2 + w2 / 2)
    c_t = torch.min(center_y1 - h1 / 2, center_y2 - h2 / 2)
    c_b = torch.max(center_y1 + h1 / 2, center_y2 + h2 / 2)
    # Squared distance between the centers and squared enclosing-box diagonal.
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    c_diag = torch.clamp(c_r - c_l, min=0) ** 2 + torch.clamp(c_b - c_t, min=0) ** 2
    union = area1 + area2 - inter_area
    u = inter_diag / c_diag
    iou = inter_area / union
    # Aspect-ratio term v of the usual CIoU definition.
    v = (4 / math.pi ** 2) * (torch.atan(w2 / h2) - torch.atan(w1 / h1)) ** 2
    with torch.no_grad():
        # alpha is treated as a constant weight; the 1e-7 is an added guard
        # against 0/0 when the two boxes coincide.
        alpha = v / (1 - iou + v + 1e-7)
    # CIoU = IoU - center-distance penalty - alpha * v.
    cious = iou - u - alpha * v
    cious = torch.clamp(cious, min=-1.0, max=1.0)
    # CIoU loss, summed over all box pairs.
    return torch.sum(1 - cious)
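What CIoU adds on top of DIoU is an aspect-ratio penalty, commonly written as alpha * v with v = (4 / pi^2) * (arctan(w_gt / h_gt) - arctan(w / h))^2 and alpha = v / ((1 - IoU) + v). The pure-Python sketch below only illustrates that term; the helper names, the `eps` guard and the example numbers are assumptions made here, not part of any official implementation.

```python
import math


def aspect_term(w, h, w_gt, h_gt):
    """Aspect-ratio consistency term v; 0 when both boxes have the same w/h ratio."""
    return (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2


def ciou_from_parts(iou, rho2, c2, v, eps=1e-7):
    """Combine IoU, the center-distance penalty rho2 / c2 and the aspect penalty."""
    alpha = v / (1 - iou + v + eps)  # eps avoids 0/0 for identical boxes
    return iou - rho2 / c2 - alpha * v


# Ground truth: 4x4 box centered at (2, 2). Prediction: 8x2 box with the same center.
# Intersection is 4x2 = 8, union is 16 + 16 - 8 = 24, so IoU = 1/3.
# The centers coincide (rho2 = 0) and the enclosing box is 8x4 (c2 = 64 + 16 = 80).
iou, rho2, c2 = 1 / 3, 0.0, 80.0

v_same = aspect_term(4.0, 4.0, 4.0, 4.0)
v_diff = aspect_term(8.0, 2.0, 4.0, 4.0)

diou_value = iou - rho2 / c2
ciou_value = ciou_from_parts(iou, rho2, c2, v_diff)
print(v_same, v_diff, diou_value, ciou_value)
```

With the centers aligned the DIoU penalty vanishes, so only the aspect-ratio term separates the CIoU value from the plain IoU in this example.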