[Object Detection] IoU, GIoU, DIoU, and CIoU Loss Explained, with Code Implementations

姚路遥遥

Category: Object Detection. Tags: deep learning, artificial intelligence, computer vision, neural networks

Article link: https://blog.csdn.net/Roaddd/article/details/114804756

Losses used in practical object detection regression tasks

● Smooth L1 Loss:

[Image: Smooth L1 Loss formula]
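Since the formula image is not reproduced here, the following is a minimal sketch of the commonly used Smooth L1 definition: quadratic for small residuals, linear for large ones. The function names and the `beta` switch point (set to 1.0) are assumptions made for illustration, not taken from the original figure.

```python
def smooth_l1(x, beta=1.0):
    """Smooth L1 on a scalar residual x.

    beta (an assumed parameter name) is the point where the loss
    switches from quadratic to linear.
    """
    ax = abs(x)
    if ax < beta:
        return 0.5 * ax * ax / beta
    return ax - 0.5 * beta


def smooth_l1_box_loss(pred, target, beta=1.0):
    """Sum of Smooth L1 over the four box coordinates, each treated independently."""
    return sum(smooth_l1(p - t, beta) for p, t in zip(pred, target))


print(smooth_l1(0.5), smooth_l1(2.0))
```

Note how `smooth_l1_box_loss` simply adds up four independent terms; this is the behavior criticized in the list that follows.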

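● Drawbacks of L1, L2, and Smooth L1 as regression losses for object detection:

1) Coordinates are computed separately: x, y, w, and h are regressed independently, as if they were four unrelated quantities. The four components of a bbox should be treated as a whole, but these losses handle them in isolation.
2) Different predicted boxes can have the same loss: because x, y, w, and h are handled independently, different per-coordinate errors produce different predicted boxes, yet when their summed loss is the same, the loss gives no way to tell which box is better.

These problems motivate the IoU-based losses below.

(If you are not familiar with L1, L2, and Smooth L1 Loss, see my earlier article on them.)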

1. IoU Loss

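[Image: two rows of example boxes with identical L1 loss but different IoU]
L1, L2, and Smooth L1 Loss compute a loss for each of the four bbox coordinates separately and then sum them, ignoring the correlation between the coordinates, whereas the evaluation metric, IoU, does capture it. In the first row of the figure above, every example has the same L1 loss, yet the IoU of the third is clearly larger than that of the first, and the third detection also looks better than the first. The second row is similar: all examples have the same L1 loss, but their IoU differs.
For this reason, IoU Loss regresses the bbox formed by the four coordinates as a single unit.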
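The point can be checked with a small numeric sketch. The boxes below are made-up values in (x1, y1, x2, y2) format, chosen only so that all three predictions have the same L1 distance to the ground truth; the helper names are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def l1_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


gt = (0, 0, 10, 10)
preds = [(2, 0, 12, 10), (0, 0, 10, 14), (1, 1, 11, 11)]
for p in preds:
    print(p, l1_distance(p, gt), round(box_iou(p, gt), 3))
```

All three predictions have an L1 distance of 4 to the ground truth, while their IoU values are roughly 0.667, 0.714, and 0.681.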

● The computation is illustrated in the figure below:

[Image: IoU computation]
● The algorithm is as follows:

[Image: IoU Loss algorithm]

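● Advantages of IoU Loss:

1) It directly reflects how well the predicted box matches the ground-truth box.
2) It is scale invariant, i.e. insensitive to the size of the boxes. As a distance, 1 - IoU also satisfies non-negativity, identity of indiscernibles, symmetry, and the triangle inequality.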
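Scale invariance and symmetry are easy to verify numerically. The sketch below uses arbitrary example boxes and a hypothetical `scale` helper; it rescales both boxes by the same factor and checks that the IoU does not change.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def scale(box, s):
    """Multiply every coordinate by s (uniform rescaling of the image)."""
    return tuple(s * v for v in box)


a = (1.0, 1.0, 4.0, 5.0)
b = (2.0, 2.0, 6.0, 6.0)
for s in (0.5, 1.0, 10.0):
    # the same value (6/22) is printed for every scale, up to rounding
    print(s, box_iou(scale(a, s), scale(b, s)))
```

An L1 or L2 loss on the same pairs would grow with `s`, which is why those losses need normalized or encoded coordinates while IoU does not.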

● Problems with IoU Loss:

[Image: problems with IoU Loss]
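The figure for this item is not reproduced, so the case below may not be the exact one it showed. It illustrates one well-known limitation: when two boxes do not overlap, IoU is 0 no matter how far apart they are, so the loss cannot tell a near miss from a distant one. The box values are made up for illustration.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


gt = (0, 0, 2, 2)
near = (3, 0, 5, 2)    # 1 unit to the right of gt
far = (30, 0, 32, 2)   # 28 units to the right of gt

for name, box in (("near", near), ("far", far)):
    print(name, "IoU =", box_iou(box, gt), "loss =", 1 - box_iou(box, gt))
```

Both predictions get IoU 0 and loss 1, so the loss carries no signal about which direction reduces the gap.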

Code implementation:

import torch

# IoU Loss
def iou(bboxes1, bboxes2):
    # bboxes1, bboxes2: (N, 4) tensors of paired boxes in (x1, y1, x2, y2) format;
    # row i of bboxes1 is compared with row i of bboxes2.
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # no boxes to compare: return a scalar zero loss
        return bboxes1.new_zeros(())
    area1 = (bboxes1[:, 2] - bboxes1[:, 0]) * (
        bboxes1[:, 3] - bboxes1[:, 1])
    area2 = (bboxes2[:, 2] - bboxes2[:, 0]) * (
        bboxes2[:, 3] - bboxes2[:, 1])

    # bottom-right and top-left corners of the intersection rectangle
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])

    # clamp at 0 so that disjoint boxes get zero intersection
    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    union = area1 + area2 - inter_area
    ious = inter_area / union
    ious = torch.clamp(ious, min=0, max=1.0)
    # IoU loss, summed over all box pairs
    return torch.sum(1 - ious)

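2. GIoU Loss (G: Generalized)

● GIoU formula:

$$\mathrm{GIoU} = \mathrm{IoU} - \frac{|C \setminus (A \cup B)|}{|C|}$$

● GIoU Loss formula:

$$L_{GIoU} = 1 - \mathrm{GIoU}$$
● Figure and algorithm:

[Image: GIoU illustration and algorithm]

where A and B are the two boxes being compared and C is the smallest box that encloses both.

Explanation of the algorithm:

[Image: explanation of the GIoU algorithm]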
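To see how the enclosing-box term behaves, here is a small pure-Python sketch of GIoU for a single pair of boxes in (x1, y1, x2, y2) format. The function name and the example boxes are illustrative assumptions. Unlike IoU, the value keeps decreasing as non-overlapping boxes move further apart.

```python
def box_giou(a, b):
    """GIoU = IoU - (C - U) / C, with C the area of the smallest enclosing box."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # smallest axis-aligned box containing both a and b
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c_area = cw * ch
    return inter / union - (c_area - union) / c_area


gt = (0, 0, 2, 2)
near = (3, 0, 5, 2)
far = (30, 0, 32, 2)
print(box_giou(gt, gt), box_giou(near, gt), box_giou(far, gt))
```

For these boxes the values are 1.0 for a perfect match, about -0.2 for the near miss, and -0.875 for the distant one, so 1 - GIoU still ranks the two disjoint predictions.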
Code implementation:

import torch

# GIoU Loss
def giou(bboxes1, bboxes2):
    # bboxes1, bboxes2: (N, 4) tensors of paired boxes in (x1, y1, x2, y2) format
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # no boxes to compare: return a scalar zero loss
        return bboxes1.new_zeros(())
    area1 = (bboxes1[:, 2] - bboxes1[:, 0]) * (
        bboxes1[:, 3] - bboxes1[:, 1])
    area2 = (bboxes2[:, 2] - bboxes2[:, 0]) * (
        bboxes2[:, 3] - bboxes2[:, 1])

    # corners of the intersection rectangle
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])

    # corners of the smallest enclosing box
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])

    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_area = outer[:, 0] * outer[:, 1]
    union = area1 + area2 - inter_area
    closure = outer_area

    # GIoU = IoU - (C - U) / C, which lies in [-1, 1]
    gious = inter_area / union - (closure - union) / closure
    gious = torch.clamp(gious, min=-1.0, max=1.0)
    # GIoU loss, summed over all box pairs
    return torch.sum(1 - gious)

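3. DIoU Loss (D: Distance)

● DIoU formula:

$$\mathrm{DIoU} = \mathrm{IoU} - \frac{\rho^2(b, b^{gt})}{c^2}$$
● DIoU Loss formula:

$$L_{DIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2}$$
Explanation of the formula:
Here $b$ and $b^{gt}$ denote the center points of the predicted box and the ground-truth box, $\rho$ is the Euclidean distance between the two centers, and $c$ is the diagonal length of the smallest enclosing box that contains both the predicted box and the ground-truth box.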
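As a worked example of the center-distance penalty, the sketch below evaluates DIoU for two predictions that lie entirely inside the ground-truth box and have the same size. The function names and box values are assumptions for illustration. Both predictions have the same IoU, but only the centered one avoids the distance penalty.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def box_diou(a, b):
    """DIoU = IoU - rho^2 / c^2 for boxes in (x1, y1, x2, y2) format."""
    # squared distance between the two box centers
    rho2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
         + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # squared diagonal of the smallest enclosing box
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw ** 2 + ch ** 2
    return box_iou(a, b) - rho2 / c2


gt = (0.0, 0.0, 4.0, 4.0)
centered = (1.0, 1.0, 3.0, 3.0)   # same center as gt
corner = (0.0, 0.0, 2.0, 2.0)     # pushed into the top-left corner

for name, box in (("centered", centered), ("corner", corner)):
    print(name, box_iou(box, gt), box_diou(box, gt))
```

Both boxes have IoU 0.25; DIoU stays at 0.25 for the centered box and drops to 0.1875 for the corner box, so the loss pulls the prediction toward the ground-truth center.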

Code implementation:

import torch

# DIoU Loss
def diou(bboxes1, bboxes2):
    # this is from the official implementation:
    # https://github.com/Zzh-tju/CIoU/blob/master/layers/modules/multibox_loss.py
    # Boxes here are (cx, cy, w, h), unlike the corner format used for IoU/GIoU above.
    bboxes1 = torch.sigmoid(bboxes1)        # make sure the input belongs to [0, 1]
    bboxes2 = torch.sigmoid(bboxes2)
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # no boxes to compare: return a scalar zero loss
        return bboxes1.new_zeros(())
    w1 = torch.exp(bboxes1[:, 2])       # this means this bbox has been encoded by log
    h1 = torch.exp(bboxes1[:, 3])       # you needn't do this if your bboxes are not encoded
    w2 = torch.exp(bboxes2[:, 2])
    h2 = torch.exp(bboxes2[:, 3])
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = bboxes1[:, 0]
    center_y1 = bboxes1[:, 1]
    center_x2 = bboxes2[:, 0]
    center_y2 = bboxes2[:, 1]

    # edges of the intersection rectangle
    inter_l = torch.max(center_x1 - w1 / 2, center_x2 - w2 / 2)
    inter_r = torch.min(center_x1 + w1 / 2, center_x2 + w2 / 2)
    inter_t = torch.max(center_y1 - h1 / 2, center_y2 - h2 / 2)
    inter_b = torch.min(center_y1 + h1 / 2, center_y2 + h2 / 2)
    inter_area = torch.clamp((inter_r - inter_l), min=0) * torch.clamp((inter_b - inter_t), min=0)

    # edges of the smallest enclosing box
    c_l = torch.min(center_x1 - w1 / 2, center_x2 - w2 / 2)
    c_r = torch.max(center_x1 + w1 / 2, center_x2 + w2 / 2)
    c_t = torch.min(center_y1 - h1 / 2, center_y2 - h2 / 2)
    c_b = torch.max(center_y1 + h1 / 2, center_y2 + h2 / 2)

    # squared center distance (rho^2) and squared enclosing-box diagonal (c^2)
    inter_diag = (center_x2 - center_x1)**2 + (center_y2 - center_y1)**2
    c_diag = torch.clamp((c_r - c_l), min=0)**2 + torch.clamp((c_b - c_t), min=0)**2

    union = area1 + area2 - inter_area
    u = inter_diag / c_diag
    iou = inter_area / union
    dious = iou - u
    dious = torch.clamp(dious, min=-1.0, max=1.0)
    # DIoU loss, summed over all box pairs
    return torch.sum(1 - dious)

4. CIoU Loss
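No formula is given for this section, so for reference here is the CIoU definition as it is usually stated, written with the same symbols as the DIoU formula ($\rho$, $c$, $b$, $b^{gt}$). The symbols $w$, $h$ (predicted width and height) and $w^{gt}$, $h^{gt}$ (ground-truth width and height) are added here and are not defined elsewhere in this article.

```latex
\mathrm{CIoU} = \mathrm{IoU} - \frac{\rho^2(b, b^{gt})}{c^2} - \alpha v

L_{CIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v

v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2,
\qquad
\alpha = \frac{v}{(1 - \mathrm{IoU}) + v}
```

Compared with DIoU, the extra term $\alpha v$ penalizes a mismatch in aspect ratio: $v$ measures how different the two aspect ratios are, and $\alpha$ weights that term more heavily as the IoU grows.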

Code implementation:

import math
import torch

# CIoU Loss
def ciou(bboxes1, bboxes2):
    # adapted from the official implementation:
    # https://github.com/Zzh-tju/CIoU/blob/master/layers/modules/multibox_loss.py
    # Boxes are (cx, cy, w, h); bboxes1 is the prediction, bboxes2 the ground truth.
    bboxes1 = torch.sigmoid(bboxes1)        # make sure the input belongs to [0, 1]
    bboxes2 = torch.sigmoid(bboxes2)
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    if rows * cols == 0:
        # no boxes to compare: return a scalar zero loss
        return bboxes1.new_zeros(())
    w1 = torch.exp(bboxes1[:, 2])       # this means this bbox has been encoded by log
    h1 = torch.exp(bboxes1[:, 3])       # you needn't do this if your bboxes are not encoded
    w2 = torch.exp(bboxes2[:, 2])
    h2 = torch.exp(bboxes2[:, 3])
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = bboxes1[:, 0]
    center_y1 = bboxes1[:, 1]
    center_x2 = bboxes2[:, 0]
    center_y2 = bboxes2[:, 1]

    # edges of the intersection rectangle
    inter_l = torch.max(center_x1 - w1 / 2, center_x2 - w2 / 2)
    inter_r = torch.min(center_x1 + w1 / 2, center_x2 + w2 / 2)
    inter_t = torch.max(center_y1 - h1 / 2, center_y2 - h2 / 2)
    inter_b = torch.min(center_y1 + h1 / 2, center_y2 + h2 / 2)
    inter_area = torch.clamp((inter_r - inter_l), min=0) * torch.clamp((inter_b - inter_t), min=0)

    # edges of the smallest enclosing box
    c_l = torch.min(center_x1 - w1 / 2, center_x2 - w2 / 2)
    c_r = torch.max(center_x1 + w1 / 2, center_x2 + w2 / 2)
    c_t = torch.min(center_y1 - h1 / 2, center_y2 - h2 / 2)
    c_b = torch.max(center_y1 + h1 / 2, center_y2 + h2 / 2)

    # squared center distance (rho^2) and squared enclosing-box diagonal (c^2)
    inter_diag = (center_x2 - center_x1)**2 + (center_y2 - center_y1)**2
    c_diag = torch.clamp((c_r - c_l), min=0)**2 + torch.clamp((c_b - c_t), min=0)**2

    union = area1 + area2 - inter_area
    u = inter_diag / c_diag
    iou = inter_area / union
    # aspect-ratio consistency term v and its trade-off weight alpha
    v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
    with torch.no_grad():
        # alpha is treated as a constant; the small epsilon avoids 0/0 for identical boxes
        alpha = v / (1 - iou + v + 1e-7)
    cious = iou - (u + alpha * v)
    cious = torch.clamp(cious, min=-1.0, max=1.0)
    # CIoU loss, summed over all box pairs
    return torch.sum(1 - cious)
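To show what the aspect-ratio term adds on top of DIoU, here is a small pure-Python sketch of CIoU for a single pair of boxes in (x1, y1, x2, y2) format. It assumes the usual definition, CIoU = IoU - rho^2/c^2 - alpha * v; the function name and the example boxes are made up for illustration.

```python
import math


def box_ciou(pred, gt):
    """CIoU for two boxes in (x1, y1, x2, y2) format; gt is the ground truth."""
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + wg * hg - inter)
    # squared center distance over squared enclosing-box diagonal (the DIoU term)
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2
    # aspect-ratio term; alpha is set to 0 when v is 0 to avoid 0/0
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    return iou - rho2 / c2 - alpha * v


gt = (0.0, 0.0, 4.0, 4.0)
square = (1.0, 1.0, 3.0, 3.0)   # 2 x 2, same aspect ratio as gt
wide = (0.0, 1.5, 4.0, 2.5)     # 4 x 1, same area and same center

print(box_ciou(square, gt), box_ciou(wide, gt))
```

Both predictions share the ground-truth center and have IoU 0.25, so their DIoU is identical. CIoU stays at 0.25 for the square box and is lower (about 0.234) for the wide box, because its aspect ratio differs from that of the ground truth.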