YOLOv11 Improvement Strategies [Loss Functions] | Improving Detection by Computing IoU with Auxiliary Bounding Boxes (Inner_GIoU, Inner_DIoU, Inner_CIoU, Inner_EIoU, Inner_SIoU)

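1. Background

  • Existing IoU-based bounding box regression methods mainly speed up convergence by adding new loss terms. They overlook the limitations of the IoU loss term itself, cannot adapt to different detectors and detection tasks, and therefore generalize poorly.
  • By analyzing the bounding box regression model, the Inner-IoU paper finds that distinguishing between regression samples and computing the loss with auxiliary bounding boxes of different scales effectively accelerates bounding box regression. For high-IoU samples, a smaller auxiliary box speeds up convergence, while a larger auxiliary box suits low-IoU samples.

This article replaces YOLOv11's default CIoU loss with inner_IoU, inner_GIoU, inner_DIoU, inner_CIoU, inner_EIoU, or inner_SIoU.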


2. Principle

Inner-IoU: More Effective Intersection over Union Loss with Auxiliary Bounding Box

2.1 How Inner-IoU is computed

  1. Notation:
    • The ground-truth (GT) box and the anchor box are denoted $B^{gt}$ and $B$.
    • The GT box and the inner GT box share the center $(x_{c}^{gt}, y_{c}^{gt})$; the anchor box and the inner anchor box share the center $(x_{c}, y_{c})$.
    • The GT box has width $w^{gt}$ and height $h^{gt}$; the anchor box has width $w$ and height $h$.
    • A scale factor $ratio$ controls the size of the auxiliary boxes.
  2. Compute the coordinates of the auxiliary bounding boxes:
    • $b_{l}^{gt} = x_{c}^{gt} - \frac{w^{gt} * ratio}{2}$, $b_{r}^{gt} = x_{c}^{gt} + \frac{w^{gt} * ratio}{2}$
    • $b_{t}^{gt} = y_{c}^{gt} - \frac{h^{gt} * ratio}{2}$, $b_{b}^{gt} = y_{c}^{gt} + \frac{h^{gt} * ratio}{2}$
    • $b_{l} = x_{c} - \frac{w * ratio}{2}$, $b_{r} = x_{c} + \frac{w * ratio}{2}$
    • $b_{t} = y_{c} - \frac{h * ratio}{2}$, $b_{b} = y_{c} + \frac{h * ratio}{2}$
  3. Compute the intersection over union:
    • $inter = (\min(b_{r}^{gt}, b_{r}) - \max(b_{l}^{gt}, b_{l})) * (\min(b_{b}^{gt}, b_{b}) - \max(b_{t}^{gt}, b_{t}))$
    • $union = (w^{gt} * h^{gt}) * (ratio)^{2} + (w * h) * (ratio)^{2} - inter$
    • $IoU^{inner} = \frac{inter}{union}$
  4. The Inner-IoU loss is: $L_{Inner-IoU} = 1 - IoU^{inner}$
  5. Applying Inner-IoU to existing IoU-based bounding box regression losses gives:
    • $L_{Inner-GIoU} = L_{GIoU} + IoU - IoU^{inner}$
    • $L_{Inner-DIoU} = L_{DIoU} + IoU - IoU^{inner}$
    • $L_{Inner-CIoU} = L_{CIoU} + IoU - IoU^{inner}$
    • $L_{Inner-EIoU} = L_{EIoU} + IoU - IoU^{inner}$
    • $L_{Inner-SIoU} = L_{SIoU} + IoU - IoU^{inner}$
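The formulas above can be checked by hand on a single pair of boxes. The following is a minimal pure-Python sketch, not the code used later in this article: the function name `inner_iou` and the example boxes are made up for illustration, and the overlap is clamped at zero (as the PyTorch code in Section 3 does) so that disjoint boxes give an IoU of 0.

```python
def inner_iou(gt, anchor, ratio=1.0):
    """Inner-IoU of two boxes given as (x_c, y_c, w, h)."""
    (xg, yg, wg, hg), (x, y, w, h) = gt, anchor

    # Auxiliary boxes: same centers, width and height scaled by ratio
    g_left, g_right = xg - wg * ratio / 2, xg + wg * ratio / 2
    g_top, g_bottom = yg - hg * ratio / 2, yg + hg * ratio / 2
    a_left, a_right = x - w * ratio / 2, x + w * ratio / 2
    a_top, a_bottom = y - h * ratio / 2, y + h * ratio / 2

    # Overlap of the auxiliary boxes, clamped at zero for disjoint boxes
    inter_w = max(0.0, min(g_right, a_right) - max(g_left, a_left))
    inter_h = max(0.0, min(g_bottom, a_bottom) - max(g_top, a_top))
    inter = inter_w * inter_h

    union = wg * hg * ratio ** 2 + w * h * ratio ** 2 - inter
    return inter / union


gt_box = (50.0, 50.0, 40.0, 40.0)      # hypothetical GT box
anchor_box = (60.0, 50.0, 40.0, 40.0)  # same size, shifted 10 px to the right
for r in (0.8, 1.0, 1.2):
    print(f"ratio={r}: IoU_inner={inner_iou(gt_box, anchor_box, r):.4f}")
```

With ratio = 1 the auxiliary boxes coincide with the original boxes, so the result is the ordinary IoU (0.6 for this pair). A smaller ratio lowers the value for the same offset and a larger ratio raises it.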


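According to the paper, the scale factor ratio in the Inner-IoU loss is usually tuned within the range [0.5, 1.5].

For high-IoU samples, setting the scale factor below 1 computes the loss on a smaller auxiliary bounding box and speeds up regression. In the paper's simulation experiments, for example, ratio is set to 0.8 to accelerate the regression of high-IoU samples.

For low-IoU samples, setting the scale factor above 1 computes the loss on a larger auxiliary bounding box and speeds up regression. In the low-IoU scenario of the simulation experiments, for example, ratio is set to 1.2.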
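To make the two settings concrete, here is a small sketch of how the auxiliary box changes with ratio. The helper name `auxiliary_box` and the example box are assumptions for illustration only; the scaling itself follows the coordinate formulas in Section 2.1.

```python
def auxiliary_box(xc, yc, w, h, ratio):
    """Return (left, top, right, bottom) of the box scaled by ratio about its center."""
    half_w, half_h = w * ratio / 2, h * ratio / 2
    return (xc - half_w, yc - half_h, xc + half_w, yc + half_h)


gt = (50.0, 50.0, 40.0, 20.0)  # hypothetical GT box as (x_c, y_c, w, h)
small = auxiliary_box(*gt, ratio=0.8)   # smaller box, the setting used for high-IoU samples
same = auxiliary_box(*gt, ratio=1.0)    # identical to the original box
large = auxiliary_box(*gt, ratio=1.2)   # larger box, the setting used for low-IoU samples
print("ratio 0.8:", small)
print("ratio 1.0:", same)
print("ratio 1.2:", large)
```

The center stays fixed in all three cases; only the width and height change, from 32 x 16 at ratio 0.8 to 48 x 24 at ratio 1.2.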

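2.2 Advantages

  • Compared with the IoU loss, when ratio is less than 1 the auxiliary bounding box is smaller than the actual box. The effective regression range is then smaller than that of the IoU loss, but the absolute value of the gradient is larger, which accelerates the convergence of high-IoU samples.
  • When ratio is greater than 1, the larger auxiliary bounding box widens the effective regression range and improves regression for low-IoU samples.
  • Through a series of simulation and comparison experiments, the paper reports that the method outperforms existing methods in detection performance and generalization, and achieves good results on datasets with different pixel sizes.
  • It works not only for general detection tasks but also for tasks with very small targets, which supports the method's generalization.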
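The first two points can be illustrated numerically. The sketch below is a simplified setting chosen for this article, not an experiment from the paper: two equal squares of side 40 are shifted along x by `offset`, in which case the Inner-IoU reduces to (s - d) / (s + d) with s = 40 * ratio and d = offset. The gradient is estimated with a central finite difference; all function names are hypothetical.

```python
def inner_iou(offset, size=40.0, ratio=1.0):
    """Inner-IoU of two equal squares whose centers are `offset` apart along x."""
    side = size * ratio                           # side of the auxiliary boxes
    inter = max(0.0, side - abs(offset)) * side   # overlap width times full height
    union = 2 * side * side - inter
    return inter / union


def grad_wrt_offset(offset, ratio, step=1e-4):
    """Central finite-difference estimate of d(IoU_inner)/d(offset)."""
    upper = inner_iou(offset + step, ratio=ratio)
    lower = inner_iou(offset - step, ratio=ratio)
    return (upper - lower) / (2 * step)


for ratio in (0.8, 1.0, 1.2):
    grad = abs(grad_wrt_offset(10.0, ratio))      # high-IoU case: boxes overlap heavily
    far = inner_iou(45.0, ratio=ratio)            # low-IoU case: original boxes are disjoint
    print(f"ratio={ratio}: |grad| at offset 10 = {grad:.4f}, IoU_inner at offset 45 = {far:.4f}")
```

In this setting the gradient magnitude at offset 10 is largest for ratio 0.8 and smallest for ratio 1.2. At offset 45 the original boxes no longer overlap, so the IoU is 0 for ratio 0.8 and 1.0, while the enlarged boxes at ratio 1.2 still overlap and give a non-zero value.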

Paper: https://arxiv.org/abs/2311.02877
Source code: https://github.com/malagoutou/Inner-IoU

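3. Steps to add

3.1 utils\metrics.py

The file to look at here is ultralytics/utils/metrics.py.

metrics.py defines the IoU computations (such as bbox_iou) that the loss functions call, so adding a new IoU variant only requires putting the code into this file.

Add the Inner-IoU code to metrics.py as follows: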

import math  # used by bbox_inner_iou below; metrics.py normally imports both already

import torch


def get_inner_iou(box1, box2, xywh=True, eps=1e-7, ratio=0.7):
    """Return the IoU of the auxiliary boxes: same centers as box1/box2, width and height scaled by ratio."""
    if xywh:  # boxes given as (center x, center y, width, height)
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
    else:  # boxes given as (x1, y1, x2, y2): recover sizes and centers
        b1_x1, b1_y1, b1_x2, b1_y2 = box1.chunk(4, -1)
        b2_x1, b2_y1, b2_x2, b2_y2 = box2.chunk(4, -1)
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps
        x1, y1 = (b1_x1 + b1_x2) / 2, (b1_y1 + b1_y2) / 2
        x2, y2 = (b2_x1 + b2_x2) / 2, (b2_y1 + b2_y2) / 2

    # Auxiliary (inner) boxes: half extents scaled by ratio around the original centers
    w1_, h1_, w2_, h2_ = w1 * ratio / 2, h1 * ratio / 2, w2 * ratio / 2, h2 * ratio / 2
    inner_b1_x1, inner_b1_x2, inner_b1_y1, inner_b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
    inner_b2_x1, inner_b2_x2, inner_b2_y1, inner_b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_

    # Intersection area of the auxiliary boxes
    inter = (inner_b1_x2.minimum(inner_b2_x2) - inner_b1_x1.maximum(inner_b2_x1)).clamp_(0) * \
            (inner_b1_y2.minimum(inner_b2_y2) - inner_b1_y1.maximum(inner_b2_y1)).clamp_(0)

    # Union area of the auxiliary boxes
    union = w1 * h1 * ratio * ratio + w2 * h2 * ratio * ratio - inter + eps
    return inter / union
 
def bbox_inner_iou(box1, box2, xywh=True, GIoU=False, DIoU=False, CIoU=False, EIoU=False, SIoU=False, eps=1e-7, ratio=0.7):
    """
    Calculate the Inner-IoU (IoU of ratio-scaled auxiliary boxes) of box1(1, 4) to box2(n, 4),
    optionally combined with the GIoU, DIoU, CIoU, EIoU or SIoU penalty term.
    Args:
        box1 (torch.Tensor): A tensor representing a single bounding box with shape (1, 4).
        box2 (torch.Tensor): A tensor representing n bounding boxes with shape (n, 4).
        xywh (bool, optional): If True, input boxes are in (x, y, w, h) format. If False, input boxes are in
                               (x1, y1, x2, y2) format. Defaults to True.
        GIoU (bool, optional): If True, calculate Generalized IoU. Defaults to False.
        DIoU (bool, optional): If True, calculate Distance IoU. Defaults to False.
        CIoU (bool, optional): If True, calculate Complete IoU. Defaults to False.
        EIoU (bool, optional): If True, calculate Efficient IoU. Defaults to False.
        SIoU (bool, optional): If True, calculate Scylla IoU. Defaults to False.
        eps (float, optional): A small value to avoid division by zero. Defaults to 1e-7.
        ratio (float, optional): Scale factor of the auxiliary boxes. Defaults to 0.7.
    Returns:
        (torch.Tensor): Inner-IoU, Inner-GIoU, Inner-DIoU, Inner-CIoU, Inner-EIoU or Inner-SIoU values,
                        depending on the specified flags.
    """
 
    # Get the coordinates of bounding boxes
    if xywh:  # transform from xywh to xyxy
        (x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
        w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
        b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
        b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
    else:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1.chunk(4, -1)
        b2_x1, b2_y1, b2_x2, b2_y2 = box2.chunk(4, -1)
        w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1 + eps
        w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1 + eps
 
    # IoU of the ratio-scaled auxiliary boxes; replaces the plain IoU in every return value below
    inner_iou = get_inner_iou(box1, box2, xywh=xywh, eps=eps, ratio=ratio)

    # Intersection area
    inter = (b1_x2.minimum(b2_x2) - b1_x1.maximum(b2_x1)).clamp_(0) * \
            (b1_y2.minimum(b2_y2) - b1_y1.maximum(b2_y1)).clamp_(0)

    # Union Area
    union = w1 * h1 + w2 * h2 - inter + eps

    # IoU of the original boxes (used below for the CIoU weight alpha)
    iou = inter / union
    if CIoU or DIoU or GIoU or EIoU or SIoU:
        cw = b1_x2.maximum(b2_x2) - b1_x1.minimum(b2_x1)  # convex (smallest enclosing box) width
        ch = b1_y2.maximum(b2_y2) - b1_y1.minimum(b2_y1)  # convex height
        if CIoU or DIoU or EIoU or SIoU:  # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
            c2 = cw ** 2 + ch ** 2 + eps  # convex diagonal squared
            rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 + (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4  # center dist ** 2
            if CIoU:  # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
                v = (4 / math.pi ** 2) * (torch.atan(w2 / h2) - torch.atan(w1 / h1)).pow(2)
                with torch.no_grad():
                    alpha = v / (v - iou + (1 + eps))
                return inner_iou - (rho2 / c2 + v * alpha)  # Inner-CIoU
            elif EIoU:
                rho_w2 = ((b2_x2 - b2_x1) - (b1_x2 - b1_x1)) ** 2  # squared width difference
                rho_h2 = ((b2_y2 - b2_y1) - (b1_y2 - b1_y1)) ** 2  # squared height difference
                cw2 = cw ** 2 + eps
                ch2 = ch ** 2 + eps
                return inner_iou - (rho2 / c2 + rho_w2 / cw2 + rho_h2 / ch2)  # Inner-EIoU
            elif SIoU:
                # SIoU Loss https://arxiv.org/pdf/2205.12740.pdf
                s_cw = (b2_x1 + b2_x2 - b1_x1 - b1_x2) * 0.5 + eps  # center offset along x
                s_ch = (b2_y1 + b2_y2 - b1_y1 - b1_y2) * 0.5 + eps  # center offset along y
                sigma = torch.pow(s_cw ** 2 + s_ch ** 2, 0.5)
                sin_alpha_1 = torch.abs(s_cw) / sigma
                sin_alpha_2 = torch.abs(s_ch) / sigma
                threshold = pow(2, 0.5) / 2
                sin_alpha = torch.where(sin_alpha_1 > threshold, sin_alpha_2, sin_alpha_1)
                angle_cost = torch.cos(torch.arcsin(sin_alpha) * 2 - math.pi / 2)
                rho_x = (s_cw / cw) ** 2
                rho_y = (s_ch / ch) ** 2
                gamma = angle_cost - 2
                distance_cost = 2 - torch.exp(gamma * rho_x) - torch.exp(gamma * rho_y)
                omiga_w = torch.abs(w1 - w2) / torch.max(w1, w2)
                omiga_h = torch.abs(h1 - h2) / torch.max(h1, h2)
                shape_cost = torch.pow(1 - torch.exp(-1 * omiga_w), 4) + torch.pow(1 - torch.exp(-1 * omiga_h), 4)
                return inner_iou - 0.5 * (distance_cost + shape_cost) + eps  # Inner-SIoU
            return inner_iou - rho2 / c2  # Inner-DIoU
        c_area = cw * ch + eps  # convex area
        return inner_iou - (c_area - union) / c_area  # Inner-GIoU https://arxiv.org/pdf/1902.09630.pdf
    return inner_iou  # Inner-IoU
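The value returned by bbox_inner_iou can be tied back to the loss formulas in Section 2.1. The short derivation below takes the CIoU case and assumes that the caller forms the loss as 1 minus the returned value. The symbol $\mathcal{R}_{CIoU}$ is shorthand introduced here for the CIoU penalty term; it does not appear in the paper's notation as quoted above.

```latex
\begin{aligned}
L_{CIoU} &= 1 - IoU + \mathcal{R}_{CIoU}, \qquad \mathcal{R}_{CIoU} = \frac{\rho^{2}}{c^{2}} + \alpha v \\
L_{Inner-CIoU} &= L_{CIoU} + IoU - IoU^{inner} \\
               &= 1 - IoU^{inner} + \mathcal{R}_{CIoU} \\
               &= 1 - \left( IoU^{inner} - \mathcal{R}_{CIoU} \right)
\end{aligned}
```

The bracketed quantity in the last line is what the CIoU branch returns, so swapping the plain IoU for the Inner-IoU in the return value is enough. The same argument applies to the GIoU, DIoU, EIoU and SIoU branches with their own penalty terms.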



3.2 Modify ultralytics/utils/loss.py

utils\loss.py computes the various losses.

In ultralytics/utils/loss.py, add bbox_inner_iou to the imports, then change the following line inside BboxLoss so that the model calls bbox_inner_iou.


3.2.1 Inner_CIoU


iou = bbox_inner_iou(pred_bboxes[fg_mask], target_bboxes[fg_mask], xywh=False, CIoU=True)


3.2.2 Inner_GIoU


iou = bbox_inner_iou(pred_bboxes[fg_mask], target_bboxes[fg_mask], xywh=False, GIoU=True)

3.2.3 Inner_DIoU


iou = bbox_inner_iou(pred_bboxes[fg_mask], target_bboxes[fg_mask], xywh=False, DIoU=True)

3.2.4 Inner_EIoU


iou = bbox_inner_iou(pred_bboxes[fg_mask], target_bboxes[fg_mask], xywh=False, EIoU=True)

3.2.5 Inner_SIoU


iou = bbox_inner_iou(pred_bboxes[fg_mask], target_bboxes[fg_mask], xywh=False, SIoU=True)
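Whichever variant is chosen, the per-box values still have to be turned into a scalar loss. The sketch below shows that step in plain Python, assuming BboxLoss reduces them as a weighted sum of (1 - iou) divided by target_scores_sum, which is how recent Ultralytics releases appear to do it. The function name `weighted_iou_loss` and all numbers are made up for illustration; check your installed loss.py for the actual reduction.

```python
def weighted_iou_loss(iou_values, weights, target_scores_sum):
    """Sum of (1 - iou) * weight over the positive boxes, normalized by target_scores_sum."""
    total = sum((1.0 - iou) * weight for iou, weight in zip(iou_values, weights))
    return total / target_scores_sum


# Hypothetical per-box values standing in for the output of bbox_inner_iou
iou_values = [0.9, 0.5, 0.1]
# Hypothetical per-box weights standing in for the summed target scores
weights = [1.0, 0.5, 0.25]

loss_iou = weighted_iou_loss(iou_values, weights, target_scores_sum=sum(weights))
print(f"loss_iou = {loss_iou:.4f}")
```

Because the loss is 1 minus the returned value, a box with a low Inner-IoU and a high weight contributes the most to the total.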

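3.3 Modify ultralytics/utils/tal.py

tal.py contains the task-aligned assigner, which also relies on the IoU computation.

In ultralytics/utils/tal.py, add bbox_inner_iou to the imports, then change the code inside the iou_calculation function so that the model calls bbox_inner_iou.

Only Inner_CIoU is shown here as an example: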

[Figure: screenshots of the tal.py changes]

4. Screenshot of a successful run

[Figure: screenshot of the successful run]

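5. Summary

To address the weak generalization and slow convergence of existing IoU losses across detection tasks, Inner-IoU introduces a scale factor ratio that controls the size of the auxiliary bounding boxes. Computing the loss on auxiliary boxes of different scales accelerates bounding box regression.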
