IoU, GIoU, and DIoU: Introduction and Code Implementation

leonardohaig

Categories: Deep Learning, Python

Article link: https://blog.csdn.net/leonardohaig/article/details/103394369

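WeChat official account: 幼儿园的学霸
Personal study notes on OpenCV, machine learning, and more. For questions or suggestions, please leave a message on the official account.

From IoU to GIoU, and then on to the more recent DIoU and CIoU.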

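IoU

Introduction

IoU stands for Intersection over Union, and the name already suggests how it is computed: IoU is the ratio of the intersection to the union of the predicted bounding box A and the ground-truth bounding box B. The computation is:
$$\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}$$
Figure 1: IoU computation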
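
As a minimal sketch of the computation above for a single pair of boxes, the ratio can be written in plain Python without NumPy. The function name `iou` and the sample boxes are illustrative choices, and the boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples with positive area.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Width and height of the overlap; clamped at 0 when the boxes are disjoint.
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union


# Two 10x10 boxes overlapping in a 5x5 corner: 25 / 175.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Unlike the batched NumPy version given later in this post, this sketch does not floor the result at a small epsilon, so disjoint boxes return exactly 0.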

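Advantages of IoU:

IoU can serve as a distance, with loss = 1 - IoU. However, when the two boxes do not intersect, no gradient is propagated back.
IoU is scale-invariant: it is unaffected by the absolute sizes of the two boxes.
Take the case where boxes A and B coincide: if boxes1=[[0,0,10,10],[0,0,5,5]] and boxes2=[[0,0,10,10],[0,0,5,5]], then IOU=[1,1].
Drawbacks of IoU:
IoU cannot tell whether two boxes are adjacent or far apart.
As Figure 2 shows, IoU is 0 in both cases. In (a) the two boxes are close and in (b) they are clearly farther apart, yet the IoU value alone cannot tell the two cases apart (and when the boxes do not intersect, no gradient flows back).
IoU does not reflect how the two boxes overlap (the manner of intersection).
As Figure 3 shows, IoU is 0.1428 in both cases. The boxes in (a) overlap in a more aligned way than those in (b), but IoU does not capture this.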
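
The points above can be checked numerically. The following is a small sketch in plain Python; the helper `iou` and all box coordinates are made up for illustration and are not the boxes drawn in Figures 2 and 3.

```python
def iou(a, b):
    """IoU of two (xmin, ymin, xmax, ymax) boxes with positive area."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


# Scale invariance: multiplying every coordinate by 10 leaves IoU unchanged.
small = iou((0, 0, 10, 10), (5, 5, 15, 15))
large = iou((0, 0, 100, 100), (50, 50, 150, 150))

# Disjoint boxes: a gap of 1 and a gap of 90 both give IoU = 0.
near = iou((0, 0, 10, 10), (11, 0, 21, 10))
far = iou((0, 0, 10, 10), (100, 0, 110, 10))

# Same IoU, different overlap pattern: a corner overlap and a side overlap,
# each with an intersection area of 25 out of a union of 175.
corner = iou((0, 0, 10, 10), (5, 5, 15, 15))
side = iou((0, 0, 10, 10), (7.5, 0, 17.5, 10))

print(small, large)
print(near, far)
print(corner, side)
```

In each pair the two values agree, which is exactly why IoU alone cannot rank the disjoint cases by distance or the overlapping cases by alignment.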

Code

import numpy as np

def bboxes_iou(boxes1,boxes2):
    '''
    Compute the IoU of two boxes or of two batches of boxes.
    Examples:
        (1)
            boxes1 = np.asarray([[0,0,5,5],[0,0,10,10],[0,0,10,10]])
            boxes2 = np.asarray([[0,0,5,5]])
            result is [1.   0.25 0.25]
        (2)
            boxes1 = np.asarray([[0,0,5,5],[0,0,10,10],[0,0,10,10]])
            boxes2 = np.asarray([[0,0,5,5],[0,0,10,10],[0,0,10,10]])
            result is [1. 1. 1.]
    :param boxes1: [xmin,ymin,xmax,ymax] or
                [[xmin,ymin,xmax,ymax],[xmin,ymin,xmax,ymax],...]
    :param boxes2: same format; must broadcast against boxes1
    :return: IoU value(s) as a NumPy array
    '''

    # areas of the boxes in boxes1 and boxes2
    boxes1Area = (boxes1[...,2]-boxes1[...,0])*(boxes1[...,3]-boxes1[...,1])
    boxes2Area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    # intersection corners: max of the top-left points, min of the bottom-right points
    left_up = np.maximum(boxes1[...,:2],boxes2[...,:2])
    right_down = np.minimum(boxes1[...,2:],boxes2[...,2:])

    # clamp at 0 so that disjoint boxes get an empty intersection
    inter_section = np.maximum(right_down-left_up,0.0)
    inter_area = inter_section[...,0] * inter_section[...,1]
    union_area = boxes1Area+boxes2Area-inter_area
    # floor at float32 eps: disjoint boxes yield about 1.19e-07 rather than exactly 0
    ious = np.maximum(1.0*inter_area/union_area,np.finfo(np.float32).eps)

    return ious

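GIoU

Introduction

GIoU was proposed to overcome the drawbacks of IoU while keeping its advantages. (Paper: Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression)

GIoU formula:
$$\mathrm{GIoU} = \mathrm{IoU} - \frac{|C \setminus (A \cup B)|}{|C|}$$

It can be understood as follows:

1. Let A be the predicted box, B the ground-truth box, and S the set of all boxes.
2. Whether or not A and B intersect, let C be the smallest box enclosing both A and B (the smallest convex enclosing shape); C also belongs to S.
3. First compute IoU, the intersection over union of A and B.
4. Then compute the area of C that is covered by neither A nor B, divided by the area of C.
5. Subtract that ratio from IoU to obtain GIoU.
6. Note: the examples here use rectangular boxes for A and B, but other shapes are possible. For example, if A and B are ellipses, C is the smallest ellipse enclosing A and B; A and B may also be 3D boxes.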
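
The steps above can be sketched for a single pair of axis-aligned rectangles as follows. This is only an illustration in plain Python: the function name `giou` is a hypothetical helper, and boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples with positive area.

```python
def giou(a, b):
    """GIoU of two (xmin, ymin, xmax, ymax) boxes, following steps 2 to 5."""
    # Step 3: IoU of A and B.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union

    # Step 2: area of the smallest axis-aligned box C enclosing A and B.
    c_w = max(a[2], b[2]) - min(a[0], b[0])
    c_h = max(a[3], b[3]) - min(a[1], b[1])
    c_area = c_w * c_h

    # Step 4: fraction of C covered by neither A nor B.
    penalty = (c_area - union) / c_area

    # Step 5: GIoU = IoU - penalty.
    return iou - penalty


# IoU = 1/7 and penalty = 50/225, so GIoU = -5/63.
print(round(giou((0, 0, 10, 10), (5, 5, 15, 15)), 5))  # -0.07937
```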

The process is shown in Figure 5:

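The paper's authors list several properties of GIoU:

Scale invariance.
GIoU can be regarded as a lower bound of IoU: it is always less than or equal to IoU.
For boxes1=[0,0,10,10], boxes2=[0,0,10,10]: IOU=1, GIOU=1; here A and B coincide.
For boxes1=[0,0,10,10], boxes2=[0,10,10,20]: IOU=0, GIOU=0.
For boxes1=[0,0,10,10], boxes2=[5,5,15,15]: IOU=0.1428, GIOU=-0.07936.
-1<=GIoU<=1. When A=B, GIoU=IoU=1; when A and B do not intersect and are far apart, GIoU(A,B) tends to -1.
For boxes1=[[10,10,15,15],[100,100,105,105]], boxes2=[5,5,10,10], the computed GIOU is [-0.5,-0.995]: A and B do not intersect, and as the distance between them grows, GIoU tends to -1, as shown in Figure 6.
The loss is therefore chosen as loss=1-GIoU.
GIoU reflects the manner of intersection better. In Figure 3 above, IoU is the same in both cases, but the boxes in (a) overlap in a more aligned way, so the GIoU of (a) is larger than that of (b).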

Figure 6: GIoU decreases as the distance increases
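
The trend in Figure 6 can be reproduced with a few lines of plain Python. This is a sketch under stated assumptions: `giou` is an illustrative helper, the first two offsets reuse the boxes from the example above, and the third offset is an extra made-up data point.

```python
def giou(a, b):
    """GIoU of two (xmin, ymin, xmax, ymax) boxes with positive area."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (c_area - union) / c_area


target = (5, 5, 10, 10)
values = []
for offset in (10, 100, 1000):
    # A 5x5 box whose top-left corner sits at (offset, offset); disjoint from target.
    box = (offset, offset, offset + 5, offset + 5)
    values.append(giou(box, target))
    print(offset, values[-1])
```

The printed values fall monotonically and approach -1 without reaching it, because the enclosing box grows while the union area stays fixed at 50.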

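The main benefits of GIoU:
(1) For intersecting boxes, IoU can be back-propagated, i.e. it can be used directly as the optimization objective. For non-intersecting boxes, however, the gradient is 0 and nothing can be optimized. GIoU avoids this problem, so it can serve as the objective function.
(2) It can distinguish how the boxes are aligned.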
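
To make point (1) concrete, here is a small numerical check using central finite differences instead of a real autograd framework. Everything in it is an assumption for illustration: the helper names, the step size `h`, and the two disjoint boxes.

```python
def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    iou = inter / union
    return iou, iou - (c_area - union) / c_area


def losses(dx):
    """IoU loss and GIoU loss after shifting the predicted box by dx along x."""
    pred = (20 + dx, 0, 30 + dx, 10)  # hypothetical predicted box, right of the target
    target = (0, 0, 10, 10)           # hypothetical ground-truth box
    iou, giou = iou_and_giou(pred, target)
    return 1 - iou, 1 - giou


h = 1e-3
iou_lo, giou_lo = losses(-h)
iou_hi, giou_hi = losses(h)
grad_iou = (iou_hi - iou_lo) / (2 * h)     # slope of the IoU loss w.r.t. the shift
grad_giou = (giou_hi - giou_lo) / (2 * h)  # slope of the GIoU loss w.r.t. the shift
print(grad_iou, grad_giou)
```

The IoU-loss slope is zero, while the GIoU-loss slope is positive (analytically 20/900 for these boxes), meaning that moving the predicted box to the left, toward the target, lowers the GIoU loss.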

Code

import numpy as np

def bboxes_giou(boxes1,boxes2):
    '''
    Compute the GIoU of two boxes or of two batches of boxes.
    Examples:
        (1)
            boxes1 = np.asarray([[0,0,5,5],[0,0,10,10],[15,15,25,25]])
            boxes2 = np.asarray([[5,5,10,10]])
            result is [-0.49999988  0.25       -0.68749988]
        (2)
            boxes1 = np.asarray([[0,0,5,5],[0,0,10,10],[0,0,10,10]])
            boxes2 = np.asarray([[0,0,5,5],[0,0,10,10],[0,0,10,10]])
            result is [1. 1. 1.]
    :param boxes1: [xmin,ymin,xmax,ymax] or
                [[xmin,ymin,xmax,ymax],[xmin,ymin,xmax,ymax],...]
    :param boxes2: same format; must broadcast against boxes1
    :return: GIoU value(s) as a NumPy array
    '''

    # areas of the boxes in boxes1 and boxes2
    boxes1Area = (boxes1[...,2]-boxes1[...,0])*(boxes1[...,3]-boxes1[...,1])
    boxes2Area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    # =========== IoU =============#
    # intersection corners: max of the top-left points, min of the bottom-right points
    left_up = np.maximum(boxes1[...,:2],boxes2[...,:2])
    right_down = np.minimum(boxes1[...,2:],boxes2[...,2:])

    inter_section = np.maximum(right_down-left_up,0.0)
    inter_area = inter_section[...,0] * inter_section[...,1]
    union_area = boxes1Area+boxes2Area-inter_area
    # floor at float32 eps; this is why example (1) gives -0.49999988 instead of -0.5
    ious = np.maximum(1.0*inter_area/union_area,np.finfo(np.float32).eps)

    # =========== area of the smallest enclosing box C =============#
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    enclose_area = enclose[..., 0] * enclose[..., 1]

    # GIoU = IoU - (area of C not covered by the union) / (area of C)
    gious = ious - 1.0 * (enclose_area - union_area) / enclose_area

    return gious

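DIoU

Introduction

IoU loss provides no gradient to move the box when the candidate box and the ground-truth box do not overlap (L_IoU = 1 - IoU is always 1), so GIoU loss introduces a penalty term, $|C \setminus (A \cup B)| / |C|$ (see Figure 5). Thanks to this penalty, the predicted box moves toward the target box even when the two do not overlap.
But consider the situation shown in Figure 7.

In that situation GIoU loss degrades completely to IoU loss. This happens, for instance, when one box lies entirely inside the other: C then coincides with the larger box, so the penalty term is zero. DIoU loss was introduced to address this. It adds a penalty term to IoU loss and is defined as:
$$L_{DIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2}$$
In this loss, $b$ and $b^{gt}$ denote the center points of the anchor box and the target box, $\rho$ is the Euclidean distance between the two centers, and $c$ is the diagonal length of the smallest rectangle that covers both the anchor box and the target box. DIoU therefore models the normalized distance between the anchor box and the target box. An illustration is given in the figure below.
Figure 8: distance computation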
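
As a worked instance of the definition above, the following plain-Python sketch evaluates the DIoU value (IoU minus the normalized center distance) and the corresponding loss for one pair of boxes. The name `diou` and the sample boxes are illustrative assumptions.

```python
def diou(a, b):
    """DIoU = IoU - rho^2 / c^2 for two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union

    # rho^2: squared Euclidean distance between the two box centers.
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    rho2 = (ax - bx) ** 2 + (ay - by) ** 2

    # c^2: squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - rho2 / c2


# IoU = 1/7, rho^2 = 50, c^2 = 450, so DIoU = 1/7 - 1/9 = 2/63.
value = diou((0, 0, 10, 10), (5, 5, 15, 15))
print(value, 1 - value)  # DIoU and the DIoU loss
```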

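Advantages of DIoU:
1. Like GIoU loss, DIoU loss still provides a moving direction for the bounding box when it does not overlap with the target box.
2. DIoU loss directly minimizes the distance between the two boxes, whereas GIoU loss reduces the area between them, so DIoU loss converges much faster than GIoU loss.
3. When one box encloses the other, or the two boxes are aligned horizontally or vertically, DIoU loss makes regression very fast, while GIoU loss almost degrades to IoU loss.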
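
The enclosing case in point 3 can be illustrated with made-up boxes: a small target placed at the center of the predicted box versus the same target placed in a corner. The helper `metrics` below is a hypothetical plain-Python sketch that returns all three values at once.

```python
def metrics(a, b):
    """Return (IoU, GIoU, DIoU) for two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union

    # Smallest enclosing box: width, height, area and squared diagonal.
    c_w = max(a[2], b[2]) - min(a[0], b[0])
    c_h = max(a[3], b[3]) - min(a[1], b[1])
    giou = iou - (c_w * c_h - union) / (c_w * c_h)

    # Squared distance between the box centers.
    rho2 = ((a[0] + a[2] - b[0] - b[2]) / 2) ** 2 + ((a[1] + a[3] - b[1] - b[3]) / 2) ** 2
    diou = iou - rho2 / (c_w ** 2 + c_h ** 2)
    return iou, giou, diou


pred = (0, 0, 10, 10)
centered = metrics(pred, (4, 4, 6, 6))  # target in the middle of pred
cornered = metrics(pred, (0, 0, 2, 2))  # same-size target in a corner of pred
print(centered)
print(cornered)
```

IoU and GIoU are identical for the two placements, since the enclosing box equals the predicted box in both, while DIoU is lower for the corner placement, so only DIoU still favors moving the centers together.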

Code

import numpy as np

def bboxes_diou(boxes1,boxes2):
    '''
    Compute the DIoU of two boxes or of two batches of boxes.
    :param boxes1: [xmin,ymin,xmax,ymax] or
                [[xmin,ymin,xmax,ymax],[xmin,ymin,xmax,ymax],...]
    :param boxes2: same format; must broadcast against boxes1
    :return: DIoU value(s) as a NumPy array
    '''

    # areas of the boxes in boxes1 and boxes2
    boxes1Area = (boxes1[...,2]-boxes1[...,0])*(boxes1[...,3]-boxes1[...,1])
    boxes2Area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    # intersection corners: max of the top-left points, min of the bottom-right points
    left_up = np.maximum(boxes1[...,:2],boxes2[...,:2])
    right_down = np.minimum(boxes1[...,2:],boxes2[...,2:])

    inter_section = np.maximum(right_down-left_up,0.0)
    inter_area = inter_section[...,0] * inter_section[...,1]
    union_area = boxes1Area+boxes2Area-inter_area
    # floor at float32 eps, as in bboxes_iou
    ious = np.maximum(1.0*inter_area/union_area,np.finfo(np.float32).eps)

    # smallest enclosing box; outer_diagonal_line is its SQUARED diagonal (c^2)
    outer_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    outer_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    outer = np.maximum(outer_right_down - outer_left_up, 0.0)
    outer_diagonal_line = np.square(outer[...,0]) + np.square(outer[...,1])

    # center_dis is the SQUARED distance between the box centers (rho^2)
    boxes1_center = (boxes1[..., :2] +  boxes1[...,2:]) * 0.5
    boxes2_center = (boxes2[..., :2] +  boxes2[...,2:]) * 0.5
    center_dis = np.square(boxes1_center[...,0]-boxes2_center[...,0]) +\
                 np.square(boxes1_center[...,1]-boxes2_center[...,1])

    # DIoU = IoU - rho^2 / c^2
    dious = ious - center_dis / outer_diagonal_line

    return dious

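CIoU

Introduction

A good bounding-box regression loss should consider three important geometric factors: overlap area, center-point distance, and aspect ratio.
GIoU: builds on IoU to normalize for coordinate scale, and takes a first step toward handling the case where IoU is zero.
DIoU: DIoU loss considers both the overlap area and the center-point distance of the bounding boxes.
However, consistency of aspect ratio between the anchor box and the target box is also very important. On this basis, the paper's authors proposed Complete-IoU (CIoU) loss.
CIoU loss adds a further penalty term for the aspect ratio of the boxes, and is defined as:
$$L_{CIoU} = 1 - \mathrm{IoU} + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$
Compared with DIoU, CIoU has two additional parameters, $\alpha$ and $v$: $\alpha$ is a trade-off weight, and $v$ measures the consistency of aspect ratio between the anchor box and the target box.
$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2, \qquad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}$$
From the definition of $\alpha$ it can be seen that the loss leans toward increasing the overlap area, especially when IoU is zero.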
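
To see the effect of the aspect-ratio term, the sketch below computes DIoU and CIoU side by side for one pair of boxes in plain Python. The function name `diou_and_ciou`, the sample boxes, and the choice to set α to 0 when v is 0 (which avoids a 0/0 for identical boxes) are assumptions of this sketch rather than something the paper or this post prescribes.

```python
import math


def diou_and_ciou(a, b):
    """Return (DIoU, CIoU) for two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    w_a, h_a = a[2] - a[0], a[3] - a[1]
    w_b, h_b = b[2] - b[0], b[3] - b[1]
    iou = inter / (w_a * h_a + w_b * h_b - inter)

    # Normalized center distance rho^2 / c^2, as in DIoU.
    rho2 = ((a[0] + a[2] - b[0] - b[2]) / 2) ** 2 + ((a[1] + a[3] - b[1] - b[3]) / 2) ** 2
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    diou = iou - rho2 / c2

    # Aspect-ratio consistency v and its weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(w_b / h_b) - math.atan(w_a / h_a)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0  # assumption: alpha = 0 when v = 0
    return diou, diou - alpha * v


target = (0, 0, 10, 10)
same_ratio = diou_and_ciou((0, 0, 20, 20), target)  # square vs square
wide = diou_and_ciou((0, 0, 20, 10), target)        # 2:1 rectangle vs square
print(same_ratio)
print(wide)
```

When the aspect ratios match, v is 0 and CIoU equals DIoU; for the wide box the extra penalty makes CIoU strictly smaller than DIoU.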

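Analysis of DIoU and CIoU results

Figure 9: comparison of DIoU, CIoU, and GIoU results

Code

I implemented the penalty term following the formula in the paper. It may differ from the computation in the code of this (linked) article; the difference remains to be analyzed.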

import numpy as np

def bboxes_ciou(boxes1,boxes2):
    '''
    Compute the CIoU of two boxes or of two batches of boxes.
    :param boxes1: [xmin,ymin,xmax,ymax] or
                [[xmin,ymin,xmax,ymax],[xmin,ymin,xmax,ymax],...]
    :param boxes2: same format; must broadcast against boxes1
    :return: CIoU value(s) as a NumPy array
    '''

    # areas of the boxes in boxes1 and boxes2
    boxes1Area = (boxes1[...,2]-boxes1[...,0])*(boxes1[...,3]-boxes1[...,1])
    boxes2Area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    # intersection corners: max of the top-left points, min of the bottom-right points
    left_up = np.maximum(boxes1[...,:2],boxes2[...,:2])
    right_down = np.minimum(boxes1[...,2:],boxes2[...,2:])

    inter_section = np.maximum(right_down-left_up,0.0)
    inter_area = inter_section[...,0] * inter_section[...,1]
    union_area = boxes1Area+boxes2Area-inter_area
    # floor at float32 eps, as in bboxes_iou
    eps = np.finfo(np.float32).eps
    ious = np.maximum(1.0*inter_area/union_area,eps)

    # smallest enclosing box; outer_diagonal_line is its SQUARED diagonal (c^2)
    outer_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    outer_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    outer = np.maximum(outer_right_down - outer_left_up, 0.0)
    outer_diagonal_line = np.square(outer[...,0]) + np.square(outer[...,1])

    # center_dis is the SQUARED distance between the box centers (rho^2)
    boxes1_center = (boxes1[..., :2] +  boxes1[...,2:]) * 0.5
    boxes2_center = (boxes2[..., :2] +  boxes2[...,2:]) * 0.5
    center_dis = np.square(boxes1_center[...,0]-boxes2_center[...,0]) +\
                 np.square(boxes1_center[...,1]-boxes2_center[...,1])

    # aspect-ratio penalty term
    # widths and heights of the boxes (heights are assumed to be non-zero)
    boxes1_size = np.maximum(boxes1[...,2:]-boxes1[...,:2],0.0)
    boxes2_size = np.maximum(boxes2[..., 2:] - boxes2[..., :2], 0.0)
    v = (4.0/np.square(np.pi)) * np.square((
            np.arctan((boxes1_size[...,0]/boxes1_size[...,1])) -
            np.arctan((boxes2_size[..., 0] / boxes2_size[..., 1])) ))
    # eps in the denominator avoids 0/0 (nan) when the boxes are identical (IoU=1, v=0)
    alpha = v / (1-ious+v+eps)

    # CIoU = IoU - (rho^2 / c^2 + alpha * v)
    cious = ious - (center_dis / outer_diagonal_line + alpha*v)

    return cious

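Summary

DIoU fits the mechanics of bounding-box regression better than GIoU. It takes into account the distance between the target and the anchor, the overlap ratio, and the scale, which makes bounding-box regression more stable and avoids problems such as divergence during training that can occur with IoU and GIoU.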

References

1. Paper notes: GIoU
2. Object detection, CVPR 2019: understanding GIoU
3. [Paper reading] Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression
4. Paper link
5. DIoU YOLOv3 | AAAI 2020: a more stable and effective bounding-box regression loss
