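I. IOU Loss
The previous article noted that L1, L2 and their variants compute a loss for each of the four bounding-box coordinates separately and then sum the results. This ignores the correlation between the four coordinates, and training becomes biased toward larger objects. To address this, Megvii's 2016 paper "UnitBox: An Advanced Object Detection Network" proposed the IOU Loss, which regresses the box formed by the four coordinates as a single unit.
Paper link: https://arxiv.org/pdf/1608.01471.pdf
1. Properties
The IOU Loss is defined by first computing the ratio of the intersection to the union of the predicted box and the ground-truth box, then taking the negative logarithm. In practice it is often written as 1 - IOU instead. If the two boxes coincide exactly, the IOU equals 1 and the loss is 0; the smaller the loss, the higher the overlap.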
![IOU = \frac{(A\cap B)}{(A\cup B)}](https://i-blog.csdnimg.cn/blog_migrate/832f4cd043ff0453dae6c7c893830a2e.png)
![IOU Loss = 1 - IOU](https://i-blog.csdnimg.cn/blog_migrate/8da1704337f34597234ec209f3b713a4.png)
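Used as a distance, 1 - IOU satisfies non-negativity, identity of indiscernibles, symmetry and the triangle inequality. Unlike L1/L2 losses, IOU is also scale invariant: whatever the size of the boxes, the 1 - IOU loss always lies between 0 and 1, so it reflects fairly well how closely the predicted box matches the ground-truth box.
The pseudocode is as follows: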
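As a quick illustration of the two loss forms and of scale invariance, here is a minimal sketch in plain Python. The helper names `iou` and `scale` and the example boxes are made up for this illustration; boxes are assumed to be given as (x0, y0, x1, y1).

```python
import math

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    union = ((box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
             + (box_b[2] - box_b[0]) * (box_b[3] - box_b[1]) - inter)
    return inter / union

def scale(box, factor):
    """Multiply every coordinate of a box by the same factor."""
    return tuple(factor * v for v in box)

pred, gt = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
value = iou(pred, gt)
loss_linear = 1.0 - value        # the form commonly used in practice
loss_log = -math.log(value)      # the negative-log form
# scaling both boxes by the same factor leaves the IoU unchanged
value_scaled = iou(scale(pred, 100.0), scale(gt, 100.0))
print(value, value_scaled, loss_linear, loss_log)
```

Both forms are 0 for a perfect match and grow as the overlap shrinks; only the 1 - IOU form is bounded by 1.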
![v2-9b747dbf794d151312aa0db1d4a5908d_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/96d60ed2892e2831f51d81608232b7a6.jpeg)
where $X$ is the area of the predicted box, $\tilde{X}$ is the area of the ground-truth box, $I$ is the area of their intersection, $U$ is the area of their union, and $L = -\ln(I/U)$ is the resulting loss.
The box position is corrected iteratively by backpropagating the loss. The full derivation of the backward pass of the IOU Loss is given in the paper; the result is quoted here:
![\frac{\partial L}{\partial x_{t}} = \frac{1}{U}\frac{\partial X}{\partial x_{t}} - \frac{U+I}{UI}\frac{\partial I}{\partial x_{t}}](https://i-blog.csdnimg.cn/blog_migrate/c5dbf5f4fed8f2a1ea02b23f57aa3fb7.png)
where:
![\frac{\partial X}{x_{t}} = x_{l} + x_{r}](https://i-blog.csdnimg.cn/blog_migrate/d6b49eba25ecb41873796bda94e99152.png)
![\frac{\partial I}{x_{t}} = I_{w}, x_{t} <\bar{x}_{t} ( x_{b} <\bar{x}_{b} )](https://i-blog.csdnimg.cn/blog_migrate/2179317c4de26155ed7b0a8f1d2d0f0f.png)
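This formula shows that the penalty comes from two parts, the four variables of the predicted box and the intersection of the predicted and ground-truth boxes:
1. The gradient of the loss is positively proportional to $\frac{\partial X}{\partial x_t}$, so a larger predicted area incurs a larger loss.
2. The gradient of the loss is negatively proportional to $\frac{\partial I}{\partial x_t}$, so a larger intersection lowers the loss.
According to the derivative, reducing the IOU Loss therefore means making the intersection as large as possible while keeping the predicted box as small as possible.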
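The closed-form gradient can be sanity-checked numerically. The sketch below assumes the UnitBox parameterization, where a box is described by the distances (top, bottom, left, right) from one pixel to its four sides, and compares the quoted expression for $\partial L / \partial x_t$ with a central finite difference. The function names and the example numbers are illustrative only.

```python
import math

def unitbox_loss_terms(pred, gt):
    """Return (loss, I, U, I_w) for boxes given as (top, bottom, left, right) distances."""
    x_t, x_b, x_l, x_r = pred
    g_t, g_b, g_l, g_r = gt
    area_pred = (x_t + x_b) * (x_l + x_r)          # X
    area_gt = (g_t + g_b) * (g_l + g_r)            # X tilde
    inter_h = min(x_t, g_t) + min(x_b, g_b)
    inter_w = min(x_l, g_l) + min(x_r, g_r)
    inter = inter_h * inter_w                      # I
    union = area_pred + area_gt - inter            # U
    return -math.log(inter / union), inter, union, inter_w

def analytic_grad_top(pred, gt):
    """dL/dx_t from the closed-form expression quoted above."""
    _, inter, union, inter_w = unitbox_loss_terms(pred, gt)
    d_area = pred[2] + pred[3]                     # dX/dx_t = x_l + x_r
    d_inter = inter_w if pred[0] < gt[0] else 0.0  # dI/dx_t
    return d_area / union - (union + inter) / (union * inter) * d_inter

def numeric_grad_top(pred, gt, eps=1e-6):
    """Central finite difference of the loss with respect to x_t."""
    up = (pred[0] + eps,) + tuple(pred[1:])
    down = (pred[0] - eps,) + tuple(pred[1:])
    return (unitbox_loss_terms(up, gt)[0] - unitbox_loss_terms(down, gt)[0]) / (2 * eps)

gt = (3.0, 2.0, 2.0, 2.0)
inside = (2.0, 3.0, 1.5, 2.5)    # x_t below the target: growing it enlarges I
outside = (4.0, 3.0, 1.5, 2.5)   # x_t beyond the target: growing it only enlarges X
for pred in (inside, outside):
    print(analytic_grad_top(pred, gt), numeric_grad_top(pred, gt))
```

In the first case the gradient is negative (increasing x_t reduces the loss because the intersection grows); in the second it is positive (increasing x_t only inflates the predicted area).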
Python implementation:

    def calculate_iou(box_1, box_2):
        """
        calculate iou
        :param box_1: (x0, y0, x1, y1)
        :param box_2: (x0, y0, x1, y1)
        :return: value of iou
        """
        # calculate area of each box
        area_1 = (box_1[2] - box_1[0]) * (box_1[3] - box_1[1])
        area_2 = (box_2[2] - box_2[0]) * (box_2[3] - box_2[1])
        # find the edges of the intersection box
        left = max(box_1[0], box_2[0])
        top = max(box_1[1], box_2[1])
        right = min(box_1[2], box_2[2])
        bottom = min(box_1[3], box_2[3])
        # if there is no intersection area, the iou is 0
        if left >= right or top >= bottom:
            return 0.0
        # calculate the intersection area
        area_intersection = (right - left) * (bottom - top)
        # calculate the union area
        area_union = area_1 + area_2 - area_intersection
        iou = float(area_intersection) / area_union
        return iou

TensorFlow implementation:

    import tensorflow as tf

    def bbox_iou(self, boxes_1, boxes_2):
        """
        calculate iou (written as a method; self is not used)
        :param boxes_1: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :param boxes_2: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :return: iou, tensor of shape [...]
        """
        # transform [x, y, w, h] to [x_min, y_min, x_max, y_max]
        boxes_1 = tf.concat([boxes_1[..., :2] - boxes_1[..., 2:] * 0.5,
                             boxes_1[..., :2] + boxes_1[..., 2:] * 0.5], axis=-1)
        boxes_2 = tf.concat([boxes_2[..., :2] - boxes_2[..., 2:] * 0.5,
                             boxes_2[..., :2] + boxes_2[..., 2:] * 0.5], axis=-1)
        # keep (x_min, y_min) <= (x_max, y_max) even if w or h is negative
        boxes_1 = tf.concat([tf.minimum(boxes_1[..., :2], boxes_1[..., 2:]),
                             tf.maximum(boxes_1[..., :2], boxes_1[..., 2:])], axis=-1)
        boxes_2 = tf.concat([tf.minimum(boxes_2[..., :2], boxes_2[..., 2:]),
                             tf.maximum(boxes_2[..., :2], boxes_2[..., 2:])], axis=-1)
        # calculate area of boxes_1 boxes_2
        boxes_1_area = (boxes_1[..., 2] - boxes_1[..., 0]) * (boxes_1[..., 3] - boxes_1[..., 1])
        boxes_2_area = (boxes_2[..., 2] - boxes_2[..., 0]) * (boxes_2[..., 3] - boxes_2[..., 1])
        # calculate the two corners of the intersection
        left_up = tf.maximum(boxes_1[..., :2], boxes_2[..., :2])
        right_down = tf.minimum(boxes_1[..., 2:], boxes_2[..., 2:])
        # calculate area of intersection (clamped to 0 when the boxes do not overlap)
        inter_section = tf.maximum(right_down - left_up, 0.0)
        inter_area = inter_section[..., 0] * inter_section[..., 1]
        # calculate union area
        union_area = boxes_1_area + boxes_2_area - inter_area
        # calculate iou, add epsilon in denominator to avoid dividing by 0
        iou = inter_area / (union_area + tf.keras.backend.epsilon())
        return iou

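2. Open problems
Although the IOU Loss fixes two major problems of the Smooth L1 family, namely treating the variables as independent and lacking scale invariance, it has two problems of its own:
- When the predicted and ground-truth boxes do not intersect, it cannot reflect how far apart they are. By definition the IOU is 0, so the 1 - IOU loss stays constant at 1 (and -ln(IOU) is undefined); no gradient is propagated back and training cannot make progress.
- IOU cannot reflect how the two boxes overlap. In the figure below, the three cases have the same IOU, yet the value says nothing about how the boxes intersect; intuitively, the third alignment is the worst.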
![v2-f0ba257a866d35ed998fd7df6c8565b5_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/2d6e088eb3d6caf9d5fd5d52cfae18d3.jpeg)
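II. GIOU Loss
The section above identified two major shortcomings of the IOU Loss: it cannot optimize boxes that do not intersect, and it cannot reflect how two boxes intersect. To address them, researchers at Stanford proposed the GIOU Loss in the 2019 paper "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression", which extends IOU with the smallest enclosing rectangle of the predicted and ground-truth boxes.
Paper link: https://arxiv.org/pdf/1902.09630.pdf
1. Properties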
GIOU is an upgraded version of IOU that keeps IOU's main properties while avoiding its shortcomings. First compute the areas $A^{p}$ and $A^{g}$ of the predicted box $B^{p}$ and the ground-truth box $B^{g}$, together with their intersection $I$ and union $U = A^{p} + A^{g} - I$. Then find the smallest box $B^{c}$ that encloses both $B^{p}$ and $B^{g}$ and compute its area $A^{c}$. GIOU subtracts from IOU the fraction of $A^{c}$ that is covered by neither $B^{p}$ nor $B^{g}$:
$$GIOU = IOU - \frac{A^{c} - U}{A^{c}}$$
The pseudocode is as follows:
![v2-9a5f8f5f6e7d1da67278b1cf74658000_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/b586753df14d119b81d5585d98d5bc1b.jpeg)
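The definition of GIOU shows that:
- Like IOU, GIOU can be used as a distance-based loss, $L_{GIOU} = 1 - GIOU$, and it is insensitive to scale;
- $GIOU \le IOU$ and $0 \le IOU \le 1$, so $-1 \le GIOU \le 1$;
- GIOU looks not only at the overlapping region but also at the non-overlapping regions, so it reflects the degree of overlap better;
- When the predicted and ground-truth boxes coincide exactly, $GIOU = IOU = 1$;
- When they do not coincide, the worse the overlap, the closer GIOU gets to -1;
- In particular, when the boxes do not intersect, the smallest enclosing rectangle still provides a training signal: maximizing GIOU pushes $A^{c}$ toward its minimum, which moves the two boxes $B^{p}$ and $B^{g}$ steadily closer together.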
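The range properties listed above can be checked on a few hand-picked boxes. This is a minimal sketch; the helper name `iou_and_giou` and the boxes are illustrative, and boxes are assumed to be (x0, y0, x1, y1).

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x0, y0, x1, y1)."""
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # area of the smallest axis-aligned box enclosing both inputs
    enclose = ((max(box_a[2], box_b[2]) - min(box_a[0], box_b[0]))
               * (max(box_a[3], box_b[3]) - min(box_a[1], box_b[1])))
    iou = inter / union
    return iou, iou - (enclose - union) / enclose

gt = (0.0, 0.0, 1.0, 1.0)
same = iou_and_giou(gt, gt)                      # identical boxes
near = iou_and_giou((2.0, 0.0, 3.0, 1.0), gt)    # disjoint, gap of 1
far = iou_and_giou((9.0, 0.0, 10.0, 1.0), gt)    # disjoint, gap of 8
print(same, near, far)
```

For both disjoint boxes the IOU is 0, while GIOU keeps decreasing toward -1 as the gap grows, which is what gives the loss a usable gradient.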
Python implementation:

    def calculate_giou(box_1, box_2):
        """
        calculate giou
        :param box_1: (x0, y0, x1, y1)
        :param box_2: (x0, y0, x1, y1)
        :return: value of giou
        """
        # calculate area of each box
        area_1 = (box_1[2] - box_1[0]) * (box_1[3] - box_1[1])
        area_2 = (box_2[2] - box_2[0]) * (box_2[3] - box_2[1])
        # calculate area of the minimum enclosing box
        area_c = (max(box_1[2], box_2[2]) - min(box_1[0], box_2[0])) * (max(box_1[3], box_2[3]) - min(box_1[1], box_2[1]))
        # find the edges of the intersection box
        left = max(box_1[0], box_2[0])
        top = max(box_1[1], box_2[1])
        right = min(box_1[2], box_2[2])
        bottom = min(box_1[3], box_2[3])
        # calculate the intersection area (0 if the boxes do not overlap)
        area_intersection = max(0.0, right - left) * max(0.0, bottom - top)
        # calculate the union area
        area_union = area_1 + area_2 - area_intersection
        # calculate iou
        iou = float(area_intersection) / area_union
        # calculate giou (iou - (area_c - area_union) / area_c)
        giou = iou - float(area_c - area_union) / area_c
        return giou

TensorFlow implementation:

    import tensorflow as tf

    def bbox_giou(self, boxes_1, boxes_2):
        """
        calculate giou (written as a method; self is not used)
        :param boxes_1: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :param boxes_2: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :return: giou, tensor of shape [...]
        """
        # transform [x, y, w, h] to [x_min, y_min, x_max, y_max]
        boxes_1 = tf.concat([boxes_1[..., :2] - boxes_1[..., 2:] * 0.5,
                             boxes_1[..., :2] + boxes_1[..., 2:] * 0.5], axis=-1)
        boxes_2 = tf.concat([boxes_2[..., :2] - boxes_2[..., 2:] * 0.5,
                             boxes_2[..., :2] + boxes_2[..., 2:] * 0.5], axis=-1)
        # keep (x_min, y_min) <= (x_max, y_max) even if w or h is negative
        boxes_1 = tf.concat([tf.minimum(boxes_1[..., :2], boxes_1[..., 2:]),
                             tf.maximum(boxes_1[..., :2], boxes_1[..., 2:])], axis=-1)
        boxes_2 = tf.concat([tf.minimum(boxes_2[..., :2], boxes_2[..., 2:]),
                             tf.maximum(boxes_2[..., :2], boxes_2[..., 2:])], axis=-1)
        # calculate area of boxes_1 boxes_2
        boxes_1_area = (boxes_1[..., 2] - boxes_1[..., 0]) * (boxes_1[..., 3] - boxes_1[..., 1])
        boxes_2_area = (boxes_2[..., 2] - boxes_2[..., 0]) * (boxes_2[..., 3] - boxes_2[..., 1])
        # calculate the two corners of the intersection
        left_up = tf.maximum(boxes_1[..., :2], boxes_2[..., :2])
        right_down = tf.minimum(boxes_1[..., 2:], boxes_2[..., 2:])
        # calculate area of intersection (clamped to 0 when the boxes do not overlap)
        inter_section = tf.maximum(right_down - left_up, 0.0)
        inter_area = inter_section[..., 0] * inter_section[..., 1]
        # calculate union area
        union_area = boxes_1_area + boxes_2_area - inter_area
        # calculate iou, add epsilon in denominator to avoid dividing by 0
        iou = inter_area / (union_area + tf.keras.backend.epsilon())
        # calculate the upper left and lower right corners of the minimum enclosing box
        enclose_left_up = tf.minimum(boxes_1[..., :2], boxes_2[..., :2])
        enclose_right_down = tf.maximum(boxes_1[..., 2:], boxes_2[..., 2:])
        # calculate width and height of the minimum enclosing box
        enclose_wh = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
        # calculate area of the minimum enclosing box
        enclose_area = enclose_wh[..., 0] * enclose_wh[..., 1]
        # calculate giou, add epsilon in denominator to avoid dividing by 0
        giou = iou - 1.0 * (enclose_area - union_area) / (enclose_area + tf.keras.backend.epsilon())
        return giou

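2. Open problems
When the predicted and ground-truth boxes are poorly aligned, the area of the smallest enclosing box C grows and GIOU decreases; GIOU can also be computed when the two rectangles do not overlap. The GIOU Loss thus solves the two problems of IOU described above. However, when one box contains the other, as in the figure below, GIOU degenerates to IOU and cannot distinguish their relative positions.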
![v2-3ac89cfc726f6179bb286eb82ce1305e_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/527c4e7bc15f554b722f54a19ea6a158.jpeg)
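Because GIOU still relies heavily on IOU, its error is large in the horizontal and vertical directions and it is very hard to converge there, which is why GIOU is unstable. In the figure below, the region in the red frame (C is the smallest enclosing rectangle of the two boxes) is the area of C covered by neither box. For the same distance between the predicted and ground-truth boxes, this area is smallest when the boxes are aligned horizontally or vertically, so it contributes least to the loss, which makes regression poor in those directions.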
![v2-65fd81e0d1f59819b1a773d94f0d5d63_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/b823ee25ff65e89ee9bfc3412499d0b3.jpeg)
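The containment case is easy to reproduce numerically. In the sketch below (helper name and boxes are illustrative, boxes are (x0, y0, x1, y1)), a small box is placed once in the center and once in a corner of a larger ground-truth box. The enclosing box equals the ground-truth box in both cases, so the GIOU penalty vanishes and both placements score the same.

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x0, y0, x1, y1)."""
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # area of the smallest axis-aligned box enclosing both inputs
    enclose = ((max(box_a[2], box_b[2]) - min(box_a[0], box_b[0]))
               * (max(box_a[3], box_b[3]) - min(box_a[1], box_b[1])))
    iou = inter / union
    return iou, iou - (enclose - union) / enclose

gt = (0.0, 0.0, 4.0, 4.0)
centered = iou_and_giou((1.0, 1.0, 3.0, 3.0), gt)   # small box in the middle
cornered = iou_and_giou((0.0, 0.0, 2.0, 2.0), gt)   # same box pushed into a corner
print(centered, cornered)
```

Since GIOU equals IOU for both placements, the loss gives no preference for the centered box over the cornered one.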
III. DIOU Loss
GIOU has the two problems described above: when one box contains the other, or when the boxes are aligned horizontally or vertically, the GIOU loss almost degenerates to the IOU loss, i.e. $|C - A \cup B| \rightarrow 0$. To address this, the paper "Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression" replaces GIOU's enclosing-area penalty with a penalty on the normalized distance between the center points of the two boxes, which speeds up convergence.
Paper link: https://arxiv.org/pdf/1911.08287.pdf
1. Properties
The DIOU loss function is:
$$L_{DIOU} = 1 - IOU + \frac{\rho^{2}(b, b^{gt})}{c^{2}}$$
where $b$ and $b^{gt}$ are the center points of the predicted box and the ground-truth box, $\rho$ is the Euclidean distance between them, and $c$ is the diagonal length of the smallest enclosing box that covers both boxes, as shown below:
![v2-bdd13158b9076c16915f60a4a3d675c5_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/401f48978b72bca156b37311f08b2084.jpeg)
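The penalty term of the DIOU Loss directly minimizes the distance between the center points, whereas the GIOU Loss aims to shrink the area of the enclosing box. The DIOU Loss therefore has the following properties:
- Like IOU and GIOU, DIOU is scale invariant;
- Like GIOU, DIOU still provides a moving direction for the predicted box when it does not overlap the target box;
- DIOU directly minimizes the distance between the two boxes, so it converges much faster than the GIOU Loss;
- When one box contains the other, or the boxes are aligned horizontally or vertically, DIOU still regresses quickly, whereas GIOU almost degenerates to IOU;
- When the predicted and ground-truth boxes coincide exactly, $L_{IOU} = L_{GIOU} = L_{DIOU} = 0$;
- When the predicted and ground-truth boxes do not intersect and are far apart, $L_{GIOU} = L_{DIOU} \rightarrow 2$.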
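The properties above can be illustrated with a small sketch of the loss $1 - IOU + \rho^{2}/c^{2}$. The function name `diou_loss` and the test boxes are made up for this example; boxes are assumed to be (x0, y0, x1, y1).

```python
def diou_loss(box_p, box_g):
    """1 - IoU + rho^2 / c^2 for two boxes given as (x0, y0, x1, y1)."""
    inter_w = max(0.0, min(box_p[2], box_g[2]) - max(box_p[0], box_g[0]))
    inter_h = max(0.0, min(box_p[3], box_g[3]) - max(box_p[1], box_g[1]))
    inter = inter_w * inter_h
    union = ((box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
             + (box_g[2] - box_g[0]) * (box_g[3] - box_g[1]) - inter)
    # squared distance between the two box centers
    rho2 = (((box_p[0] + box_p[2]) - (box_g[0] + box_g[2])) / 2.0) ** 2 \
        + (((box_p[1] + box_p[3]) - (box_g[1] + box_g[3])) / 2.0) ** 2
    # squared diagonal of the smallest enclosing box
    c2 = (max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])) ** 2 \
        + (max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])) ** 2
    return 1.0 - inter / union + rho2 / c2

gt = (0.0, 0.0, 4.0, 4.0)
loss_same = diou_loss(gt, gt)                           # perfect match
loss_centered = diou_loss((1.0, 1.0, 3.0, 3.0), gt)     # contained, centers coincide
loss_cornered = diou_loss((0.0, 0.0, 2.0, 2.0), gt)     # contained, centers differ
loss_far = diou_loss((1000.0, 0.0, 1004.0, 4.0), gt)    # disjoint and far away
print(loss_same, loss_centered, loss_cornered, loss_far)
```

Unlike GIOU, the two contained placements now receive different losses, and the far-away box approaches the upper bound of 2.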
Python implementation:

    def calculate_diou(box_1, box_2):
        """
        calculate diou
        :param box_1: (x0, y0, x1, y1)
        :param box_2: (x0, y0, x1, y1)
        :return: value of diou
        """
        # calculate area of each box
        area_1 = (box_1[2] - box_1[0]) * (box_1[3] - box_1[1])
        area_2 = (box_2[2] - box_2[0]) * (box_2[3] - box_2[1])
        # calculate center point of each box
        center_x1 = (box_1[0] + box_1[2]) / 2
        center_y1 = (box_1[1] + box_1[3]) / 2
        center_x2 = (box_2[0] + box_2[2]) / 2
        center_y2 = (box_2[1] + box_2[3]) / 2
        # calculate square of center point distance
        p2 = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
        # calculate square of the diagonal length of the minimum enclosing box
        width_c = max(box_1[2], box_2[2]) - min(box_1[0], box_2[0])
        height_c = max(box_1[3], box_2[3]) - min(box_1[1], box_2[1])
        c2 = width_c ** 2 + height_c ** 2
        # find the edges of the intersection box
        left = max(box_1[0], box_2[0])
        top = max(box_1[1], box_2[1])
        right = min(box_1[2], box_2[2])
        bottom = min(box_1[3], box_2[3])
        # calculate the intersection area (0 if the boxes do not overlap)
        area_intersection = max(0.0, right - left) * max(0.0, bottom - top)
        # calculate the union area
        area_union = area_1 + area_2 - area_intersection
        # calculate iou
        iou = float(area_intersection) / area_union
        # calculate diou (iou - p2 / c2)
        diou = iou - float(p2) / c2
        return diou

TensorFlow implementation:

    import tensorflow as tf

    def bbox_diou(self, boxes_1, boxes_2):
        """
        calculate diou (written as a method; self is not used)
        :param boxes_1: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :param boxes_2: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :return: diou, tensor of shape [...]
        """
        # calculate squared center distance
        center_distance = tf.reduce_sum(tf.square(boxes_1[..., :2] - boxes_2[..., :2]), axis=-1)
        # transform [x, y, w, h] to [x_min, y_min, x_max, y_max]
        boxes_1 = tf.concat([boxes_1[..., :2] - boxes_1[..., 2:] * 0.5,
                             boxes_1[..., :2] + boxes_1[..., 2:] * 0.5], axis=-1)
        boxes_2 = tf.concat([boxes_2[..., :2] - boxes_2[..., 2:] * 0.5,
                             boxes_2[..., :2] + boxes_2[..., 2:] * 0.5], axis=-1)
        # keep (x_min, y_min) <= (x_max, y_max) even if w or h is negative
        boxes_1 = tf.concat([tf.minimum(boxes_1[..., :2], boxes_1[..., 2:]),
                             tf.maximum(boxes_1[..., :2], boxes_1[..., 2:])], axis=-1)
        boxes_2 = tf.concat([tf.minimum(boxes_2[..., :2], boxes_2[..., 2:]),
                             tf.maximum(boxes_2[..., :2], boxes_2[..., 2:])], axis=-1)
        # calculate area of boxes_1 boxes_2
        boxes_1_area = (boxes_1[..., 2] - boxes_1[..., 0]) * (boxes_1[..., 3] - boxes_1[..., 1])
        boxes_2_area = (boxes_2[..., 2] - boxes_2[..., 0]) * (boxes_2[..., 3] - boxes_2[..., 1])
        # calculate the two corners of the intersection
        left_up = tf.maximum(boxes_1[..., :2], boxes_2[..., :2])
        right_down = tf.minimum(boxes_1[..., 2:], boxes_2[..., 2:])
        # calculate area of intersection (clamped to 0 when the boxes do not overlap)
        inter_section = tf.maximum(right_down - left_up, 0.0)
        inter_area = inter_section[..., 0] * inter_section[..., 1]
        # calculate union area
        union_area = boxes_1_area + boxes_2_area - inter_area
        # calculate iou, add epsilon in denominator to avoid dividing by 0
        iou = inter_area / (union_area + tf.keras.backend.epsilon())
        # calculate the upper left and lower right corners of the minimum enclosing box
        enclose_left_up = tf.minimum(boxes_1[..., :2], boxes_2[..., :2])
        enclose_right_down = tf.maximum(boxes_1[..., 2:], boxes_2[..., 2:])
        # calculate width and height of the minimum enclosing box
        enclose_wh = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
        # calculate squared diagonal of the minimum enclosing box
        enclose_diagonal = tf.reduce_sum(tf.square(enclose_wh), axis=-1)
        # calculate diou, add epsilon in denominator to avoid dividing by 0
        diou = iou - 1.0 * center_distance / (enclose_diagonal + tf.keras.backend.epsilon())
        return diou

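2. Open problems
Although DIOU speeds up convergence by directly minimizing the distance between the center points of the predicted and ground-truth boxes, it does not yet consider another important factor in bounding-box regression, the aspect ratio.
IV. CIOU Loss
CIOU Loss and DIOU Loss come from the same 2020 paper. CIOU builds on DIOU by adding the aspect ratio of the bounding box to the loss function, which further improves regression accuracy.
1. Properties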
The CIOU penalty adds an influence factor $\alpha\upsilon$ to the DIOU penalty, which accounts for how well the aspect ratio of the predicted box fits that of the ground-truth box:
$$R_{CIOU} = \frac{\rho^{2}\left(b, b^{gt}\right)}{c^{2}} + \alpha\upsilon$$
where $\alpha$ is a trade-off weight:
$$\alpha = \frac{\upsilon}{\left(1 - IOU\right) + \upsilon}$$
and $\upsilon$ measures the consistency of the aspect ratios:
$$\upsilon = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}$$
The complete CIOU loss function is:
$$L_{CIOU} = 1 - IOU + \frac{\rho^{2}\left(b, b^{gt}\right)}{c^{2}} + \alpha\upsilon$$
When the width and height lie in $[0, 1]$, $w^{2} + h^{2}$ is usually very small, so the factor $\frac{1}{w^{2} + h^{2}}$ in the gradient of $\upsilon$ with respect to $w$ and $h$ can cause gradient explosion. In the implementation, $\frac{1}{w^{2} + h^{2}}$ is therefore usually replaced by 1.
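To get a feel for the aspect-ratio terms, the sketch below evaluates $\upsilon$ and $\alpha$ for two hand-picked cases. The function name `ciou_aspect_terms` and the small `eps` added to the denominator are assumptions of this example (the `eps` only guards against 0/0 when IOU = 1 and $\upsilon$ = 0); they are not part of the formulas above.

```python
import math

def ciou_aspect_terms(w, h, w_gt, h_gt, iou, eps=1e-9):
    """Return (v, alpha) of the CIoU penalty for the given widths, heights and IoU."""
    # v: squared, normalized difference of the aspect-ratio angles
    v = (4.0 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2
    # alpha: trade-off weight; eps is an assumption to avoid dividing by 0
    alpha = v / ((1.0 - iou) + v + eps)
    return v, alpha

# same aspect ratio (2:1) at different sizes: the penalty vanishes
v_same, alpha_same = ciou_aspect_terms(2.0, 1.0, 4.0, 2.0, iou=0.25)
# tall prediction against a wide target: the penalty is active
v_diff, alpha_diff = ciou_aspect_terms(1.0, 2.0, 2.0, 1.0, iou=0.5)
print(v_same, alpha_same, v_diff, alpha_diff)
```

Because $\upsilon$ depends only on the ratio w/h, two boxes of very different size but equal aspect ratio get no penalty from this term, which is the limitation EIOU targets later.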
Python implementation:

    import math

    def calculate_ciou(box_1, box_2):
        """
        calculate ciou
        :param box_1: (x0, y0, x1, y1)
        :param box_2: (x0, y0, x1, y1)
        :return: value of ciou
        """
        # calculate width, height and area of each box
        width_1 = box_1[2] - box_1[0]
        height_1 = box_1[3] - box_1[1]
        area_1 = width_1 * height_1
        width_2 = box_2[2] - box_2[0]
        height_2 = box_2[3] - box_2[1]
        area_2 = width_2 * height_2
        # calculate center point of each box
        center_x1 = (box_1[0] + box_1[2]) / 2
        center_y1 = (box_1[1] + box_1[3]) / 2
        center_x2 = (box_2[0] + box_2[2]) / 2
        center_y2 = (box_2[1] + box_2[3]) / 2
        # calculate square of center point distance
        p2 = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
        # calculate square of the diagonal length of the minimum enclosing box
        width_c = max(box_1[2], box_2[2]) - min(box_1[0], box_2[0])
        height_c = max(box_1[3], box_2[3]) - min(box_1[1], box_2[1])
        c2 = width_c ** 2 + height_c ** 2
        # find the edges of the intersection box
        left = max(box_1[0], box_2[0])
        top = max(box_1[1], box_2[1])
        bottom = min(box_1[3], box_2[3])
        right = min(box_1[2], box_2[2])
        # calculate the intersection area (0 if the boxes do not overlap)
        area_intersection = max(0.0, right - left) * max(0.0, bottom - top)
        # calculate the union area
        area_union = area_1 + area_2 - area_intersection
        # calculate iou
        iou = float(area_intersection) / area_union
        # calculate v
        arctan = math.atan(float(width_2) / height_2) - math.atan(float(width_1) / height_1)
        v = (4.0 / math.pi ** 2) * (arctan ** 2)
        # calculate alpha (set to 0 when v is 0 to avoid 0 / 0 for identical boxes)
        alpha = float(v) / (1 - iou + v) if v > 0 else 0.0
        # calculate ciou (iou - p2 / c2 - alpha * v)
        ciou = iou - float(p2) / c2 - alpha * v
        return ciou

TensorFlow implementation:

    import math
    import tensorflow as tf

    def box_ciou(self, boxes_1, boxes_2):
        """
        calculate ciou (written as a method; self is not used)
        :param boxes_1: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :param boxes_2: tensor of shape [..., 4] as (x_center, y_center, w, h)
        :return: ciou, tensor of shape [...]
        """
        # calculate squared center distance
        center_distance = tf.reduce_sum(tf.square(boxes_1[..., :2] - boxes_2[..., :2]), axis=-1)
        # calculate v from the aspect ratios (atan2(w, h) equals atan(w / h) for h > 0)
        v = 4 * tf.square(tf.math.atan2(boxes_1[..., 2], boxes_1[..., 3]) - tf.math.atan2(boxes_2[..., 2], boxes_2[..., 3])) / (math.pi * math.pi)
        # transform [x, y, w, h] to [x_min, y_min, x_max, y_max]
        boxes_1 = tf.concat([boxes_1[..., :2] - boxes_1[..., 2:] * 0.5,
                             boxes_1[..., :2] + boxes_1[..., 2:] * 0.5], axis=-1)
        boxes_2 = tf.concat([boxes_2[..., :2] - boxes_2[..., 2:] * 0.5,
                             boxes_2[..., :2] + boxes_2[..., 2:] * 0.5], axis=-1)
        # keep (x_min, y_min) <= (x_max, y_max) even if w or h is negative
        boxes_1 = tf.concat([tf.minimum(boxes_1[..., :2], boxes_1[..., 2:]),
                             tf.maximum(boxes_1[..., :2], boxes_1[..., 2:])], axis=-1)
        boxes_2 = tf.concat([tf.minimum(boxes_2[..., :2], boxes_2[..., 2:]),
                             tf.maximum(boxes_2[..., :2], boxes_2[..., 2:])], axis=-1)
        # calculate area of boxes_1 boxes_2
        boxes_1_area = (boxes_1[..., 2] - boxes_1[..., 0]) * (boxes_1[..., 3] - boxes_1[..., 1])
        boxes_2_area = (boxes_2[..., 2] - boxes_2[..., 0]) * (boxes_2[..., 3] - boxes_2[..., 1])
        # calculate the two corners of the intersection
        left_up = tf.maximum(boxes_1[..., :2], boxes_2[..., :2])
        right_down = tf.minimum(boxes_1[..., 2:], boxes_2[..., 2:])
        # calculate area of intersection (clamped to 0 when the boxes do not overlap)
        inter_section = tf.maximum(right_down - left_up, 0.0)
        inter_area = inter_section[..., 0] * inter_section[..., 1]
        # calculate union area
        union_area = boxes_1_area + boxes_2_area - inter_area
        # calculate iou, add epsilon in denominator to avoid dividing by 0
        iou = inter_area / (union_area + tf.keras.backend.epsilon())
        # calculate the upper left and lower right corners of the minimum enclosing box
        enclose_left_up = tf.minimum(boxes_1[..., :2], boxes_2[..., :2])
        enclose_right_down = tf.maximum(boxes_1[..., 2:], boxes_2[..., 2:])
        # calculate width and height of the minimum enclosing box
        enclose_wh = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
        # calculate squared diagonal of the minimum enclosing box
        enclose_diagonal = tf.reduce_sum(tf.square(enclose_wh), axis=-1)
        # calculate diou
        diou = iou - 1.0 * center_distance / (enclose_diagonal + tf.keras.backend.epsilon())
        # calculate alpha, add epsilon in denominator to avoid 0 / 0 for identical boxes
        alpha = v / (1.0 - iou + v + tf.keras.backend.epsilon())
        # calculate ciou
        ciou = diou - alpha * v
        return ciou

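2. Open problems
The design of the aspect-ratio weight is not yet well understood; whether a better design exists remains to be updated.
V. EIOU Loss
Although the CIOU Loss considers the overlap area, the center-point distance and the aspect ratio of bounding-box regression, its $\upsilon$ term reflects the difference in aspect ratio rather than the real differences in width and height respectively (and their confidences), so it can sometimes hinder the model from optimizing similarity effectively. To address this, the 2021 paper "Focal and Efficient IOU Loss for Accurate Bounding Box Regression" splits the aspect-ratio term of CIOU apart to obtain the EIOU Loss, and adds a Focal term that focuses on high-quality anchor boxes.
Paper link: https://arxiv.org/pdf/2101.08158.pdf
1. Properties
The EIOU penalty splits the aspect-ratio factor of the CIOU penalty into separate width and height terms for the target box and the anchor box. The loss has three parts: overlap loss, center-distance loss, and width-height loss. The first two follow CIOU, while the width-height loss directly minimizes the differences in width and height between the target box and the anchor box, which makes convergence faster. The penalty formula is: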
![v2-ede5791bb69791c73c049793bac359ec_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/0b7aae5173c4e6743f0957cd76e3add4.jpeg)
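where $C_w$ and $C_h$ are the width and height of the smallest enclosing box that covers both boxes.
Bounding-box regression also suffers from imbalanced training samples: in an image, high-quality anchors with small regression error are far fewer than low-quality samples with large error, and the poor samples produce excessively large gradients that disturb training. The authors therefore combine EIOU with Focal Loss and propose the Focal EIOU Loss, which separates high-quality anchors from low-quality ones from the perspective of the gradient. The formula is: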
![v2-279518adff451c30fd2b744858718acd_b.jpg](https://i-blog.csdnimg.cn/blog_migrate/c5e2b993b1a86d04121ef24ffda97751.png)
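where IOU = |A∩B|/|A∪B| and γ is a parameter that controls the degree of outlier suppression. The Focal term in this loss differs from the conventional Focal Loss. The conventional Focal Loss assigns a larger loss to harder samples and therefore acts as hard-example mining. According to the formula above, a higher IOU gives a larger weight on the loss, which works as a reweighting: better regression targets receive a larger loss, which helps improve regression accuracy.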
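The article gives no code for EIOU, so here is a minimal plain-Python sketch. It assumes the penalty has the form $1 - IOU + \rho^{2}(b, b^{gt})/c^{2} + (w - w^{gt})^{2}/C_w^{2} + (h - h^{gt})^{2}/C_h^{2}$ and that the Focal variant multiplies this loss by $IOU^{\gamma}$, as described above. The function names and the default `gamma=0.5` are illustrative choices, not taken from the article.

```python
def eiou_terms(box_p, box_g):
    """Return (IoU, L_EIoU) for two boxes given as (x0, y0, x1, y1)."""
    w_p, h_p = box_p[2] - box_p[0], box_p[3] - box_p[1]
    w_g, h_g = box_g[2] - box_g[0], box_g[3] - box_g[1]
    inter_w = max(0.0, min(box_p[2], box_g[2]) - max(box_p[0], box_g[0]))
    inter_h = max(0.0, min(box_p[3], box_g[3]) - max(box_p[1], box_g[1]))
    inter = inter_w * inter_h
    iou = inter / (w_p * h_p + w_g * h_g - inter)
    # width C_w and height C_h of the smallest enclosing box
    c_w = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    c_h = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    # squared distance between the two box centers
    rho2 = ((box_p[0] + box_p[2] - box_g[0] - box_g[2]) / 2.0) ** 2 \
        + ((box_p[1] + box_p[3] - box_g[1] - box_g[3]) / 2.0) ** 2
    loss = (1.0 - iou                           # overlap loss
            + rho2 / (c_w ** 2 + c_h ** 2)      # center-distance loss
            + (w_p - w_g) ** 2 / c_w ** 2       # width loss
            + (h_p - h_g) ** 2 / c_h ** 2)      # height loss
    return iou, loss

def focal_eiou_loss(box_p, box_g, gamma=0.5):
    """Reweight the EIoU loss by IoU ** gamma (gamma is an illustrative value)."""
    iou, loss = eiou_terms(box_p, box_g)
    return iou ** gamma * loss

gt = (0.0, 0.0, 4.0, 4.0)
pred = (0.0, 0.0, 2.0, 2.0)
iou_value, eiou_value = eiou_terms(pred, gt)
focal_value = focal_eiou_loss(pred, gt)
print(iou_value, eiou_value, focal_value)
```

The IoU-based factor shrinks the loss of anchors that barely overlap the target, so their gradients weigh less in the total.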
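2. Open problems
For bounding-box regression, the paper proposes two improvements on top of the CIOU loss:
- The aspect-ratio loss term is split into the differences between the predicted and ground-truth widths and heights, each normalized by the width or height of the smallest enclosing box, which speeds up convergence and improves regression accuracy;
- A Focal term is introduced to handle sample imbalance in bounding-box regression: it reduces the contribution of the many anchors that overlap little with the target box, so that regression focuses on high-quality anchors.
A possible weakness is that the form of the Focal term may still leave room for improvement.
VI. Comparison of IOU, GIOU, DIOU, CIOU and EIOU
The three key geometric factors in bounding-box regression are overlap area, center-point distance and aspect ratio.
- IOU Loss: considers the overlap area and normalizes away the coordinate scale;
- GIOU Loss: considers the overlap area and, building on IOU, solves the vanishing gradient when the boxes do not intersect;
- DIOU Loss: considers the overlap area and the center-point distance and, building on IOU, solves the slow convergence of GIOU;
- CIOU Loss: considers the overlap area, the center-point distance and the aspect ratio and, building on DIOU, improves regression accuracy;
- EIOU Loss: considers the overlap area, the center-point distance and the real differences in width and height; building on CIOU, it removes the ambiguity of the aspect-ratio definition and adds a Focal term to handle sample imbalance in bounding-box regression.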
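| | IOU Loss | GIOU Loss | DIOU Loss | CIOU Loss | EIOU Loss |
|---|---|---|---|---|---|
| Advantages | IOU is the most commonly used metric in object detection; it is scale invariant, and 1 - IOU satisfies non-negativity, identity, symmetry and the triangle inequality. | Builds on IOU and introduces the smallest enclosing box, which solves the vanishing gradient when the predicted and ground-truth boxes do not overlap. | Builds on IOU and, to address the shortcomings of GIOU, directly regresses the Euclidean distance between the two box centers, which speeds up convergence. | Adds an aspect-ratio (width and height) term to DIOU so that the predicted box fits the ground-truth box better. | Replaces the aspect ratio of CIOU with separate width and height differences, and introduces a Focal term to handle the imbalance between easy and hard samples. |
| Disadvantages | 1. If the two boxes do not intersect, it cannot reflect how far apart they are. 2. It cannot precisely reflect how the two boxes overlap. | 1. When one box contains the other, GIOU degenerates to IOU. 2. When the boxes intersect, convergence is slow in the horizontal and vertical directions. | The aspect ratio of the bounding box is not considered during regression, so accuracy can still be improved. | 1. The aspect ratio is a relative value and leaves some ambiguity. 2. The balance between easy and hard samples is not considered. | To be determined |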