boxlist_iou

呆呆囧想学C++

Article link: https://blog.csdn.net/m0_46011550/article/details/135209719

Source code

import torch


def boxlist_iou(boxlist1, boxlist2):
    """Compute the intersection over union of two sets of boxes.
    The box order must be (xmin, ymin, xmax, ymax).

    Arguments:
      boxlist1: (BoxList) bounding boxes, sized [N,4].
      boxlist2: (BoxList) bounding boxes, sized [M,4].

    Returns:
      (tensor) iou, sized [N,M].

    Reference:
      https://github.com/chainer/chainercv/blob/master/chainercv/utils/bbox/bbox_iou.py
    """
    if boxlist1.size != boxlist2.size:
        raise RuntimeError(
                "boxlists should have same image size, got {}, {}".format(boxlist1, boxlist2))
    boxlist1 = boxlist1.convert("xyxy")
    boxlist2 = boxlist2.convert("xyxy")
    N = len(boxlist1)
    M = len(boxlist2)

    area1 = boxlist1.area()
    area2 = boxlist2.area()

    box1, box2 = boxlist1.bbox, boxlist2.bbox

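    lt = torch.max(box1[:, None, :2], box2[:, :2])  # [N,M,2]: each of the N boxes is compared with each of the M boxes, giving N*M pairs

    # Indexing with None inserts a new axis at position 1 of box1[:, None, :2],
    # giving shape (N, 1, 2), which then broadcasts against box2[:, :2] of shape (M, 2).
    # The result has shape (N, M, 2), where N and M are the numbers of boxes in the two sets.
    # Illustrative values for N = M = 2:
    # lt = tensor([[[2., 2.],   # first box of box1: element-wise max of its top-left corner with each box2 top-left corner
    #               [2., 2.]],
    #              [[3., 3.],   # second box of box1: same comparison against each box of box2
    #               [3., 3.]]])

    rb = torch.min(box1[:, None, 2:], box2[:, 2:])  # [N,M,2]: broadcasting makes the two shapes compatible in the same way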

    TO_REMOVE = 1

    wh = (rb - lt + TO_REMOVE).clamp(min=0)  # [N,M,2]: negative values become 0, so non-overlapping pairs get zero area
    inter = wh[:, :, 0] * wh[:, :, 1]  # [N,M]: e.g. wh = [[[20, 20], [10, 15]], [[15, 15], [0, 0]]] holds the intersection width and height of every pair of boxes

    iou = inter / (area1[:, None] + area2 - inter)
    return iou
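
The function above depends on the BoxList class, so it cannot be run on its own. Below is a minimal sketch of the same computation on plain tensors. The name pairwise_iou and the sample boxes are made up for illustration, and the area formula assumes the same +1 (inclusive pixel) convention that the intersection uses.

```python
import torch

TO_REMOVE = 1  # inclusive pixel coordinates: x from 0 to 9 covers 10 pixels


def pairwise_iou(box1, box2):
    """IoU of every box in box1 [N,4] with every box in box2 [M,4], xyxy order.

    Hypothetical stand-in for boxlist_iou that takes raw tensors.
    """
    # Areas, assuming the same +1 convention as the intersection below.
    area1 = (box1[:, 2] - box1[:, 0] + TO_REMOVE) * (box1[:, 3] - box1[:, 1] + TO_REMOVE)
    area2 = (box2[:, 2] - box2[:, 0] + TO_REMOVE) * (box2[:, 3] - box2[:, 1] + TO_REMOVE)

    lt = torch.max(box1[:, None, :2], box2[:, :2])  # [N,M,2] top-left of each intersection
    rb = torch.min(box1[:, None, 2:], box2[:, 2:])  # [N,M,2] bottom-right of each intersection

    wh = (rb - lt + TO_REMOVE).clamp(min=0)  # [N,M,2]
    inter = wh[:, :, 0] * wh[:, :, 1]        # [N,M]
    return inter / (area1[:, None] + area2 - inter)


box1 = torch.tensor([[0., 0., 9., 9.], [20., 20., 29., 29.]])                      # N = 2
box2 = torch.tensor([[0., 0., 9., 9.], [5., 5., 14., 14.], [50., 50., 59., 59.]])  # M = 3
iou = pairwise_iou(box1, box2)
print(iou.shape)  # torch.Size([2, 3])
print(iou)
```

The first row comes out as 1 for the identical box, 25/175 for the partly overlapping box, and 0 for the disjoint box. The second row is all zeros because that box overlaps nothing in box2.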

Example

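import torch

# Sample data
rb = torch.tensor([[[8, 8], [12, 12]],  # bottom-right corners, pairs for the first box
                  [[10, 10], [15, 15]]]) # bottom-right corners, pairs for the second box

lt = torch.tensor([[[2, 2], [5, 5]],    # top-left corners, pairs for the first box
                  [[3, 3], [8, 8]]])    # top-left corners, pairs for the second box

TO_REMOVE = 1

# Compute width and height; values below 0 are set to 0
wh = (rb - lt + TO_REMOVE).clamp(min=0)
print(wh)

# Compute the intersection area
inter = wh[:, :, 0] * wh[:, :, 1]

# Print the result
print(inter)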

tensor([[[7, 7],
         [8, 8]],

        [[8, 8],
         [8, 8]]])
tensor([[49, 64],
        [64, 64]])

Points of confusion

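Why is a dimension added before torch.max? So that the comparison runs over every pair of boxes. The extra axis lets broadcasting produce a slot for each pairwise result: box1 of shape (N, 2) becomes (N, 1, 2), and comparing it with box2 of shape (M, 2) compares each of the N boxes with each of the M boxes, so the result naturally has shape (N, M, 2).
clamp truncates a tensor's values to a given range. Here only a minimum of 0 is set, so negative widths and heights become 0.
Finally, in the IoU: if two boxes do not overlap, inter is 0, so the numerator is 0 and the result is 0.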
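
The three points above can be checked with a small sketch. The box values are made up for illustration, and the areas assume the same +1 convention as the intersection.

```python
import torch

# Two small sets of xyxy boxes (illustrative values).
box1 = torch.tensor([[0., 0., 4., 4.], [10., 10., 14., 14.]])                    # N = 2
box2 = torch.tensor([[2., 2., 6., 6.], [30., 30., 34., 34.], [0., 0., 4., 4.]])  # M = 3

# 1. The added axis: (N,1,2) against (M,2) broadcasts to (N,M,2).
print(box1[:, None, :2].shape)  # torch.Size([2, 1, 2])
lt = torch.max(box1[:, None, :2], box2[:, :2])
rb = torch.min(box1[:, None, 2:], box2[:, 2:])
print(lt.shape)                 # torch.Size([2, 3, 2])

# 2. clamp(min=0): a disjoint pair has negative width and height before clamping.
raw = rb - lt + 1
wh = raw.clamp(min=0)
print(raw[0, 1])  # tensor([-25., -25.])
print(wh[0, 1])   # tensor([0., 0.])

# 3. No overlap means inter = 0, so the IoU is 0.
inter = wh[:, :, 0] * wh[:, :, 1]
area1 = (box1[:, 2] - box1[:, 0] + 1) * (box1[:, 3] - box1[:, 1] + 1)
area2 = (box2[:, 2] - box2[:, 0] + 1) * (box2[:, 3] - box2[:, 1] + 1)
iou = inter / (area1[:, None] + area2 - inter)
print(iou[0])  # first box of box1 against each box of box2
```

For the first box of box1, the three IoU values are 9/41 (partial overlap), 0 (disjoint), and 1 (identical box).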