The YOLOv3 build_targets function

Dandelion_2

Categories: Deep Learning, pytorch. Tags: pytorch, deep learning, python

Article link: https://blog.csdn.net/dandelion_2/article/details/121576424

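import torch
import torch.nn as nn

# wh_iou (width/height IoU between anchors and boxes) is assumed to be defined in the same module.


def build_targets(p, targets, model):
    # p: list with one prediction tensor per YOLO layer; each p[i] has shape
    #    (batch_size, anchor_num, grid_y, grid_x, xywh+obj_confidence+cls_num)
    # targets: shape (num_groundtruth, 6), where the 6 columns are img_index, cls_index, x, y, w, h
    # model: the whole YOLO model, used to look up each YoloLayer and the anchors that belong to it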

    # Build targets for compute_loss(), input targets(image,class,x,y,w,h)

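    # nt: number of ground-truth boxes (the first dimension of targets)
    nt = targets.shape[0]
    # tcls: class indices of the matched gt boxes
    # tbox: matched gt boxes as (tx, ty, w, h), where tx, ty are the offsets of the gt centre from the
    #       top-left corner of its grid cell and w, h are in grid units
    # indices: where each matched gt lives, as (image_index, anchor_index, grid_y, grid_x)
    # anch: the anchor size (w, h) matched to each gt
    # Each is a list with one entry per YOLO layer; these four lists are what build_targets returns
    tcls, tbox, indices, anch = [], [], [], []
    # gain holds the factors that scale the normalized target coordinates up to the grid size of the
    # current YOLO layer. Here it is only initialized as a 6-element tensor of ones; it is filled in inside the loop.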
    gain = torch.ones(6, device=targets.device)  # normalized to gridspace gain

    multi_gpu = type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel)
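    # model.yolo_layers is a member of the model: the list of module indices of the YOLO layers,
    # e.g. [89, 101, 113] (the exact indices depend on the cfg)
    for i, j in enumerate(model.yolo_layers):
        # anchors that belong to this YOLO layer
        anchors = model.module.module_list[j].anchor_vec if multi_gpu else model.module_list[j].anchor_vec
        # p[i] is the output of the i-th YOLO layer, with shape
        # (bs, anchor, grid_y, grid_x, xywh+obj_confidence+cls_num), so shape[[3, 2, 3, 2]] is
        # (grid_x, grid_y, grid_x, grid_y), matching the x, y, w, h columns of targets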
        gain[2:] = torch.tensor(p[i].shape)[[3, 2, 3, 2]]  # xyxy gain
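        # na: number of anchors of this layer (the first dimension of anchors), 3 here
        na = anchors.shape[0]  # number of anchors
        # at has shape (na, nt): row 0 is all 0, row 1 is all 1, row 2 is all 2,
        # so at[k, n] == k is the anchor index of every (anchor, target) pair
        # [na] -> [na, 1] -> [na, nt]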
        at = torch.arange(na).view(na, 1).repeat(1, nt) # anchor tensor, same as .repeat_interleave(nt)

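        # Rescale the gt boxes to the feature-map (grid) scale
        # Match targets to anchors
        # At this point gain is [1., 1., grid_x, grid_y, grid_x, grid_y], and each of the nt rows of
        # targets is [img_index, cls_index, x, y, w, h] with x, y, w, h normalized to [0, 1].
        # targets * gain multiplies every row element-wise, so x, y, w, h of all gt boxes are rescaled
        # to the feature map of the current YOLO layer, while img_index and cls_index are unchanged.
        # a will hold the anchor index used by each matched target.
        # offsets is 0 and is never changed in this function, so subtracting it when the grid-cell
        # corner is computed below has no effect.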
        a, t, offsets = [], targets * gain, 0
        if nt:  # only if there is at least one target
            # iou_t = 0.20
            # j: [na, nt] boolean mask
            # wh_iou receives the anchors (na, 2) and t[:, 4:6] (nt, 2), i.e. only the w, h columns of t;
            # t itself has shape (nt, 6), for example (228, 6).
            # wh_iou is the IoU computed from widths and heights only; j[k, n] is True when anchor k and
            # target n have a wh IoU above iou_t, and False otherwise.
            j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t']  # iou(3,n) = wh_iou(anchors(3,2), gwh(n,2))
            # t.repeat(na, 1, 1): [nt, 6] -> [na, nt, 6]
            # Matching rule: a gt is kept once for every anchor whose wh IoU with it exceeds iou_t,
            # so one gt may be matched to several anchors, or to none.
            # at[j] picks the anchor index of every True entry of j, so a[k] is the anchor index of the
            # k-th kept (anchor, gt) pair.
            # t.repeat(na, 1, 1)[j] picks the target row of the same entries, so a and t stay aligned.
            # The final t has shape (final_gt_num, 6), with columns img_index, cls_index, x, y, w, h.
            a, t = at[j], t.repeat(na, 1, 1)[j]  # filter

        # Define
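        # .long() is .to(torch.int64); it truncates toward zero, which equals floor for the
        # non-negative grid coordinates used here
        # t[:, :2].long().T takes the first two columns of t (img_index and cls_index) as two 1-D tensors
        # b and c each have final_gt_num elements, with 0 <= final_gt_num <= na * nt
        # Remember that t is already at the feature-map scale of this YOLO layer
        # gxy = t[:, 2:4] is the gt centre (x, y); gwh = t[:, 4:6] is its size (w, h)
        # gij = (gxy - offsets).long(): offsets is 0, so this simply truncates gxy, which gives the
        # top-left corner (gi, gj) of the grid cell that contains the gt centre
        # Writing gx = gi + tx with tx in [0, 1) (and gy = gj + ty with ty in [0, 1)), long() drops
        # tx and leaves gi, the grid coordinate of the gt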
        b, c = t[:, :2].long().T  # image, class
        gxy = t[:, 2:4]  # grid xy
        gwh = t[:, 4:6]  # grid wh
        gij = (gxy - offsets).long()  # top-left corner of the grid cell containing each target
        gi, gj = gij.T  # grid xy indices

        # Append
        # Each of the four lists gets one entry per YOLO layer (n = final_gt_num for this layer):
        # indices: tuple (img_index, anchor_index, grid_y, grid_x) of four 1-D tensors of length n
        # tbox: (n, 4) tensor (tx, ty, w, h); tx, ty are offsets inside the grid cell, w, h are in grid units
        # tcls: (n,) tensor of class indices
        # anch: (n, 2) tensor with the anchor size (w, h) used by each target
        indices.append((b, a, gj, gi))  # image, anchor, grid indices (y, x)
        tbox.append(torch.cat((gxy - gij, gwh), 1))  # x, y offsets of the gt centre within its grid cell, plus w, h
        anch.append(anchors[a])  # anchors
        tcls.append(c)  # class
        if c.shape[0]:  # if any targets
            # class labels must be smaller than the number of classes model.nc
            assert c.max() < model.nc, 'Model accepts %g classes labeled from 0-%g, however you labelled a class %g. ' \
                                       'See https://github.com/ultralytics/yolov3/wiki/Train-Custom-Data' % (
                                           model.nc, model.nc - 1, c.max())

    return tcls, tbox, indices, anch
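
To make the rescaling and the cell/offset split in the function above concrete, here is a small plain-Python sketch for a single target. It is only an illustration: the helper name `to_grid`, the 13 x 13 grid and the sample target are assumptions, and the real function does the same arithmetic on whole tensors through `targets * gain`, `.long()` and `gxy - gij`.

```python
def to_grid(target, grid_w, grid_h):
    """Scale one normalized target to grid units and split off the cell index.

    target is (img_index, cls_index, x, y, w, h) with x, y, w, h in [0, 1].
    Returns ((gj, gi), (tx, ty, gw, gh)). Hypothetical helper, not part of the post.
    """
    _, _, x, y, w, h = target
    # Same role as gain = [1, 1, grid_x, grid_y, grid_x, grid_y].
    gx, gy, gw, gh = x * grid_w, y * grid_h, w * grid_w, h * grid_h
    # int() truncates toward zero, like .long() on non-negative values.
    gi, gj = int(gx), int(gy)
    # Offsets of the centre inside its cell, each in [0, 1).
    tx, ty = gx - gi, gy - gj
    return (gj, gi), (tx, ty, gw, gh)


# Hypothetical target: image 0, class 2, centred at (0.5, 0.25), size 0.25 x 0.5.
cell, box = to_grid((0, 2, 0.5, 0.25, 0.25, 0.5), grid_w=13, grid_h=13)
print(cell, box)
```

The cell is returned as (gj, gi), i.e. row first, to mirror the order in which `indices` stores the grid coordinates.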
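
The anchor-matching step calls `wh_iou`, which the listing uses but does not define. As a rough illustration only, the sketch below redoes the matching in plain Python without tensors. It assumes the width/height IoU is the overlap of two boxes aligned at a shared corner; the names `wh_iou_scalar` and `match_anchors` and all the numbers are hypothetical.

```python
def wh_iou_scalar(wh1, wh2):
    """IoU of two boxes given only (w, h), assuming they share a corner."""
    inter = min(wh1[0], wh2[0]) * min(wh1[1], wh2[1])
    union = wh1[0] * wh1[1] + wh2[0] * wh2[1] - inter
    return inter / union


def match_anchors(anchors, target_whs, iou_t=0.2):
    """Return (anchor_index, target_index) for every pair with wh IoU > iou_t.

    Pairs are listed anchor by anchor, mirroring how at[j] and
    t.repeat(na, 1, 1)[j] walk the (na, nt) boolean mask row by row.
    """
    pairs = []
    for a_idx, anchor in enumerate(anchors):
        for t_idx, wh in enumerate(target_whs):
            if wh_iou_scalar(anchor, wh) > iou_t:
                pairs.append((a_idx, t_idx))
    return pairs


# Hypothetical anchors and target sizes, all in grid units.
anchors = [(1.5, 2.0), (2.0, 4.0), (8.0, 8.0)]
target_whs = [(2.0, 3.0), (0.1, 0.1)]

pairs = match_anchors(anchors, target_whs)
print(pairs)
```

With these numbers the first target passes the threshold for two anchors and the second for none, which is the "one gt, several anchors or no anchor" behaviour of the filter step.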
