YOLO Details

Hpatron

Category: Deep Learning. Tags: neural networks, deep learning, tensorflow

Article link: https://blog.csdn.net/qq_35513792/article/details/104861707

YOLOv3 adopts the sample assignment strategy of Faster R-CNN and divides samples into three states: positive, negative, and ignore.

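During detector training, the IoU between each non-positive predicted box and the ground-truth boxes of the positive samples is computed. If a non-positive sample's IoU with every one of them stays at or below a given threshold (0.5 in the paper), it is marked as negative. The remaining non-positive samples, which overlap some object well but are not the assigned positive, are put in the ignore state and contribute nothing to the objectness loss.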
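The three states described above can be sketched with NumPy. This is only an illustration under stated assumptions, not the code of any particular YOLOv3 implementation: the names `box_iou`, `sample_states`, `positive_mask`, and `ignore_thresh` are hypothetical, boxes are assumed to be in `(x_min, y_min, x_max, y_max)` form, and at least one ground-truth box is assumed to exist.

```python
import numpy as np


def box_iou(pred_boxes, gt_boxes):
    """Pairwise IoU between [M, 4] and [K, 4] corner-format boxes, shape [M, K]."""
    # Broadcast to [M, K, 2] to get the corners of each intersection rectangle.
    inter_mins = np.maximum(pred_boxes[:, None, :2], gt_boxes[None, :, :2])
    inter_maxs = np.minimum(pred_boxes[:, None, 2:], gt_boxes[None, :, 2:])
    inter_wh = np.clip(inter_maxs - inter_mins, 0.0, None)
    inter_area = inter_wh[..., 0] * inter_wh[..., 1]
    pred_area = (pred_boxes[:, 2] - pred_boxes[:, 0]) * (pred_boxes[:, 3] - pred_boxes[:, 1])
    gt_area = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    return inter_area / (pred_area[:, None] + gt_area[None, :] - inter_area + 1e-10)


def sample_states(pred_boxes, gt_boxes, positive_mask, ignore_thresh=0.5):
    """Label every predicted box as 'positive', 'negative', or 'ignore'."""
    # Best overlap of each prediction with any ground-truth box, shape [M].
    best_iou = box_iou(pred_boxes, gt_boxes).max(axis=1)
    states = np.full(len(pred_boxes), "negative", dtype="<U8")
    # Non-positive boxes that still overlap an object well are ignored.
    states[best_iou > ignore_thresh] = "ignore"
    # Positives are decided elsewhere (grid cell + best anchor) and take priority.
    states[positive_mask] = "positive"
    return states
```

In a loss function, the "ignore" entries would simply be masked out of the no-object term, so a reasonably good prediction is not penalized for not being the assigned one.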

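For positive samples, YOLOv3 works the same way as YOLO. YOLO's grid partition is itself a form of prior: whichever grid cell the object's center falls into is the cell that trains the detector for that object. However, each cell produces detection boxes from 3 anchors, so how is a single ground-truth box assigned to supervise the detection boxes of those 3 anchors?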
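The "center falls into a cell" rule is simple arithmetic: divide the box center by the stride of the feature map and take the floor. The sketch below assumes corner-format boxes in pixels and strides of 32, 16, and 8; the function name `responsible_cell` and the example box are hypothetical.

```python
import numpy as np


def responsible_cell(box, stride):
    """Return (row, col) of the grid cell that contains the center of the box.

    box is (x_min, y_min, x_max, y_max) in pixels; stride is the downsampling
    factor of the feature map (for example 32 for the coarsest scale).
    """
    x_center = (box[0] + box[2]) / 2.0
    y_center = (box[1] + box[3]) / 2.0
    col = int(np.floor(x_center / stride))
    row = int(np.floor(y_center / stride))
    return row, col


# Hypothetical box whose center is at (140, 230).
box = np.array([100.0, 200.0, 180.0, 260.0])
cells = {stride: responsible_cell(box, stride) for stride in (32, 16, 8)}
```

The same object maps to a different cell on each scale, which is why the scale has to be chosen before the cell index is computed.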

Reading a TensorFlow implementation of YOLOv3, I found that this assignment is done up front, when the dataset is loaded:

import numpy as np


def process_box(boxes, labels, img_size, class_num, anchors):
    '''
    Generate the y_true label, i.e. the ground truth feature_maps in 3 different scales.
    params:
        boxes: [N, 5] shape, float32 dtype. `x_min, y_min, x_max, y_max, mixup_weight`.
        labels: [N] shape, int64 dtype.
        img_size: `(width, height)` of the network input, in pixels.
        class_num: int64 num.
        anchors: [9, 2] shape, float32 dtype. `(width, height)` of each anchor.
    '''
    anchors_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]

    # convert boxes form:
    # shape: [N, 2]
    # (x_center, y_center)
    box_centers = (boxes[:, 0:2] + boxes[:, 2:4]) / 2
    # (width, height)
    box_sizes = boxes[:, 2:4] - boxes[:, 0:2]

    # [13, 13, 3, 5+num_class+1] `5` means coords and labels. `1` means mix up weight. 
    y_true_13 = np.zeros((img_size[1] // 32, img_size[0] // 32, 3, 6 + class_num), np.float32)
    y_true_26 = np.zeros((img_size[1] // 16, img_size[0] // 16, 3, 6 + class_num), np.float32)
    y_true_52 = np.zeros((img_size[1] // 8, img_size[0] // 8, 3, 6 + class_num), np.float32)

    # mix up weight default to 1.
    y_true_13[..., -1] = 1.
    y_true_26[..., -1] = 1.
    y_true_52[..., -1] = 1.

    y_true = [y_true_13, y_true_26, y_true_52]

    # [N, 1, 2]
    box_sizes = np.expand_dims(box_sizes, 1)
    # broadcast tricks
    # [N, 1, 2] & [9, 2] ==> [N, 9, 2]
    mins = np.maximum(- box_sizes / 2, - anchors / 2)
    maxs = np.minimum(box_sizes / 2, anchors / 2)
    # [N, 9, 2]
    whs = maxs - mins

    # [N, 9]
    iou = (whs[:, :, 0] * whs[:, :, 1]) / (
        box_sizes[:, :, 0] * box_sizes[:, :, 1]
        + anchors[:, 0] * anchors[:, 1]
        - whs[:, :, 0] * whs[:, :, 1]
        + 1e-10)
    # [N]
    best_match_idx = np.argmax(iou, axis=1)

    ratio_dict = {1.: 8., 2.: 16., 3.: 32.}
    for i, idx in enumerate(best_match_idx):
        # idx: 0,1,2 ==> 2; 3,4,5 ==> 1; 6,7,8 ==> 0
        feature_map_group = 2 - idx // 3
        # scale ratio: 0,1,2 ==> 8; 3,4,5 ==> 16; 6,7,8 ==> 32
        ratio = ratio_dict[np.ceil((idx + 1) / 3.)]
        x = int(np.floor(box_centers[i, 0] / ratio))
        y = int(np.floor(box_centers[i, 1] / ratio))
        k = anchors_mask[feature_map_group].index(idx)
        c = labels[i]
        # print(feature_map_group, '|', y,x,k,c)

        y_true[feature_map_group][y, x, k, :2] = box_centers[i]
        y_true[feature_map_group][y, x, k, 2:4] = box_sizes[i]
        y_true[feature_map_group][y, x, k, 4] = 1.
        y_true[feature_map_group][y, x, k, 5 + c] = 1.
        y_true[feature_map_group][y, x, k, -1] = boxes[i, -1]

    return y_true_13, y_true_26, y_true_52

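This code computes the IoU between each ground-truth box and the 9 anchors. Only width and height are involved, because the goal is simply to find the anchor whose size is most similar to the ground-truth box. The box is then written into the slot of that anchor, in the grid cell containing its center, on the scale the anchor belongs to. The other, less similar anchors are not supervised as positive samples for this box.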
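The width-and-height-only matching can be isolated in a few lines. The sketch below is a simplified, single-box version of the idea, not the function above: `best_anchor` is a hypothetical helper, the anchor values are assumed to be the COCO defaults from the YOLOv3 paper (ordered small to large), and the scale comments assume a 416x416 input.

```python
import numpy as np

# Anchor sizes (width, height) in pixels. Assumption: the COCO defaults
# from the YOLOv3 paper, ordered from smallest to largest.
anchors = np.array([[10, 13], [16, 30], [33, 23],
                    [30, 61], [62, 45], [59, 119],
                    [116, 90], [156, 198], [373, 326]], dtype=np.float32)


def best_anchor(box_wh, anchors):
    """Index of the anchor with the highest IoU against box_wh.

    Both rectangles are treated as sharing the same center, so the
    intersection is just min(width) * min(height).
    """
    inter = np.minimum(box_wh[0], anchors[:, 0]) * np.minimum(box_wh[1], anchors[:, 1])
    union = box_wh[0] * box_wh[1] + anchors[:, 0] * anchors[:, 1] - inter
    return int(np.argmax(inter / (union + 1e-10)))


# Hypothetical 60x110 ground-truth box.
idx = best_anchor(np.array([60.0, 110.0]), anchors)
scale = 2 - idx // 3  # 0 -> 13x13 map, 1 -> 26x26 map, 2 -> 52x52 map
slot = idx % 3        # position of the anchor among the 3 anchors of its scale
```

Because the comparison ignores position, one ground-truth box ends up as a positive sample for exactly one anchor on exactly one scale; the position only decides which grid cell on that scale receives it.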
