Keras-YOLOv3: What the Code Means

Latest recommended article published on 2023-11-23 19:59:50

lalahappy

Tags: keras, deep learning, python

Article link: https://blog.csdn.net/qq_42563807/article/details/122041593


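An anchor box has only a size (defined relative to the three feature-map scales) and no position. Every cell has three anchor boxes of different sizes.
Each cell predicts 3 bboxes; each bbox consists of x, y, w, h, a confidence score, and nb_class class scores.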

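YOLOv3 first has to build the targets. For a given ground-truth box, first determine which cell its center falls into, then compute the IoU between each anchor of that cell and the ground truth. This IoU ignores coordinates and compares shape only, because an anchor carries no x, y information.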

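So the anchor and the ground truth are first shifted so that their centers coincide (at the origin), and the IoU is computed from there. The anchor prior with the largest IoU is matched to the ground truth and takes part in training as a positive sample.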
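The shape-only IoU described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function names `shape_iou` and `best_anchor` and the three anchor sizes are assumptions chosen for the example.

```python
def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes that share the same center, so only width and height matter."""
    bw, bh = box_wh
    aw, ah = anchor_wh
    # With a common center, the overlap is the smaller width times the smaller height.
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Return (index, IoU) of the anchor whose shape best matches the box."""
    ious = [shape_iou(box_wh, a) for a in anchors]
    best = max(range(len(anchors)), key=lambda i: ious[i])
    return best, ious[best]


# Hypothetical anchor sizes (width, height) in pixels for one output scale.
anchors = [(10, 13), (16, 30), (33, 23)]
idx, iou = best_anchor((30, 25), anchors)
```

Here the 30×25 ground-truth box is matched to the third anchor, (33, 23), because their shapes overlap the most once both are centered at the origin.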

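So how are the positive samples found?
The label stores [image, class, x (normalized), y, w (normalized), h]. With these coordinates we can go to the corresponding 13×13, 26×26, or 52×52 map, compute the IoU against each of the 9 anchors, pick the ones that meet the requirement, and record their index and position.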

The recorded index and position are then used to look up the predicted anchor box.
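A rough sketch of recording the index and position for one label is given below. It rests on several assumptions that the text does not confirm: all of x, y, w, h are normalized to [0, 1], the input is 416×416, the 9 anchors are the YOLOv3 defaults sorted from small to large, and each group of three anchors belongs to one grid (52, 26, 13 in that order). The name `build_target_index` is hypothetical.

```python
def shape_iou(box_wh, anchor_wh):
    """Shape-only IoU: both boxes are treated as centered at the origin."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def build_target_index(label_xywh, anchors, grids, input_size):
    """Return (scale, row, col, anchor_in_scale) for one ground-truth box."""
    x, y, w, h = label_xywh                      # assumed normalized to [0, 1]
    box_wh = (w * input_size, h * input_size)    # back to pixels, like the anchors
    best = max(range(len(anchors)), key=lambda i: shape_iou(box_wh, anchors[i]))
    per_scale = len(anchors) // len(grids)       # 9 anchors / 3 scales = 3
    scale, anchor_in_scale = divmod(best, per_scale)
    grid = grids[scale]
    col = min(int(x * grid), grid - 1)           # cell containing the box center
    row = min(int(y * grid), grid - 1)
    return scale, row, col, anchor_in_scale


# Assumed defaults: YOLOv3 anchors (pixels) and the matching grid sizes.
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45), (59, 119),
           (116, 90), (156, 198), (373, 326)]
GRIDS = [52, 26, 13]

target = build_target_index((0.5, 0.5, 0.3, 0.25), ANCHORS, GRIDS, 416)
```

The returned tuple is exactly the "index and position" that is later used to pick the matching entry out of the prediction tensor.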

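The corresponding predicted box is the one responsible for predicting this ground truth.
Namely: the prior boxes (anchors) and the output feature map are combined to decode the actual detection box, and this decoded prediction is then regressed against the ground-truth box.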
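The decoding step can be illustrated with the standard YOLOv3 box parameterization: the center is a sigmoid offset inside the cell, and the size is the anchor scaled by an exponential. The sketch below is an assumption about how this is typically written, not the code from the repository; `decode_box` and its argument names are made up for the example.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def decode_box(raw, cell_col, cell_row, anchor_wh, grid_size, input_size):
    """Turn raw outputs (tx, ty, tw, th) into a normalized (x, y, w, h) box."""
    tx, ty, tw, th = raw
    anchor_w, anchor_h = anchor_wh               # anchor size in pixels
    # Center: offset within the cell, then normalized by the grid size.
    bx = (sigmoid(tx) + cell_col) / grid_size
    by = (sigmoid(ty) + cell_row) / grid_size
    # Size: the anchor stretched by exp(t), normalized by the input size.
    bw = anchor_w * math.exp(tw) / input_size
    bh = anchor_h * math.exp(th) / input_size
    return bx, by, bw, bh


# All-zero raw outputs give the cell center and the unchanged anchor size.
box = decode_box((0.0, 0.0, 0.0, 0.0), 6, 6, (116, 90), 13, 416)
```

With all-zero raw outputs the decoded box sits in the middle of cell (6, 6) and has exactly the anchor's size, which shows why a well-matched anchor makes the regression target small.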

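Faster R-CNN: extract features; the RPN generates boxes at 3 sizes and 3 aspect ratios; post-process them (regress against the GT?); map them onto the convolutional feature layer; then regress the anchors against the GT (at this point the anchors are effectively predicted boxes, I think).
So is this the difference between one-stage and two-stage detectors?
According to Teacher Pang:
A two-stage detector has an extra filtering step in the RPN, so fewer anchors reach the second stage.
In other words, a two-stage detector simply adds a stage dedicated to generating anchors and pre-filtering them.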

The index [5, 4, 2] should refer to the cell at row 5, column 4, and the second anchor.

pred_yolo_1 = _conv_block(x, [{'filter': 1024, 'kernel': 3, 'stride': 1, 'bnorm': True,  'leaky': True,  'layer_idx': 80}, 
                              {'filter': (3*(5+nb_class)), 'kernel': 1, 'stride': 1, 'bnorm': False, 'leaky': False, 'layer_idx': 81}], do_skip=False)

# Number of filters in the last layer: 3 × (5 + nb_class)
# Each grid cell predicts 3 bboxes; each bbox predicts x, y, w, h, a confidence score, and nb_class class scores
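To make the `3*(5+nb_class)` filter count concrete, here is a small sketch that computes it and splits one cell's channel vector back into per-anchor predictions. The helper names are hypothetical, and the assumption that the channels are laid out anchor by anchor is mine, not something the source states.

```python
def yolo_head_filters(nb_class, nb_anchor=3):
    """Channels of the final 1x1 conv: per anchor, 4 box values + 1 confidence + classes."""
    return nb_anchor * (4 + 1 + nb_class)


def split_cell_vector(vec, nb_class, nb_anchor=3):
    """Split one cell's flat channel vector into nb_anchor chunks of length 5 + nb_class."""
    step = 4 + 1 + nb_class
    assert len(vec) == nb_anchor * step
    return [vec[i * step:(i + 1) * step] for i in range(nb_anchor)]


filters_coco = yolo_head_filters(80)   # 3 * (5 + 80) = 255
filters_two = yolo_head_filters(2)     # 3 * (5 + 2) = 21
chunks = split_cell_vector(list(range(21)), nb_class=2)
```

For 80 classes this gives the familiar 255 output channels, and for the 2-class example used later each cell carries 21 values, 7 per anchor.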

true_yolo_1 = Input(shape=(None, None, len(anchors)//6, 4+1+nb_class)) 
                                         # grid_h, grid_w, nb_anchor, 5+nb_class

For example:

import numpy as np

# Placeholder values. Each innermost row is laid out as
# [x, y, w, h, c_score, class_1, class_2]
true_yolo_1 = np.array([[[[1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7]],

                         [[1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7]]],

                        [[[1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7]],

                         [[1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7],
                          [1, 2, 3, 4, 5, 6, 7]]]])
# true_yolo_1.shape == (2, 2, 3, 5 + 2)
# S × S = 2 × 2 grid; each cell holds 3 bboxes, and each bbox has 5 + 2 values
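The `len(anchors)//6` in the `true_yolo_1` input shape is easier to read with numbers. The sketch below assumes that `anchors` is a flat list of widths and heights for 9 anchors spread over 3 output scales (the values are the YOLOv3 defaults, used only for illustration): 2 numbers per anchor times 3 scales gives the divisor 6.

```python
# Assumed layout: [w0, h0, w1, h1, ...], 9 anchors in total.
anchors = [10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119,
           116, 90, 156, 198, 373, 326]

# 18 numbers / (2 numbers per anchor * 3 scales) = 3 anchors per scale.
nb_anchor = len(anchors) // 6

# Group the flat list into (w, h) pairs, then into one chunk per scale.
pairs = [(anchors[i], anchors[i + 1]) for i in range(0, len(anchors), 2)]
per_scale = [pairs[i:i + nb_anchor] for i in range(0, len(pairs), nb_anchor)]
```

Each of the three output heads therefore has a target tensor whose third axis has length 3, matching the three bboxes per cell in the example above.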

true_boxes  = Input(shape=(1, 1, 1, max_box_per_image, 4))

true_boxes = np.array([[[[[1, 2, 3, 4],
                          [1, 2, 3, 4]]]]])        
# true_boxes.shape == (1, 1, 1, 2, 4)
# At most 2 ground-truth bboxes per image; each box is described by 4 values
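The source does not say how `true_boxes` is used inside `YoloLayer`. One common use of such a list in YOLOv3-style losses is the ignore threshold: a predicted box that overlaps any ground-truth box strongly enough is not penalized as background. The sketch below illustrates that idea in plain Python under assumed conventions (boxes as center x, center y, width, height; unused slots padded with zeros; a threshold of 0.5); the function names are hypothetical.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as (center_x, center_y, width, height)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def is_ignored(pred_box, true_boxes, ignore_thresh=0.5):
    """True if the prediction overlaps some ground truth by at least ignore_thresh."""
    best = max((iou_xywh(pred_box, t) for t in true_boxes), default=0.0)
    return best >= ignore_thresh


# max_box_per_image = 2; the second slot is zero padding (an assumption).
true_boxes = [(5.0, 5.0, 4.0, 4.0), (0.0, 0.0, 0.0, 0.0)]
near = is_ignored((6.0, 5.0, 4.0, 4.0), true_boxes)    # IoU 0.6 with the first box
far = is_ignored((20.0, 20.0, 2.0, 2.0), true_boxes)   # no overlap at all
```

In a tensor implementation, the leading singleton axes of `true_boxes` would let the same comparison broadcast over every cell and anchor at once.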

loss_yolo_1 = YoloLayer(-------------)([input_image, pred_yolo_1, true_yolo_1, true_boxes])
                                       #input_image, y_pred,      y_true,      true_boxes = x
