YOLOv3 Code Explained: Object Detection with YOLOv3 (with a Detailed TensorFlow Code Walkthrough)

weixin_39574140

Tags: YOLOv3 code walkthrough

Article link: https://blog.csdn.net/weixin_39574140/article/details/113326280

YOLOv3 code walkthrough:

I. The prediction process:

1. Defining the network structure:

The network ends with three outputs: detect_1, detect_2, and detect_3.

The three scales have the shapes [1, 507 (13×13×3), 5+c], [1, 2028, 5+c], and [1, 8112, 5+c], where c is the number of classes.
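The first dimension of each shape can be checked with a little arithmetic. The sketch below assumes a 416×416 input and 3 anchor boxes per scale; the downsampling factors 32, 16, and 8 are inferred from the 13×13, 26×26, and 52×52 grids, and the helper name `predictions_per_scale` is purely illustrative.

```python
def predictions_per_scale(img_size, strides, anchors_per_scale):
    """Return the number of box predictions produced at each scale."""
    counts = []
    for stride in strides:
        grid = img_size // stride  # grid cells along one side of the feature map
        counts.append(grid * grid * anchors_per_scale)
    return counts


# Assumed values: 416x416 input, strides 32/16/8, 3 anchors per scale
counts = predictions_per_scale(416, (32, 16, 8), 3)
print(counts)       # [507, 2028, 8112]
print(sum(counts))  # 10647
```

Under these assumptions the three detection outputs together describe 10647 candidate boxes per image, each with 5+c attributes.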

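Yolo_block is a module built from ordinary convolutions (which do not change the spatial size of the feature map). It produces two results, route and inputs: route is combined with the features of the next scale, while the returned inputs is fed to the detection layer, which computes the bbox_attrs values.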

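detect_layer is the detection layer. It relies on anchor boxes, which are derived from the samples in the training dataset. Before training, the labeled boxes in the dataset are clustered to obtain a set of concrete sizes that represent the most common object sizes. During training and testing, these sizes are passed to the model as prior knowledge, which can improve its accuracy.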
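The text does not say which clustering method is used. A common choice, described in the YOLOv2 paper, is k-means over the labeled box sizes with 1 − IoU as the distance. The following is a minimal pure-Python sketch of that idea, not the author's code: the function names `iou_wh` and `kmeans_anchors`, the toy boxes, and the deterministic initialization are all assumptions made for illustration.

```python
def iou_wh(box, centroid):
    """IoU of two boxes given as (w, h), both placed at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (w, h) pairs, using highest IoU as 'nearest'; return k anchors sorted by area."""
    # Deterministic initialization for this sketch: k boxes spread over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:  # keep the old centroid if a cluster is empty
                new_centroids.append(centroids[i])
                continue
            mean_w = sum(b[0] for b in members) / len(members)
            mean_h = sum(b[1] for b in members) / len(members)
            new_centroids.append((mean_w, mean_h))
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Toy (w, h) labels in pixels, made up for the example
toy_boxes = [(10, 12), (12, 14), (11, 13), (60, 45), (64, 50),
             (58, 47), (300, 280), (320, 310), (310, 300)]
anchors = kmeans_anchors(toy_boxes, k=3)
print(anchors)
```

On a real dataset, k would be 9 for YOLOv3, and the resulting sizes would play the role of the anchor list used by the detection layer.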

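import tensorflow as tf  # assumes TensorFlow 1.x
slim = tf.contrib.slim   # TF-Slim ships with TF 1.x as tf.contrib.slim
# Anchor boxes (width, height in pixels), taken from the COCO dataset
_ANCHORS = [(10., 13.), (16., 30.), (33., 23.), (30., 61.), (62., 45.), (59., 119.), (116., 90.), (156., 198.), (373., 326.)]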

This part deserves careful study:

How is the number of anchor boxes chosen, and how does that number affect the results?

def _detection_layer(inputs, num_classes, anchors, img_size, data_format):
    """
    Produce predictions with num_anchors*(5+num_classes) channels and spatial size h*w.
    data_format is not used in this excerpt; the code below assumes NHWC.
    """
    print(inputs.get_shape())
    num_anchors = len(anchors)  # number of anchor boxes
    predictions = slim.conv2d(inputs, num_anchors * (5 + num_classes), 1, stride=1,
                              normalizer_fn=None, activation_fn=None,
                              biases_initializer=tf.zeros_initializer())
    shape = predictions.get_shape().as_list()  # [batch, H, W, C], C = num_anchors*(5+num_classes)
    print("shape", shape)  # for the three scales: [1, 13, 13, 3*(5+c)], [1, 26, 26, 3*(5+c)], [1, 52, 52, 3*(5+c)]
    grid_size = shape[1:3]  # take H and W from NHWC
    dim = grid_size[0] * grid_size[1]  # number of grid cells, H*W
    bbox_attrs = 5 + num_classes
    
    # The goal is the bbox_attrs of every anchor box at every grid cell, so first reshape to
    # [batch, num_anchors * dim, bbox_attrs], then split bbox_attrs into parts of size 2, 2, 1, num_classes.
    # Flatten H and W into dim: [batch, num_anchors * dim, 5 + num_classes]
    predictions = tf.reshape(predictions, [-1, num_anchors * dim, bbox_attrs])
    stride = (img_size[0] // grid_size[0], img_size[1] // grid_size[1])  # downsampling factor, e.g. 32 (416/13)
    anchors = [(a[0] / stride[0], a[1] / stride[1]) for a in anchors]  # rescale the anchors to grid-cell units
    # Split each prediction into box center, box size, objectness confidence, and class scores
    box_centers, box_sizes, confidence, classes = tf.split(predictions, [2, 2, 1, num_classes], axis=-1)
    
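    # Decode each part in turn. box_centers and box_sizes must be mapped back to sizes on the
    # original image; the decoded parts are concatenated again at the end.
    box_centers = tf.nn.sigmoid(box_centers)
    confidence = tf.nn.sigmoid(confidence)

    # grid_size is [H, W]; using index 0 for x and index 1 for y is only safe because the grids here are square
    grid_x = tf.range(grid_size[0], dtype=tf.float32)  # column indices 0, 1, ..., n-1; shape (13,) at the 13x13 scale
    grid_y = tf.range(grid_size[1], dtype=tf.float32)  # row indices 0, 1, ..., m-1; shape (13,) at the 13x13 scale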
    
    # Build the grid: every row of a is 0, 1, ..., n-1 (m rows in total); row i of b is filled with i
    a, b = tf.meshgrid(grid_x, grid_y)

    x_offset = tf.reshape(a, (-1, 1))
    y_offset = tf.reshape(b, (-1, 1))

    x_y_offset = tf.concat([x_offset, y_offset], axis=-1)  # shape [dim, 2]
    # Repeat each cell's (x, y) once per anchor box; [1, num_anchors] tiles dimension 0 once and dimension 1 num_anchors times
    x_y_offset = tf.reshape(tf.tile(x_y_offset, [1, num_anchors]), [1, -1, 2])
    
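    # box_centers lies in (0, 1) and x_y_offset holds the cell indices, so their sum is the
    # position in grid units (e.g. 0.1 + 4 = 4.1: an offset of 0.1 inside the cell with index 4)
    box_centers = box_centers + x_y_offset
    box_centers = box_centers * stride  # convert to pixels on the input image

    anchors = tf.tile(anchors, [dim, 1])  # one copy of the anchor list per grid cell
    box_sizes = tf.exp(box_sizes) * anchors  # box width and height in grid units
    box_sizes = box_sizes * stride  # box width and height in pixels

    detections = tf.concat([box_centers, box_sizes, confidence], axis=-1)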
    classes = tf
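The decoding steps in _detection_layer can be traced by hand for a single prediction. The sketch below is a pure-Python restatement under a few assumptions: a square 13×13 grid, a stride of 32, and the three largest anchors. The helper names `cell_of` and `decode_box` are illustrative and not part of the original code. It also shows how an index along the flattened num_anchors * dim axis maps back to a grid cell and an anchor, which follows from the reshape and tile calls above.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cell_of(flat_index, grid_w, num_anchors):
    """Map an index on the flattened num_anchors*dim axis to (row, col, anchor)."""
    cell, anchor = divmod(flat_index, num_anchors)  # anchors vary fastest
    row, col = divmod(cell, grid_w)                 # cells are stored row by row
    return row, col, anchor


def decode_box(raw, flat_index, grid_w, anchors, stride):
    """raw = (tx, ty, tw, th); returns (cx, cy, w, h) in input-image pixels."""
    tx, ty, tw, th = raw
    row, col, anchor = cell_of(flat_index, grid_w, len(anchors))
    # Anchors are first rescaled to grid units, as in the detection layer
    anchor_w = anchors[anchor][0] / stride
    anchor_h = anchors[anchor][1] / stride
    cx = (sigmoid(tx) + col) * stride   # offset inside the cell plus the cell index
    cy = (sigmoid(ty) + row) * stride
    w = math.exp(tw) * anchor_w * stride
    h = math.exp(th) * anchor_h * stride
    return cx, cy, w, h


anchors = [(116., 90.), (156., 198.), (373., 326.)]  # assumed anchors for the 13x13 scale
# Row 2, column 4, first anchor: flat index (2*13 + 4)*3 + 0 = 90
print(cell_of(90, 13, 3))                                  # (2, 4, 0)
print(decode_box((0., 0., 0., 0.), 90, 13, anchors, 32))   # (144.0, 80.0, 116.0, 90.0)
```

With all raw outputs at zero, the sigmoid places the center in the middle of the cell and the exponential leaves the anchor size unchanged, so the decoded box is simply the anchor centered on that cell.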
