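A walkthrough of the YOLO source code. It is fairly useful, but unfortunately the code is written in TensorFlow, and I would still like to get more exposure to PyTorch code.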

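YOLO is an end-to-end, real-time object detection system based on deep learning. Most object detection and recognition methods (for example, Fast R-CNN) split the task into several stages, such as region prediction and class prediction. YOLO instead folds region prediction and class prediction into a single neural network model, which allows fast detection and recognition at relatively high accuracy and makes it better suited to on-site deployment. For details, see "YOLO: Real-Time Fast Object Detection" and "YOLO Upgraded: An Analysis of YOLOv2 and YOLO9000". This article walks through a TensorFlow implementation of YOLO. The source code used here comes from hizhangp/yolo_tensorflow.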

The article is organized as follows: 1, YOLO code overview; 2, train walkthrough; 3, test overview; 4, summary.

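1 YOLO Code Overview

The source files are laid out as shown in Figure 1-1. train.py is the training code and test.py is the test code; the code in the other folders handles supporting tasks such as setting parameters, building the network, and reading data.

Figure 1-1 YOLO source code folder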

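2 train Walkthrough

Starting from main(), the code first reads the parameters, then builds YOLONet, then loads the training data, and finally runs the training.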


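2.1 Building YOLONet

YOLONet is built by the code in yolo_net.py in the yolo folder. yolo_net.py defines the YOLONet class, which contains methods for network initialization (__init__()), building the network (build_networks()), the loss function (loss_layer()), and so on.

All of the network's initialization parameters are set in the __init__() method.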

    def __init__(self, phase):
        self.weights_file = cfg.WEIGHTS_FILE  # weights file
        self.classes = cfg.CLASSES  # class names
        self.num_class = len(self.classes)  # number of classes, 20
        self.image_size = cfg.IMAGE_SIZE  # image size, 448
        self.cell_size = cfg.CELL_SIZE  # grid size, 7
        self.boxes_per_cell = cfg.BOXES_PER_CELL  # boxes each grid cell is responsible for, default 2
        self.output_size = (self.cell_size * self.cell_size) * \
            (self.num_class + self.boxes_per_cell * 5)  # output size, 7*7*(20+2*5) = 1470
        self.scale = 1.0 * self.image_size / self.cell_size  # pixels per grid cell, 448/7 = 64
        self.boundary1 = self.cell_size * self.cell_size * self.num_class  # 7*7*20
        self.boundary2 = self.boundary1 + self.cell_size * \
            self.cell_size * self.boxes_per_cell  # 7*7*20 + 7*7*2
        self.object_scale = cfg.OBJECT_SCALE  # 1
        self.noobject_scale = cfg.NOOBJECT_SCALE  # 1
        self.class_scale = cfg.CLASS_SCALE  # 2.0
        self.coord_scale = cfg.COORD_SCALE  # 5.0
        self.learning_rate = cfg.LEARNING_RATE  # LEARNING_RATE = 0.0001
        self.batch_size = cfg.BATCH_SIZE  # BATCH_SIZE = 45
        self.alpha = cfg.ALPHA  # ALPHA = 0.1
        self.disp_console = cfg.DISP_CONSOLE  # DISP_CONSOLE = False
        self.phase = phase  # train or test
        self.collection = []  # stores the network parameters
        self.offset = np.transpose(np.reshape(np.array(
            [np.arange(self.cell_size)] * self.cell_size * self.boxes_per_cell),
            (self.boxes_per_cell, self.cell_size, self.cell_size)), (1, 2, 0))  # per-cell grid offsets

        self.build_networks()
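The self.offset expression is dense, so here is a small NumPy sketch of what it produces. It uses a 3×3 grid instead of 7×7 purely for readability; the names x_offset and y_offset are illustrative and do not appear in the original code.

```python
import numpy as np

# Small values so the arrays are easy to read; the article uses 7 and 2.
cell_size = 3
boxes_per_cell = 2

# Same construction as in __init__: repeat the row [0, 1, ..., S-1],
# reshape to (B, S, S), then move the box axis to the end -> (S, S, B).
offset = np.transpose(
    np.reshape(
        np.array([np.arange(cell_size)] * cell_size * boxes_per_cell),
        (boxes_per_cell, cell_size, cell_size)),
    (1, 2, 0))

# offset[row, col, box] equals the column index of the cell.
x_offset = offset[:, :, 0]

# Swapping the two grid axes gives the row index instead. loss_layer()
# does the same thing for y with tf.transpose(offset, (0, 2, 1, 3)).
y_offset = np.transpose(offset, (1, 0, 2))[:, :, 0]

print(x_offset)
print(y_offset)
```

In other words, the array stores each cell's column index, and the row index is obtained by transposing the grid axes rather than by storing a second array.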

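The network is built by the build_networks() method. It consists of convolutional layers, pooling layers, and fully connected layers; for the detailed structure, see the source code and "YOLO: Real-Time Fast Object Detection". The network takes an input of shape [None, 448, 448, 3] and produces an output of shape [None, 1470].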
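As a quick check on where 1470 comes from, the following sketch redoes the arithmetic from __init__() with the values quoted in its comments. The segments dictionary and its key names are illustrative only; the split points are the boundary1 and boundary2 values that loss_layer() uses to slice the flat output vector.

```python
# Values quoted in the article's __init__ comments.
cell_size = 7
num_class = 20
boxes_per_cell = 2

# Each cell predicts 20 class scores plus, per box, 1 confidence and 4 coordinates.
output_size = cell_size * cell_size * (num_class + boxes_per_cell * 5)

# Split points used to slice the flat output vector.
boundary1 = cell_size * cell_size * num_class
boundary2 = boundary1 + cell_size * cell_size * boxes_per_cell

# Illustrative names for the three slices of the output.
segments = {
    "classes": (0, boundary1),
    "confidences": (boundary1, boundary2),
    "boxes": (boundary2, output_size),
}
for name, (start, stop) in segments.items():
    print(f"{name}: [{start}, {stop}) -> {stop - start} values")
```

The three slices hold 980, 98, and 392 values respectively, which add up to 1470.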

The loss function is the key part of the code. For its definition, see "YOLO: Real-Time Fast Object Detection".
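For reference, the loss in the original YOLO paper can be written as below. The notation is the paper's, not the code's: S is the grid size, B is the number of boxes per cell, the indicator with superscript obj selects the box j in cell i that is responsible for an object, C is the box confidence, and p(c) is the class probability; a hat separates the two sides of each comparison (prediction and label). The paper sets λ_coord = 5 and λ_noobj = 0.5, whereas the code discussed here reads its weights from cfg.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
         + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
    \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The first two lines correspond to coord_loss in loss_layer(), the third to object_loss, the fourth to noobject_loss, and the last to class_loss.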

The loss function is implemented by loss_layer(). The comments added to the code record the shape of each variable, as shown below.

    # Compute IoU
    def calc_iou(self, boxes1, boxes2):
        """calculate ious
        Args:
          boxes1: 5-D tensor [BATCH_SIZE, CELL_SIZE, CELL_SIZE, BOXES_PER_CELL, 4] ====> (x_center, y_center, w, h)
          boxes2: 5-D tensor [BATCH_SIZE, CELL_SIZE, CELL_SIZE, BOXES_PER_CELL, 4] ====> (x_center, y_center, w, h)
        Return:
          iou: 4-D tensor [BATCH_SIZE, CELL_SIZE, CELL_SIZE, BOXES_PER_CELL]
        """
        # Convert (x_center, y_center, w, h) to corners (x1, y1, x2, y2).
        # Note: tf.pack was renamed tf.stack in TensorFlow 1.0.
        boxes1 = tf.pack([boxes1[:, :, :, :, 0] - boxes1[:, :, :, :, 2] / 2.0,
                          boxes1[:, :, :, :, 1] - boxes1[:, :, :, :, 3] / 2.0,
                          boxes1[:, :, :, :, 0] + boxes1[:, :, :, :, 2] / 2.0,
                          boxes1[:, :, :, :, 1] + boxes1[:, :, :, :, 3] / 2.0])
        boxes1 = tf.transpose(boxes1, [1, 2, 3, 4, 0])

        boxes2 = tf.pack([boxes2[:, :, :, :, 0] - boxes2[:, :, :, :, 2] / 2.0,
                          boxes2[:, :, :, :, 1] - boxes2[:, :, :, :, 3] / 2.0,
                          boxes2[:, :, :, :, 0] + boxes2[:, :, :, :, 2] / 2.0,
                          boxes2[:, :, :, :, 1] + boxes2[:, :, :, :, 3] / 2.0])
        boxes2 = tf.transpose(boxes2, [1, 2, 3, 4, 0])

        # calculate the left up point & right down point
        lu = tf.maximum(boxes1[:, :, :, :, :2], boxes2[:, :, :, :, :2])
        rd = tf.minimum(boxes1[:, :, :, :, 2:], boxes2[:, :, :, :, 2:])

        # intersection
        intersection = tf.maximum(0.0, rd - lu)
        inter_square = intersection[:, :, :, :, 0] * intersection[:, :, :, :, 1]

        # calculate the boxs1 square and boxs2 square
        square1 = (boxes1[:, :, :, :, 2] - boxes1[:, :, :, :, 0]) * \
            (boxes1[:, :, :, :, 3] - boxes1[:, :, :, :, 1])
        square2 = (boxes2[:, :, :, :, 2] - boxes2[:, :, :, :, 0]) * \
            (boxes2[:, :, :, :, 3] - boxes2[:, :, :, :, 1])

        # guard against division by zero
        union_square = tf.maximum(square1 + square2 - inter_square, 1e-10)

        return tf.clip_by_value(inter_square / union_square, 0.0, 1.0)
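To make the steps of calc_iou() easier to follow, here is a plain-Python sketch of the same computation for a single pair of boxes. It is an illustration rather than the repository's code: the function name iou_center_format and the sample boxes are made up for this example.

```python
def iou_center_format(box1, box2):
    """IoU of two boxes given as (x_center, y_center, w, h)."""

    # Convert to corner format (x1, y1, x2, y2), as calc_iou does first.
    def corners(box):
        x, y, w, h = box
        return x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0

    ax1, ay1, ax2, ay2 = corners(box1)
    bx1, by1, bx2, by2 = corners(box2)

    # Overlap rectangle: max of the top-left corners, min of the bottom-right.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area1 = (ax2 - ax1) * (ay2 - ay1)
    area2 = (bx2 - bx1) * (by2 - by1)
    # Same guard against division by zero as in calc_iou.
    union = max(area1 + area2 - inter, 1e-10)

    # Clip to [0, 1], mirroring tf.clip_by_value.
    return min(max(inter / union, 0.0), 1.0)


unit = (0.5, 0.5, 1.0, 1.0)
print(iou_center_format(unit, unit))                  # identical boxes
print(iou_center_format(unit, (1.0, 0.5, 1.0, 1.0)))  # shifted by half a width
print(iou_center_format(unit, (3.0, 3.0, 1.0, 1.0)))  # disjoint boxes
```

The TensorFlow version applies exactly these steps element-wise over the whole [batch, cell, cell, box] grid at once.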

    # Loss function
    # idx=33, predicts is fc_32, labels has shape (45, 7, 7, 25)
    # self.loss = self.loss_layer(33, self.fc_32, self.labels)
    def loss_layer(self, idx, predicts, labels):

        # Split the network output into class scores, confidences, and boxes;
        # the output dimension is 7*7*20 + 7*7*2 + 7*7*2*4 = 1470
        # class scores, shape (45, 7, 7, 20)
        predict_classes = tf.reshape(predicts[:, :self.boundary1],
            [self.batch_size, self.cell_size, self.cell_size, self.num_class])
        # per-box confidences, shape (45, 7, 7, 2)
        predict_scales = tf.reshape(predicts[:, self.boundary1:self.boundary2],
            [self.batch_size, self.cell_size, self.cell_size, self.boxes_per_cell])
        # box coordinates and sizes, shape (45, 7, 7, 2, 4)
        predict_boxes = tf.reshape(predicts[:, self.boundary2:],
            [self.batch_size, self.cell_size, self.cell_size, self.boxes_per_cell, 4])

        # label objectness flag (1 if the cell contains an object), shape (45, 7, 7, 1)
        response = tf.reshape(labels[:, :, :, 0],
            [self.batch_size, self.cell_size, self.cell_size, 1])
        # label boxes, shape (45, 7, 7, 1, 4)
        boxes = tf.reshape(labels[:, :, :, 1:5],
            [self.batch_size, self.cell_size, self.cell_size, 1, 4])
        # label boxes tiled per predicted box and normalized by image size, shape (45, 7, 7, 2, 4)
        boxes = tf.tile(boxes, [1, 1, 1, self.boxes_per_cell, 1]) / self.image_size

        # label classes, shape (45, 7, 7, 20)
        classes = labels[:, :, :, 5:]

        # offset has shape (7, 7, 2)
        offset = tf.constant(self.offset, dtype=tf.float32)
        # shape (1, 7, 7, 2)
        offset = tf.reshape(offset,
            [1, self.cell_size, self.cell_size, self.boxes_per_cell])
        # shape (45, 7, 7, 2)
        offset = tf.tile(offset, [self.batch_size, 1, 1, 1])
        # predicted boxes converted to image-relative (x, y, w, h), shape (4, 45, 7, 7, 2)
        predict_boxes_tran = tf.pack([(predict_boxes[:, :, :, :, 0] + offset) / self.cell_size,
                                      (predict_boxes[:, :, :, :, 1] + tf.transpose(offset, (0, 2, 1, 3))) / self.cell_size,
                                      tf.square(predict_boxes[:, :, :, :, 2]),
                                      tf.square(predict_boxes[:, :, :, :, 3])])
        # shape (45, 7, 7, 2, 4)
        predict_boxes_tran = tf.transpose(predict_boxes_tran, [1, 2, 3, 4, 0])

        # shape (45, 7, 7, 2)
        iou_predict_truth = self.calc_iou(predict_boxes_tran, boxes)

        # calculate I tensor [BATCH_SIZE, CELL_SIZE, CELL_SIZE, BOXES_PER_CELL]
        # shape (45, 7, 7, 1)
        object_mask = tf.reduce_max(iou_predict_truth, 3, keep_dims=True)
        # shape (45, 7, 7, 2)
        object_mask = tf.cast((iou_predict_truth >= object_mask), tf.float32) * response
        # mask = tf.tile(response, [1, 1, 1, self.boxes_per_cell])

        # calculate no_I tensor [CELL_SIZE, CELL_SIZE, BOXES_PER_CELL]
        # shape (45, 7, 7, 2)
        noobject_mask = tf.ones_like(object_mask, dtype=tf.float32) - object_mask

        # label boxes converted to cell-relative (x, y, sqrt(w), sqrt(h)), shape (4, 45, 7, 7, 2)
        boxes_tran = tf.pack([boxes[:, :, :, :, 0] * self.cell_size - offset,
                              boxes[:, :, :, :, 1] * self.cell_size - tf.transpose(offset, (0, 2, 1, 3)),
                              tf.sqrt(boxes[:, :, :, :, 2]),
                              tf.sqrt(boxes[:, :, :, :, 3])])
        # shape (45, 7, 7, 2, 4)
        boxes_tran = tf.transpose(boxes_tran, [1, 2, 3, 4, 0])

        # class_loss
        class_loss = tf.reduce_mean(tf.reduce_sum(tf.square(response * (predict_classes - classes)),
            reduction_indices=[1, 2, 3]), name='class_loss') * self.class_scale

        # object_loss
        object_loss = tf.reduce_mean(tf.reduce_sum(tf.square(object_mask * (predict_scales - iou_predict_truth)),
            reduction_indices=[1, 2, 3]), name='object_loss') * self.object_scale

        # noobject_loss
        noobject_loss = tf.reduce_mean(tf.reduce_sum(tf.square(noobject_mask * predict_scales),
            reduction_indices=[1, 2, 3]), name='noobject_loss') * self.noobject_scale

        # coord_loss
        # shape (45, 7, 7, 2, 1)
        coord_mask = tf.expand_dims(object_mask, 4)

        # shape (45, 7, 7, 2, 4)
        boxes_delta = coord_mask * (predict_boxes - boxes_tran)
        coord_loss = tf.reduce_mean(tf.reduce_sum(tf.square(boxes_delta),
            reduction_indices=[1, 2, 3, 4]), name='coord_loss') * self.coord_scale

        tf.summary.scalar(self.phase + '/class_loss', class_loss)
        tf.summary.scalar(self.phase + '/object_loss', object_loss)
        tf.summary.scalar(self.phase + '/noobject_loss', noobject_loss)
        tf.summary.scalar(self.phase + '/coord_loss', coord_loss)

        tf.summary.histogram(self.phase + '/boxes_delta_x', boxes_delta[:, :, :, :, 0])
        tf.summary.histogram(self.phase + '/boxes_delta_y', boxes_delta[:, :, :, :, 1])
        tf.summary.histogram(self.phase + '/boxes_delta_w', boxes_delta[:, :, :, :, 2])
        tf.summary.histogram(self.phase + '/boxes_delta_h', boxes_delta[:, :, :, :, 3])
    <span class="n">tf</span><span class="o">.</span><span class="n">summary</span><span class="o">.</span><span class="n">histogram</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">phase</span> <span class="o">+</span> <span class="s1">'/iou'</span><span class="p">,</span> <span class="n">iou_predict_truth</span><span class="p">)</span>

    <span class="k">return</span> <span class="n">class_loss</span> <span class="o">+</span> <span class="n">object_loss</span> <span class="o">+</span> <span class="n">noobject_loss</span> <span class="o">+</span> <span class="n">coord_loss</span></code></pre></div><br><p>2.2 Reading the data</p><p>The data is read by pascal_voc.py in the utils folder.<br></p><p>2.3 Training</p><p>Model training is implemented in the train() method. Once the initialization parameters (the __init__() code below) are understood, the overall structure of the training code is clear. One point worth noting is that an exponential moving average (EMA) of the variables is maintained during training to improve training performance; see the code comments for details. Also, before running train.py, consider reducing batch_size: the default batch size is 45, and running with it unchanged froze my machine on the first attempt.</p><div class="highlight"><pre><code class="language-python"><span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">net</span><span class="p">,</span> <span class="n">data</span><span class="p">):</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">net</span> <span class="o">=</span> <span class="n">net</span>  <span class="c1"># network</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">data</span> <span class="o">=</span> <span class="n">data</span>  <span class="c1"># data</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">weights_file</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">WEIGHTS_FILE</span>  <span class="c1"># network weights file</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">max_iter</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">MAX_ITER</span>  <span class="c1"># maximum number of iterations</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">initial_learning_rate</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">LEARNING_RATE</span>  <span class="c1"># initial learning rate, LEARNING_RATE = 0.0001</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">decay_steps</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">DECAY_STEPS</span>  <span class="c1"># decay steps, DECAY_STEPS = 30000</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">decay_rate</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">DECAY_RATE</span>  <span class="c1"># decay rate, DECAY_RATE = 0.1</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">staircase</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">STAIRCASE</span>  <span class="c1"># STAIRCASE = True: decay in discrete steps (global_step / decay_steps is an integer division)</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">summary_iter</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">SUMMARY_ITER</span>  <span class="c1"># SUMMARY_ITER = 10</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">save_iter</span> <span class="o">=</span> <span class="n">cfg</span><span class="o">.</span><span class="n">SAVE_ITER</span>  <span class="c1"># number of iterations between saves, SAVE_ITER = 1000</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">output_dir</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">cfg</span><span class="o">.</span><span class="n">OUTPUT_DIR</span><span class="p">,</span>
        <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="o">.</span><span class="n">now</span><span class="p">()</span><span class="o">.</span><span class="n">strftime</span><span class="p">(</span><span class="s1">'%Y_%m_</span><span class="si">%d</span><span class="s1">_%H_%M'</span><span class="p">))</span>  <span class="c1"># output directory</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">exists</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">output_dir</span><span class="p">):</span>
        <span class="n">os</span><span class="o">.</span><span class="n">makedirs</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">output_dir</span><span class="p">)</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">save_cfg</span><span class="p">()</span>

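    <span class="c1"># tf.get_variable differs from tf.Variable in that it checks for existing variables:</span>
    <span class="c1"># if a variable with the same name already exists and has not been set up for sharing,</span>
    <span class="c1"># TensorFlow raises an error when the second variable with that name is requested.</span>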

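    <span class="bp">self</span><span class="o">.</span><span class="n">global_step</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">get_variable</span><span class="p">(</span><span class="s1">'global_step'</span><span class="p">,</span> <span class="p">[],</span>
        <span class="n">initializer</span><span class="o">=</span><span class="n">tf</span><span class="o">.</span><span class="n">constant_initializer</span><span class="p">(</span><span class="mi">0</span><span class="p">),</span> <span class="n">trainable</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>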

    <span class="c1"># learning-rate decay</span>
    <span class="c1">#tf.train.exponential_decay(learning_rate, global_step, </span>
    <span class="c1">#                            decay_steps, decay_rate, staircase=False, name=None)</span>
    <span class="c1">#Applies exponential decay to the learning rate.</span>
    <span class="c1">#decayed_learning_rate = learning_rate *decay_rate ^ (global_step / decay_steps)</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">learning_rate</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">exponential_decay</span><span class="p">(</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">initial_learning_rate</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">global_step</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">decay_steps</span><span class="p">,</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">decay_rate</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">staircase</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s1">'learning_rate'</span><span class="p">)</span>
    
    <span class="c1"># optimizer</span>
    <span class="c1">#self.optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.net.loss, global_step=self.global_step)</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">optimizer</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">GradientDescentOptimizer</span><span class="p">(</span>
        <span class="n">learning_rate</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">learning_rate</span><span class="p">)</span><span class="o">.</span><span class="n">minimize</span><span class="p">(</span>
          <span class="bp">self</span><span class="o">.</span><span class="n">net</span><span class="o">.</span><span class="n">loss</span><span class="p">,</span><span class="n">global_step</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">global_step</span><span class="p">)</span>
    <span class="c1">#</span>

    
    
    <span class="c1"># Exponential moving average: some training algorithms, such as GradientDescent and Momentum,</span>
    <span class="c1">#often benefit from maintaining a moving average of variables during optimization.</span>
    <span class="c1">#Using the moving averages for evaluations often improve results significantly.</span>
    <span class="c1">#An exponential moving average (EMA) is a type of moving average that is similar to </span>
    <span class="c1">#a simple moving average, except that more weight is given to the latest data</span>
    <span class="c1">#class tf.train.ExponentialMovingAverage</span>
    <span class="c1">#Maintains moving averages of variables by employing an exponential decay.</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">ema</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">ExponentialMovingAverage</span><span class="p">(</span><span class="n">decay</span><span class="o">=</span><span class="mf">0.9999</span><span class="p">)</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">averages_op</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">ema</span><span class="o">.</span><span class="nb">apply</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">trainable_variables</span><span class="p">())</span>
    
    <span class="c1"># Create an op that will update the moving averages after each training</span>
    <span class="c1"># step.  This is what we will use in place of the usual training op.</span>
    <span class="k">with</span> <span class="n">tf</span><span class="o">.</span><span class="n">control_dependencies</span><span class="p">([</span><span class="bp">self</span><span class="o">.</span><span class="n">optimizer</span><span class="p">]):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">train_op</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">averages_op</span><span class="p">)</span>

    <span class="c1">#summary</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">summary_op</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">summary</span><span class="o">.</span><span class="n">merge_all</span><span class="p">()</span>
    <span class="c1">#saver</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">saver</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">train</span><span class="o">.</span><span class="n">Saver</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">net</span><span class="o">.</span><span class="n">collection</span><span class="p">,</span> <span class="n">max_to_keep</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
    <span class="c1">#writer</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">writer</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">summary</span><span class="o">.</span><span class="n">FileWriter</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">output_dir</span><span class="p">,</span> <span class="n">flush_secs</span><span class="o">=</span><span class="mi">60</span><span class="p">)</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">ckpt_file</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">output_dir</span><span class="p">,</span> <span class="s1">'save.ckpt'</span><span class="p">)</span>

    <span class="bp">self</span><span class="o">.</span><span class="n">sess</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span>
    <span class="c1"># On a TensorFlow version that has not been upgraded, the call below raises an error;</span>
    <span class="c1"># use self.sess.run(tf.initialize_all_variables()) instead to initialize the variables.</span>
    <span class="bp">self</span><span class="o">.</span><span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">global_variables_initializer</span><span class="p">())</span>

    <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">weights_file</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="nb">print</span><span class="p">(</span><span class="s1">'Restoring weights from: '</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">weights_file</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">saver</span><span class="o">.</span><span class="n">restore</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">sess</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">weights_file</span><span class="p">)</span>

    <span class="bp">self</span><span class="o">.</span><span class="n">writer</span><span class="o">.</span><span class="n">add_graph</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">sess</span><span class="o">.</span><span class="n">graph</span><span class="p">)</span></code></pre></div><p>3 Overview of test</p><p>test.py loads the trained network weights, detects objects, and draws their locations. The code is similar to the training part and is skipped here. To run the test, first download the model trained by the original author, <a href="https://link.zhihu.com/?target=https%3A//drive.google.com/file/d/0B2JbaJSrWLpza08yS2FSUnV2dlE/view%3Fusp%3Dsharing" class=" wrap external" target="_blank" rel="nofollow noreferrer" data-za-detail-view-id="1043">YOLO_small</a> (accessing the link appears to require a VPN). Second, the source code contains a small bug and raises an error when run as-is. The line</p><div class="highlight"><pre><code class="language-python"><span class="n">net_output</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">sess</span><span class="o">.</span><span class="n">run</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">net</span><span class="o">.</span><span class="n">fc_32</span><span class="p">,</span> <span class="n">feed_dict</span><span class="o">=</span><span class="p">{</span><span class="bp">self</span><span class="o">.</span><span class="n">net</span><span class="o">.</span><span class="n">images</span><span class="p">:</span> <span class="n">inputs</span><span class="p">})</span></code></pre></div><p>needs to be changed to</p><div class="highlight"><pre><code class="language-python">net_output = self.sess.run(self.net.fc_32, feed_dict={self.net.x: inputs})</code></pre></div>

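The result is shown in Figure 3-1. YOLO successfully detects the person and the dog but fails to detect the horse. In later work the authors improved YOLO so that it can recognize more categories; see "YOLO Upgraded: An Analysis of YOLOv2 and YOLO9000".

Figure 3-1 YOLO detection result 1
A different image gives the result shown in Figure 3-2:

Figure 3-2 YOLO detection result 2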

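4 Summary

YOLO is an end-to-end, real-time object detection system based on deep learning. Its main feature is very high speed, and it still has room for further gains in accuracy. This article walked through the tensorflow implementation of YOLO; the code is straightforward once the paper is understood. The tensorflow topics involved are the following:

First, the difference between tf.get_variable and tf.Variable. The former checks for existing variables: if a variable with the same name already exists and has not been set up for sharing, TensorFlow raises an error when the second variable with that name is requested.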
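
The difference can be illustrated with a toy registry in plain Python. This is only a sketch of the behaviour described above, not TensorFlow code: the class `ToyVariableStore`, its methods, and the `reuse` flag as modelled here are hypothetical names chosen for illustration.

```python
class ToyVariableStore:
    """Toy stand-in for a variable registry; NOT TensorFlow code."""

    def __init__(self):
        self._vars = {}

    def variable(self, name, value):
        # Mimics tf.Variable: always creates a new entry, renaming on a clash.
        unique, suffix = name, 0
        while unique in self._vars:
            suffix += 1
            unique = "%s_%d" % (name, suffix)
        self._vars[unique] = value
        return unique

    def get_variable(self, name, value=None, reuse=False):
        # Mimics tf.get_variable: a name clash is an error unless sharing is requested.
        if name in self._vars:
            if not reuse:
                raise ValueError("Variable %s already exists" % name)
            return name
        if reuse:
            raise ValueError("Variable %s does not exist" % name)
        self._vars[name] = value
        return name


store = ToyVariableStore()
first = store.variable("w", 0.0)    # stored under the requested name
second = store.variable("w", 1.0)   # silently stored under a new name
step = store.get_variable("global_step", 0)
shared = store.get_variable("global_step", reuse=True)  # returns the existing entry
try:
    store.get_variable("global_step", 0)  # same name, sharing not requested
    clash_detected = False
except ValueError:
    clash_detected = True
```

In this toy model, asking twice for the same name without sharing fails loudly, which is the reason a unique counter such as global_step is a natural fit for the checked variant.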

Second, the implementation of learning-rate decay.

tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)
# decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)
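
The decay formula can be checked with a few lines of plain Python. The function below is a sketch of the formula only, not TensorFlow's implementation; the sample values are the ones quoted in the config comments of this article (LEARNING_RATE = 0.0001, DECAY_STEPS = 30000, DECAY_RATE = 0.1), and Python 3 division semantics are assumed.

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Plain-Python version of
    learning_rate * decay_rate ** (global_step / decay_steps)."""
    if staircase:
        # Integer division: the exponent only changes every decay_steps steps.
        exponent = global_step // decay_steps
    else:
        # True division: the rate decays smoothly at every step.
        exponent = global_step / decay_steps
    return learning_rate * decay_rate ** exponent


# Print the staircase schedule at a few sample steps.
for step in (0, 15000, 30000, 60000):
    print(step, exponential_decay(0.0001, step, 30000, 0.1, staircase=True))
```

With staircase=True the rate stays at its initial value until step 30000 and is then multiplied by 0.1 once per 30000 steps; with staircase=False it shrinks a little at every step.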

Third, using an exponential moving average (EMA) to improve the results of gradient-descent training.

self.ema = tf.train.ExponentialMovingAverage(decay=0.9999)
self.averages_op = self.ema.apply(tf.trainable_variables())

with tf.control_dependencies([self.optimizer]):
    self.train_op = tf.group(self.averages_op)
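
The update rule behind the moving average is shadow = decay * shadow + (1 - decay) * value. The scalar sketch below illustrates it in plain Python; the class `ToyEMA` and its `update` method are hypothetical names, and a decay of 0.9 is used instead of 0.9999 only so that the effect is visible after a few steps.

```python
class ToyEMA:
    """Toy scalar exponential moving average; NOT TensorFlow code."""

    def __init__(self, decay):
        self.decay = decay
        self.shadow = None

    def update(self, value):
        if self.shadow is None:
            # Start the shadow copy at the first observed value.
            self.shadow = value
        else:
            # Keep most of the old average and mix in a little of the new value.
            self.shadow = self.decay * self.shadow + (1.0 - self.decay) * value
        return self.shadow


ema = ToyEMA(decay=0.9)
history = [ema.update(w) for w in (1.0, 2.0, 3.0)]
print(history)
```

The closer the decay is to 1, the more slowly the shadow value follows the raw value, which smooths out the noise of individual gradient steps.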

Fourth, the tf.pack() function (renamed tf.stack in TensorFlow 1.0).

tf.pack(values, name='pack')
# This function is equivalent to np.asarray
tf.pack([x, y, z]) = np.asarray([x, y, z])
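
The NumPy side of this equivalence can be tried directly. The sketch below uses only NumPy; the arrays x, y and z are made-up sample data with identical shapes.

```python
import numpy as np

# Three sample vectors of the same shape.
x = np.array([1, 4])
y = np.array([2, 5])
z = np.array([3, 6])

# Packing a list of same-shaped arrays adds a new leading axis.
packed = np.asarray([x, y, z])
stacked = np.stack([x, y, z], axis=0)

print(packed.shape)
print(packed)
```

Both calls turn three arrays of shape (2,) into one array of shape (3, 2).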

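Fifth, the tf.tile() function. It replicates the input along each dimension; multiples gives the number of copies per dimension.

tf.tile(input, multiples, name=None)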
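
The tiling behaviour can be previewed with NumPy's np.tile, which follows the same idea when one repetition count is given per dimension. This is a NumPy sketch with made-up sample data, not the TensorFlow call itself.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

# One repetition count per dimension: (rows, columns).
wide = np.tile(a, (1, 2))   # repeat twice along the second dimension
tall = np.tile(a, (2, 1))   # repeat twice along the first dimension

print(wide)
print(tall)
```

Here wide has shape (2, 4) and tall has shape (4, 2).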