Paper Reading: YOLOv2 / YOLO9000: Better, Faster, Stronger

1. Paper Overview

2. Location Prediction

3. Converting the 26×26 Feature Map to 13×13

4. Performance Comparison

5. Darknet-19

6. YOLO9000 Best and Worst Classes on ImageNet

7. The YOLOv2 Loss Function

References

1. Paper Overview

YOLOv2 is an improved version built on top of YOLO, and it feels like a complete refresh. It adopts several modules that have worked well in other networks, such as batch normalization, anchors in the style of Faster R-CNN's RPN (with anchors, YOLO's mAP drops slightly but recall improves a lot), and SSD-style multi-scale feature fusion. It also introduces some fairly novel pieces of its own: a high-resolution classifier (the classifier is fine-tuned at high resolution before being transferred to the detector), a new backbone (Darknet-19, both fast and accurate), anchor aspect ratios obtained by clustering the training boxes, location prediction that regresses the center point as an offset within its grid cell (between 0 and 1), and multi-scale training (at test time this single multi-scale-trained model can be evaluated at multiple input sizes).
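As a concrete illustration of the anchor-clustering step mentioned above, here is a minimal sketch (my own code, not the paper's) of k-means over ground-truth box sizes with the distance d(box, centroid) = 1 − IoU; the function names, and the assumption that boxes are given as normalized (w, h) pairs, are mine.

```python
import numpy as np

def iou_wh(boxes, centroids):
    """IoU between (w, h) pairs, treating all boxes as if they shared one center.
    boxes: (N, 2), centroids: (k, 2) -> IoU matrix of shape (N, k)."""
    inter = np.minimum(boxes[:, None, 0], centroids[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + (centroids[:, 0] * centroids[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=5, iters=100, seed=0):
    """k-means with distance d = 1 - IoU: assign each box to the centroid it
    overlaps most; returns k (w, h) anchor priors."""
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)   # max IoU == min (1 - IoU)
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else centroids[i] for i in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids

# Usage: boxes = np.array([...])  # normalized (w, h) of all ground-truth boxes
# anchors = kmeans_anchors(boxes, k=5)
```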

Just as important is the YOLO9000 model, obtained by extending YOLOv2. The model structure is basically unchanged; the key idea is to jointly train on a large classification dataset and a smaller detection dataset (a WordTree hierarchy built from WordNet resolves the problem that the class labels are not mutually exclusive). For a classification sample only the classification loss is back-propagated; for a detection sample both the classification loss and the regression loss are back-propagated.

We propose a new method to harness the large amount of classification data we already have and use it to expand the scope of current detection systems. Our method uses a hierarchical view of object classification that allows us to combine distinct datasets together. We also propose a joint training algorithm that allows us to train object detectors on both detection and classification data. Our method leverages labeled detection images to learn to precisely localize objects while it uses classification images to increase its vocabulary and robustness.

Using this method we train YOLO9000, a real-time object detector that can detect over 9000 different object categories. First we improve upon the base YOLO detection system to produce YOLOv2, a state-of-the-art, real-time detector. Then we use our dataset combination method and joint training algorithm to train a model on more than 9000 classes from ImageNet as well as detection data from COCO.
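The joint-training rule described above boils down to masking the loss by sample type. The snippet below is only a rough sketch under my own assumptions (the `Sample` container and the two loss callables are hypothetical, not the Darknet implementation): detection images contribute the full YOLOv2 loss, classification-only images contribute only the classification part.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Sample:
    image: object                      # one training image
    boxes: Optional[List] = None       # ground-truth boxes; None for classification-only data
    label: Optional[int] = None        # image-level class label for classification-only data

def joint_loss(sample: Sample, pred,
               detection_loss: Callable, classification_loss: Callable):
    """Detection images (e.g. COCO) back-propagate localization + objectness + class loss;
    classification images (e.g. ImageNet) back-propagate only the classification term."""
    if sample.boxes is not None:
        return detection_loss(pred, sample.boxes)
    return classification_loss(pred, sample.label)
```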

2. Location Prediction

The network predicts 5 bounding boxes at each cell in the output feature map. The network predicts 5 coordinates for each bounding box, tx, ty, tw, th, and to. If the cell is offset from the top left corner of the image by (cx, cy) and the bounding box prior has width and height pw, ph, then the predictions correspond to:

bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw · e^tw
bh = ph · e^th
Pr(object) · IOU(b, object) = σ(to)

[Figure: bounding boxes with dimension priors and location prediction]
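A minimal sketch of decoding one predicted box from the raw outputs tx, ty, tw, th, to according to the formulas above; the function name and argument layout are mine, and all quantities are in grid-cell units.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, to, cx, cy, pw, ph):
    """(cx, cy): offset of the cell from the image's top-left corner;
    (pw, ph): width/height of the bounding-box prior (anchor)."""
    bx = sigmoid(tx) + cx        # sigmoid keeps the center inside the responsible cell
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)       # box size is the prior scaled by e^t
    bh = ph * math.exp(th)
    conf = sigmoid(to)           # predicted Pr(object) * IOU(b, object)
    return bx, by, bw, bh, conf

# e.g. the prior of cell (6, 4): decode_box(0.2, -0.1, 0.3, 0.5, 1.2, 6, 4, 3.1, 4.0)
```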

3. Converting the 26×26 Feature Map to 13×13

[Figure: the passthrough layer reorganizing the 26×26 feature map into 13×13]
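The figure above corresponds to the passthrough ("reorg") layer. Below is a minimal numpy sketch of the idea (assuming NCHW tensors; the channel ordering inside the 2×2 blocks is my choice, not necessarily Darknet's): each 2×2 spatial block of the 26×26 map is stacked into the channel dimension, and the result is concatenated with the 13×13 map.

```python
import numpy as np

def reorg(x, stride=2):
    """(N, C, H, W) -> (N, C*stride*stride, H//stride, W//stride)."""
    n, c, h, w = x.shape
    x = x.reshape(n, c, h // stride, stride, w // stride, stride)
    x = x.transpose(0, 3, 5, 1, 2, 4)          # move each 2x2 spatial block into channels
    return x.reshape(n, c * stride * stride, h // stride, w // stride)

fine_grained = np.zeros((1, 512, 26, 26), dtype=np.float32)   # earlier 26x26x512 layer
coarse = np.zeros((1, 1024, 13, 13), dtype=np.float32)        # final 13x13x1024 map
merged = np.concatenate([reorg(fine_grained), coarse], axis=1)
print(merged.shape)   # (1, 3072, 13, 13): 26x26x512 becomes 13x13x2048, then concat
```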

4. Performance Comparison

[Figure: accuracy/speed comparison of YOLOv2 with other detectors]

5. Darknet-19

[Figure: the Darknet-19 architecture]

We propose a new classification model to be used as the base of YOLOv2. Our model builds off of prior work on network design as well as common knowledge in the field. Similar to the VGG models we use mostly 3 × 3 filters and double the number of channels after every pooling step [17]. Following the work on Network in Network (NIN) we use global average pooling to make predictions as well as 1 × 1 filters to compress the feature representation between 3 × 3 convolutions [9]. We use batch normalization to stabilize training, speed up convergence, and regularize the model [7].

Our final model, called Darknet-19, has 19 convolutional layers and 5 maxpooling layers. Darknet-19 only requires 5.58 billion operations to process an image yet achieves 72.9% top-1 accuracy and 91.2% top-5 accuracy on ImageNet (fast and accurate).
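A small PyTorch sketch (hypothetical, not the original Darknet code) of the building block the quoted passage implies: 3×3 convolution, batch normalization, leaky ReLU, with 1×1 convolutions compressing the channels between 3×3 layers.

```python
import torch
import torch.nn as nn

def conv_bn_leaky(in_ch, out_ch, kernel):
    """3x3 or 1x1 convolution followed by batch norm and leaky ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel, padding=kernel // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )

# A bottleneck in the spirit of the deeper Darknet-19 stages: 512 -> 256 -> 512.
stage = nn.Sequential(
    conv_bn_leaky(512, 512, 3),
    conv_bn_leaky(512, 256, 1),   # 1x1 compresses the feature representation
    conv_bn_leaky(256, 512, 3),
)
x = torch.randn(1, 512, 26, 26)
print(stage(x).shape)   # torch.Size([1, 512, 26, 26])
```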

6. YOLO9000 Best and Worst Classes on ImageNet

[Figure: YOLO9000 best and worst classes on ImageNet]

7. The YOLOv2 Loss Function

The YOLOv2 paper does not state the loss function explicitly, and there is relatively little material about it online; a few blog posts cover parts of it, and the loss they present is summarized from the source code.

[Figure: the YOLOv2 loss function as summarized from the source code]

A walkthrough of this loss function: https://zhuanlan.zhihu.com/p/35325884
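Since the paper never writes the loss down, the following is only a heavily simplified sketch of the terms those blog posts attribute to YOLOv2 (coordinate loss for the anchor responsible for each object, objectness loss for all anchors, class loss for responsible anchors). The tensor shapes, masks, and weights are my assumptions; the actual scales live in the Darknet .cfg file.

```python
import torch
import torch.nn.functional as F

def yolov2_loss(pred_xywh, pred_conf, pred_cls,      # (B,A,H,W,4), (B,A,H,W), (B,A,H,W,C)
                tgt_xywh, tgt_conf, tgt_cls,         # matched targets, same shapes
                obj_mask, noobj_mask,                # (B,A,H,W) bool: anchor owns / lacks an object
                l_coord=1.0, l_noobj=1.0):           # placeholder weights, not the official scales
    coord = l_coord * F.mse_loss(pred_xywh[obj_mask], tgt_xywh[obj_mask], reduction="sum")
    obj   = F.mse_loss(pred_conf[obj_mask], tgt_conf[obj_mask], reduction="sum")
    noobj = l_noobj * F.mse_loss(pred_conf[noobj_mask],
                                 torch.zeros_like(pred_conf[noobj_mask]), reduction="sum")
    cls   = F.mse_loss(pred_cls[obj_mask], tgt_cls[obj_mask], reduction="sum")
    return coord + obj + noobj + cls
```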

References

1. Object Detection | YOLOv2 Principles and Implementation (with YOLOv3)

2. YOLO v2 Algorithm Explained in Detail

3. Revisiting YOLO v2
