Yolo-V2

*Major*

Article link: https://blog.csdn.net/qq_41375318/article/details/106083187

1. Part 1: Paper overview
2. Part 2: Close reading of the paper
3. Part 3: Code implementation
4. Part 4: Open questions

Yolo-V2-Model (PyTorch version)

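YOLO9000: Better, Faster, Stronger
An object detection algorithm based on convolutional neural networks
Authors: Joseph Redmon, Ali Farhadi
Affiliation: University of Washington
Venue and date: IEEE CVPR 2017 (arXiv preprint, December 2016)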

paper

Part 1: Paper overview

Abstract

We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. Using a novel, multi-scale training method the same YOLOv2 model can run at varying sizes, offering an easy tradeoff between speed and accuracy. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster RCNN with ResNet and SSD while still running significantly faster. Finally we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don’t have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes; it predicts detections for more than 9000 different object categories. And it still runs in real-time.

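Summary

The paper introduces YOLO9000, a state-of-the-art, real-time object detection system that can detect more than 9000 object categories.

First, it proposes a set of improvements to the YOLO detection method, some novel and some drawn from prior work. The improved model, YOLOv2, is state of the art on the standard PASCAL VOC and COCO detection tasks.

With a novel multi-scale training method, the same YOLOv2 model can run on images of different sizes, which gives a trade-off between speed and accuracy. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS it gets 78.6 mAP, outperforming state-of-the-art methods such as Faster R-CNN with ResNet and SSD while still running significantly faster.

Finally, it proposes a method for jointly training on object detection and classification. With this method, YOLO9000 is trained simultaneously on the COCO detection dataset and the ImageNet classification dataset. Joint training lets YOLO9000 predict detections for object classes that have no labelled detection data.

The approach is validated on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set even though it has detection data for only 44 of the 200 classes. On the 156 classes that are not in COCO, it gets 16.0 mAP. YOLO9000 is not limited to those 200 classes: it predicts detections for more than 9000 object categories and still runs in real time.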

Main improvements

YOLOv1 localizes objects inaccurately, and its recall is lower than that of region-proposal-based methods.

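1. Batch Normalization
A new network, Darknet-19, is designed with BN layers added. Training converges faster, and BN acts as an additional regularizer, so the dropout layers of the earlier network can be removed.
This alone improves mAP by about 2%.

V2 drops Dropout and adds Batch Normalization after every convolution.
The input of every layer is normalized, which makes convergence easier.
With Batch Normalization the network gains about 2% mAP.
From today's point of view, Batch Normalization has become a standard component of such networks.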
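
To make "the input of every layer is normalized" concrete, here is a minimal pure-Python sketch of the training-mode batch normalization computation for a single channel. The function name `batch_norm`, the toy activations, and the default `gamma`, `beta`, and `eps` values are assumptions chosen for illustration; they are not taken from the paper or from this post's code.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations over a mini-batch (training mode).

    gamma and beta stand in for the learnable scale and shift; they are
    fixed here for illustration only.
    """
    mean = sum(values) / len(values)
    # Biased (population) variance over the mini-batch.
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


# Hypothetical activations of one channel across a mini-batch of four samples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)

norm_mean = sum(normalized) / len(normalized)
norm_var = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)
print(normalized, norm_mean, norm_var)
```

After normalization the batch has mean close to 0 and variance close to 1, whatever the scale of the incoming activations. In PyTorch the same step is handled by `torch.nn.BatchNorm2d`, which normalizes per channel and also keeps running statistics for inference.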

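2. High Resolution Classifier
The original YOLO pretrains its classifier with 224×224 input and switches to 448×448 input for detection, so when moving from the classification model to the detection model the network also has to adapt to the change in image resolution.
YOLOv2 splits pretraining into two steps: the network is first trained from scratch with 224×224 input, then the input is changed to 448×448 and training continues (the paper does this for 10 epochs on ImageNet). Only after that is the network fine-tuned on the detection dataset. This gives an improvement of almost 4% mAP.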

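3. Convolutional With Anchor Boxes

Borrowing the anchor idea from Faster R-CNN, the anchors are generated by running k-means++ clustering over all ground-truth boxes (the paper itself describes standard k-means), with 1 - IoU as the distance.

Introducing anchor boxes increases the number of predicted boxes (13×13×n, where n is the number of anchors per grid cell).
Unlike the Faster R-CNN family, the priors are not hand-specified with fixed aspect ratios.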
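
The box count can be checked with a few lines of arithmetic. This is a small sketch, assuming the values reported in the YOLO papers rather than anything stated in this post: YOLOv1 uses a 448×448 input with a 7×7 grid and 2 boxes per cell, while YOLOv2 uses a 416×416 input, a total stride of 32, and n = 5 anchors. The function name `num_predicted_boxes` is hypothetical.

```python
def num_predicted_boxes(input_size: int, stride: int, boxes_per_cell: int) -> int:
    """Number of boxes predicted for a square input on a regular grid."""
    grid = input_size // stride  # cells along one side of the output grid
    return grid * grid * boxes_per_cell


# YOLOv1: 448 / 64 = 7 cells per side, 2 boxes per cell.
yolov1_boxes = num_predicted_boxes(448, 64, 2)

# YOLOv2: 416 / 32 = 13 cells per side, n = 5 anchors per cell (assumed).
yolov2_boxes = num_predicted_boxes(416, 32, 5)

print(yolov1_boxes, yolov2_boxes)  # 98 845
```

With these assumed settings the number of candidate boxes grows from 98 to 845, which is one reason anchor boxes help recall.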

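Question: which boxes is the IoU computed between? Hand-picked priors and ground-truth boxes? No. Only the ground-truth boxes are used: the IoU is computed between each ground-truth box and a cluster centroid, comparing width and height only, and the priors come out of the clustering as the final centroids.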
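
The clustering step can be sketched in pure Python as follows. This is an illustrative sketch, not the author's or Darknet's implementation: the names `wh_iou` and `kmeans_anchors` and the toy box sizes are hypothetical, and the centroids are initialized by plain random sampling rather than k-means++ seeding. Boxes are (width, height) pairs, and the IoU is computed as if both boxes shared the same top-left corner, so only shape matters.

```python
import random


def wh_iou(box, centroid):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster ground-truth box sizes with distance d = 1 - IoU."""
    rng = random.Random(seed)
    # Assumption: plain random initialization, not k-means++ seeding.
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Smallest 1 - IoU means the best-overlapping centroid.
            best = min(range(k), key=lambda i: 1.0 - wh_iou(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments no longer change
        centroids = new_centroids
    return sorted(centroids)


# Hypothetical ground-truth sizes: three small boxes and three tall boxes.
boxes = [(10, 12), (11, 13), (12, 11), (50, 100), (55, 95), (48, 105)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

On this toy data the two centroids settle on the mean size of the small boxes and the mean size of the tall boxes; those centroids are what would be used as anchor priors. Using 1 - IoU instead of Euclidean distance keeps large boxes from dominating the error.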