YOLOv4

Paper: YOLOv4: Optimal Speed and Accuracy of Object Detection

GitHub: https://github.com/AlexeyAB/darknet

 

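The paper gathers a range of accuracy-improving tricks and integrates them into YOLOv3 to obtain YOLOv4. The final model reaches 43.5% AP on COCO at about 65 FPS on a Tesla V100, an excellent combination of speed and accuracy.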

 

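Main contributions:

  1. An efficient and fast object detection framework, YOLOv4.
  2. An analysis and verification of how Bag-of-Freebies and Bag-of-Specials methods affect detector training and inference.
  3. Modifications to existing methods so that YOLOv4 can be trained on a single GPU, which greatly lowers the barrier to training it.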

 

Components of an object detector:

Input:

Image, Patches, Image Pyramid

Backbones:

VGG16, ResNet-50, SpineNet, EfficientNet-B0/B7, CSPResNeXt50, CSPDarknet53

Neck:
• Additional blocks: SPP, ASPP, RFB, SAM
• Path-aggregation blocks: FPN, PAN, NAS-FPN, Fully-connected FPN, BiFPN, ASFF, SFAM

Heads:
• Dense Prediction (one-stage): RPN, SSD, YOLO, RetinaNet (anchor based); CornerNet, CenterNet, MatrixNet, FCOS (anchor free)
• Sparse Prediction (two-stage): Faster R-CNN, R-FCN, Mask R-CNN (anchor based); RepPoints (anchor free)

 

Bag of freebies:

Improvements that change only the training strategy, or add only training cost without adding any inference cost, are called bag of freebies.

We call these methods that only change the training strategy or only increase the training cost as “bag of freebies.”

The techniques covered include:

(1) Data augmentation

brightness, contrast, hue, saturation, noise, random scaling, cropping, flipping, rotating, CutOut, MixUp, CutMix
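
As an illustration of the last two augmentations in the list, here is a minimal pure-Python sketch of MixUp and CutMix on small grayscale images stored as lists of rows. It uses classification-style label mixing; how box annotations are merged for detection is not covered. The function names, the caller-supplied `lam` and the caller-supplied `box` are assumptions made for this example (in practice both are sampled randomly), and this is not the darknet implementation.

```python
def mix_labels(label_a, label_b, lam):
    """Blend two one-hot label vectors with weight lam on label_a."""
    return [lam * ya + (1.0 - lam) * yb for ya, yb in zip(label_a, label_b)]


def mixup(img_a, img_b, label_a, label_b, lam):
    """MixUp: pixel-wise blend of two same-size images (lists of rows)."""
    mixed = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
    return mixed, mix_labels(label_a, label_b, lam)


def cutmix(img_a, img_b, label_a, label_b, box):
    """CutMix: paste region box=(top, left, bottom, right) of img_b into img_a."""
    top, left, bottom, right = box
    height, width = len(img_a), len(img_a[0])
    mixed = [row[:] for row in img_a]  # copy so img_a is not modified
    for r in range(top, bottom):
        mixed[r][left:right] = img_b[r][left:right]
    # Label weight of img_a = fraction of pixels that still come from img_a.
    lam = 1.0 - (bottom - top) * (right - left) / (height * width)
    return mixed, mix_labels(label_a, label_b, lam)


img_a = [[0.0] * 4 for _ in range(4)]
img_b = [[1.0] * 4 for _ in range(4)]
mixed_img, mixed_label = cutmix(img_a, img_b, [1.0, 0.0], [0.0, 1.0], (0, 0, 2, 2))
print(mixed_label)  # [0.75, 0.25]
```

The pasted 2x2 patch covers a quarter of the 4x4 image, so a quarter of the label weight moves to the second image.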

(2) Regularization

DropOut, DropPath, Spatial DropOut, DropBlock

(3) Hard example mining and class imbalance

hard negative example mining, online hard example mining, focal loss, label smoothing
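
Two of these are simple enough to write down directly. The sketch below shows label smoothing on a one-hot vector and the binary focal loss for a single prediction. The function names are chosen for this example, `eps=0.1` is an assumed value, and `alpha=0.25`, `gamma=2.0` are the commonly cited focal-loss defaults rather than values taken from this post.

```python
import math


def smooth_labels(one_hot, eps=0.1):
    """Label smoothing: move eps of the probability mass to a uniform prior."""
    k = len(one_hot)
    return [y * (1.0 - eps) + eps / k for y in one_hot]


def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    p is the predicted probability of class 1 (0 < p < 1), target is 0 or 1.
    The (1 - p_t) ** gamma factor down-weights examples that are already easy.
    """
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


print(smooth_labels([0.0, 1.0, 0.0, 0.0]))
print(focal_loss(0.9, 1), focal_loss(0.1, 1))  # easy example vs. hard example
```

With `gamma=0` the focal loss reduces to an alpha-weighted cross-entropy, which is a quick sanity check on the formula.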

(4) Bounding-box regression losses

MSE, IoU, GIoU, DIoU, CIoU
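
The IoU-based losses differ only in the penalty subtracted from plain IoU, and each loss is one minus the corresponding metric. The following sketch computes all four metrics for a pair of axis-aligned boxes in `(x1, y1, x2, y2)` format, following the published definitions. The function name and box format are assumptions for this example, and both boxes are assumed to have positive width and height.

```python
import math


def box_metrics(a, b):
    """Return (iou, giou, diou, ciou) for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    iou = inter / union

    # GIoU: penalise the empty part of the smallest box enclosing a and b.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    giou = iou - (cw * ch - union) / (cw * ch)

    # DIoU: penalise squared centre distance relative to the enclosing diagonal.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
        + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    diou = iou - rho2 / (cw ** 2 + ch ** 2)

    # CIoU: additionally penalise aspect-ratio mismatch.
    v = (4 / math.pi ** 2) * (math.atan(bw / bh) - math.atan(aw / ah)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0
    ciou = diou - alpha * v
    return iou, giou, diou, ciou


iou, giou, diou, ciou = box_metrics((0, 0, 2, 2), (1, 1, 3, 3))
print(iou, giou, diou, ciou)
```

For two non-overlapping boxes IoU is zero regardless of how far apart they are, while GIoU and DIoU still vary with the distance, which is why they give a useful gradient in that case.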

 

Bag of specials:

Methods that add only a small amount of computation but significantly improve model accuracy are called bag of specials.

For those plugin modules and post-processing methods that only increase the inference cost by a small amount but can significantly improve the accuracy of object detection, we call them “bag of specials”.

The techniques covered include:

(1) Enlarging the receptive field

SPP, ASPP, RFB, Spatial Pyramid Matching (SPM)

(2) Attention modules

Squeeze-and-Excitation (SE), Spatial Attention Module (SAM)

(3) Skip connections and feature integration

Residual connections, Weighted residual connections, Multi-input weighted residual connections, Cross stage partial connections (CSP), FPN, SFAM, ASFF, BiFPN

(4) Activation functions

ReLU, leaky-ReLU, parametric-ReLU, ReLU6, SELU, Swish, Mish
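
Several of these activations are one-line formulas. The scalar sketch below writes out ReLU, leaky-ReLU, ReLU6, Swish and Mish using only the standard library. The leaky slope of 0.1 is an assumed default for this example, and the numerically stable forms of `sigmoid` and `softplus` are an implementation choice, not something the post specifies.

```python
import math


def relu(x):
    return max(0.0, x)


def leaky_relu(x, slope=0.1):
    return x if x > 0 else slope * x


def relu6(x):
    return min(max(0.0, x), 6.0)


def sigmoid(x):
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softplus(x):
    # Stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def swish(x):
    """Swish: x * sigmoid(x)."""
    return x * sigmoid(x)


def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for x in (-2.0, 0.0, 1.0):
    print(x, leaky_relu(x), swish(x), mish(x))
```

Unlike ReLU, both Swish and Mish are smooth and let small negative values pass through.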

(5) Non-maximum suppression (NMS)

greedy NMS, soft NMS
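
The difference between the two variants is easiest to see side by side: greedy NMS discards any box that overlaps a higher-scoring kept box too much, while soft NMS only decays its score. The sketch below is a plain-Python illustration with the Gaussian decay variant of soft NMS. The function names, the thresholds and `sigma=0.5` are assumptions for this example, not settings taken from the post.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft NMS: return (index, decayed score) pairs."""
    scores = list(scores)  # work on a copy
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thresh:
            break
        keep.append((best, scores[best]))
        for i in remaining:
            # Decay instead of delete: heavy overlap means strong decay.
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]
print([i for i, _ in soft_nms(boxes, scores)])  # [0, 2, 1]
```

In this example greedy NMS drops box 1 entirely, whereas soft NMS keeps it with a reduced score, which can help when two real objects overlap heavily.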

(6) Normalization

Batch Normalization (BN), Cross-GPU Batch Normalization (CGBN or SyncBN), Filter Response Normalization (FRN), Cross-Iteration Batch Normalization (CBN)
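
These methods differ mainly in which activations the mean and variance are computed over. As a rough illustration, the sketch below implements plain batch normalization for one channel, plus a helper that pools statistics from several per-device mini-batches, which is the basic idea behind SyncBN. CBN and FRN are not shown. All function names and the `eps` value are assumptions for this example.

```python
import math


def batch_stats(values):
    """Mean and biased variance of one channel over a mini-batch."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise to zero mean and unit variance, then scale and shift."""
    mean, var = batch_stats(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


def pooled_stats(shards):
    """Statistics over several per-device mini-batches combined.

    Each device only needs to share its count, sum and sum of squares.
    """
    n = sum(len(s) for s in shards)
    total = sum(sum(s) for s in shards)
    total_sq = sum(v * v for s in shards for v in s)
    mean = total / n
    return mean, total_sq / n - mean ** 2


print(batch_norm([1.0, 2.0, 3.0]))
print(pooled_stats([[1.0, 2.0], [3.0, 4.0]]))  # same as batch_stats([1, 2, 3, 4])
```

With small per-GPU batches the per-device statistics are noisy, which is the motivation for pooling them across devices or across iterations.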

 

Choice of backbone:

CSPDarknet53 has a larger receptive field than CSPResNeXt50 and runs faster than EfficientNet-B3, so it is chosen as the backbone of YOLOv4.
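
For readers who want to see where a receptive-field figure comes from, the sketch below applies the standard recurrence for a plain stack of convolution or pooling layers: each layer grows the receptive field by `(kernel - 1)` times the accumulated stride. The layer lists are hypothetical toy examples, not the actual CSPDarknet53 configuration, and dilation and branching are ignored.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a sequential stack.

    layers: list of (kernel_size, stride) pairs, applied in order.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(7, 2), (3, 2)]))  # 11
```

Stacking more 3x3 layers, especially after downsampling, is what drives the receptive field up in deeper backbones.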

 

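A model with high classification accuracy does not necessarily give high detection accuracy:

A reference model which is optimal for classification is not always optimal for a detector.

What a detector needs:

  1. Higher input resolution, which helps detect many small objects
  2. More layers, giving a larger receptive field to cover the larger network input
  3. More parameters, giving the model more capacity to detect objects of different sizes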

 

 

Experimental results:

 

 

 
