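Introduction
This leaderboard ranks open-source object detection models available online.
Model inclusion criteria
- The accuracy obtained from the released open-source code is authoritative.
- [Data source 1] Papers with Code – SOTA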
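The inclusion rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tooling used to build the leaderboard. The `Entry` class, its field names, and the `rank_entries` helper are hypothetical. It also assumes that starred entries (1.1*, 1.2*, ...) are those whose code or weights cannot yet be verified. The scores are copied from the COCO list in this document.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    name: str
    reported_ap: float            # box AP shown on Papers with Code
    released_ap: Optional[float]  # box AP backed by released code/weights, None if unreleased


def rank_entries(entries: List[Entry]) -> Tuple[List[Entry], List[Entry]]:
    """Split entries into ranked (backed by released code) and pending groups."""
    ranked = sorted(
        (e for e in entries if e.released_ap is not None),
        key=lambda e: e.released_ap,
        reverse=True,
    )
    pending = [e for e in entries if e.released_ap is None]
    return ranked, pending


# Scores copied from the COCO list in this document.
ENTRIES = [
    Entry("Co-DETR", reported_ap=66.0, released_ap=None),        # weights not yet released
    Entry("Group DETR v2", reported_ap=64.5, released_ap=63.3),  # GitHub score is lower
    Entry("EVA", reported_ap=64.7, released_ap=64.7),            # listed as rank 1 here
    Entry("FocalNet (DINO)", reported_ap=63.5, released_ap=63.5),
]

ranked, pending = rank_entries(ENTRIES)
for position, entry in enumerate(ranked, start=1):
    print(f"{position}. {entry.name}, boxAP: {entry.released_ap}")
for entry in pending:
    print(f"*  {entry.name}, boxAP: {entry.reported_ap} (not yet reproducible)")
```

Ranking on `released_ap` rather than `reported_ap` is what places Group DETR v2 at 63.3 instead of 64.5 and keeps Co-DETR out of the numbered positions.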
1. Object Detection
COCO test-dev Benchmark (Object Detection) | Papers With Code
Paper and Codes for COCO (by 2023.3.31)
1. EVA, boxAP: 64.7 (test)
Model: EVA-CMaskRCNN
Note: for detection, EVA is built on Cascade Mask R-CNN.
1.1* Co-DETR, boxAP: 66.0 (test)
The weights for the boxAP 66.0 result have not been released yet; we have asked the developers about this on their GitHub.
1.2* InternImage-H, boxAP: 65.5
Github-page: OpenGVLab/InternImage
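Note: for detection, InternImage is adapted to Mask R-CNN. The paper does not mention "65.5"; it reports "65.4" as its headline result, which is based on a DINO implementation. The training configuration has not yet been published on GitHub (we have opened an issue on the repo).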
1.2* M3I Pre-training, boxAP: 65.4
According to Papers with Code, this "65.4" is "InternImage-H" with "M3I Pre-training".
1.3* Co-DETR, boxAP: 64.5
The Sense-X/Co-DETR repo shows that the code has not been released yet.
1.4* Group DETR v2 - pwc, boxAP: 64.5
The score on the Group DETR v2 GitHub page is lower than this; see Group DETR v2 - github.
It requires Objects365 pre-training, so it cannot be reproduced for now.
2. FocalNet (DINO), boxAP: 63.5 (github, val)
FocalNet-L-DINO
3. Group DETR v2 - github, boxAP: 63.3 (val)
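No related code was found on the main branch of PaddleDetection.
PaddleDetection - reply from Wenyu:
- It is in develop: https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/group_detr
- After cloning the code, just check out develop; it will only reach the release branch with the next release.
A stable version of the code will only be available after the next PaddleDetection release.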
4. Dual-Swin-B-CBNetv2, boxAP: 60.1
Model: HTC-DB-Swin-L (TTA)
4.1* Focal-L, boxAP: 58.9
Github-page: https://github.com/microsoft/Focal-Transformer
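The 58.9 COCO result could not be found on its GitHub page; the highest accuracy listed there is 51.2.
(Focal-T-Cascade-Mask-R-CNN reaches 51.5, but it uses mask annotations, so it is not included.)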
4.2* DyHead, boxAP: 58.7
Github-page: DynamicHead
The highest accuracy on its GitHub page is 49.8, so it is not included for now.
5. Swin-L, boxAP: 58.0 (val)
Github-page: Swin-L
The highest accuracy on its GitHub page is 58.0 (val).
6. YOLOR-D6*, boxAP: 57.8
Github-page: YOLOR-D6*
7. SOLQ-{Swin-L & 1536}, boxAP: 56.5
Model: SOLQ-{Swin-L & 1536}
8. InternImage-XL, boxAP: 56.2
Model: InternImage-XL–Cascade
9. QueryInst, boxAP: 56.1
Model: QueryInst–Swin_L_300_queries–single_scale_testing
10. RT-DETR-R101, boxAP: 54.3
Note:
- The scores listed on this leaderboard are generally val-set scores, because val-set results are the ones we can reproduce.
COCO FPS Models (by 2023.02.18)
1. YOLOv7, boxAP: 56.8, FPS: 36
Model: YOLOv7-E6E
2. YOLOv5, boxAP: 55.0, FPS: 1e3/26.2=38.2
Model: YOLOv5x6
3. PP-YOLOE+, boxAP: 54.9, FPS: 45.0
Model: PP-YOLOE+_x
4. YOLOv8, boxAP: 53.9, FPS: 283.3
Model: YOLOv8x
5. RTMDet, boxAP: 52.6, FPS: 322.6
Model: RTMDet-x
6. YOLOv6, boxAP: 52.5, FPS: 98
Model: YOLOv6-L
7. PP-YOLOE, boxAP: 52.2, FPS: 95.2
Model: PP-YOLOE-x
8. PP-YOLOv2, boxAP: 50.3, FPS: 49.5
Model: PP-YOLOv2–ResNet101vd
PP-YOLOv2 is an object detection model released by Paddle.
9. FastViT-MA36-paper, boxAP: 45.1, FPS: 122.0 (8.2ms)
Model: FastViT-MA36-MaskRCNN
10. NanoDet-Plus-m-1.5x, boxAP: 34.1, FPS: 87.0 (11.50ms)
Model: NanoDet-Plus-m-1.5x
Note:
- "Real-Time" here refers to models with an FPS of 30 or above.
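Some FPS figures in the list above are derived from a per-image latency, for example `1e3/26.2=38.2` for YOLOv5x6 and `122.0 (8.2ms)` for FastViT-MA36. A minimal Python sketch of that conversion and of the 30 FPS real-time threshold follows. The function names are illustrative. It assumes the latency is measured per single image and that a model at exactly 30 FPS counts as real-time.

```python
REAL_TIME_FPS = 30.0  # threshold from the note above


def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert per-image latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def is_real_time(fps: float, threshold: float = REAL_TIME_FPS) -> bool:
    """Assumption: a model counts as real-time when FPS >= threshold."""
    return fps >= threshold


# Latencies quoted in the list above.
for name, latency_ms in [
    ("YOLOv5x6", 26.2),
    ("FastViT-MA36", 8.2),
    ("NanoDet-Plus-m-1.5x", 11.5),
]:
    fps = latency_ms_to_fps(latency_ms)
    print(f"{name}: {fps:.1f} FPS, real-time: {is_real_time(fps)}")
```

Rounded to one decimal place, the three latencies give 38.2, 122.0, and 87.0 FPS, matching the values in the list.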
Look at Batch Size
Model | mAP | FPS |
---|---|---|
YOLOv7 | 51.4 | 161 |
YOLOv7-X | 53.1 | 114 |
YOLOv5n | 28.0 | Q |
YOLOv5s | 37.4 | Q |
YOLOv5m | 45.4 | Q |
YOLOv5l | 49.0 | Q |
YOLOv5x | 50.7 | Q |
YOLOv5n6 | 36.0 | Q |
YOLOv5s6 | 44.8 | Q |
YOLOv5m6 | 51.3 | Q |
YOLOv5l6 | 53.7 | Q |
YOLOv5x6 | 55.0 | Q |
YOLOv5x6+TTA | 55.8 | Q |
PP-YOLOE-s | 43.1 | Q |
PP-YOLOE-m | 48.9 | Q |
PP-YOLOE-l | 51.4 | Q |
🥇PP-YOLOE-x | 52.2 | Q |
PP-YOLOv2-ResNet50vd | 49.5 | 2.67 (32G / 12) |
🥉PP-YOLOv2-ResNet101vd | 50.3 | 2.67 |
NanoDet-Plus-m-320r | 27.0 | 0.25 (24G / 96) |
NanoDet-Plus-m-416r | 30.4 | 0.25 |
NanoDet-Plus-m-1.5x-320r | 29.9 | 0.25 |
NanoDet-Plus-m-1.5x-416r | 34.1 | 0.25 |
NanoDet-m-320r | 20.6 | 0.125 (24G / 192) |
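
The parenthesized values in the last column of the table above, such as `2.67 (32G / 12)`, look like GPU memory divided by batch size rather than a frame rate. The table does not state this, so treat it as an assumption. Under that reading, the figure is an approximate memory budget per image in GB. A small Python sketch, with a hypothetical helper name:

```python
def memory_per_image_gb(gpu_memory_gb: float, batch_size: int) -> float:
    """Approximate GPU memory per image, assuming an even split across the batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return gpu_memory_gb / batch_size


# (GPU memory in GB, batch size) pairs quoted in the table above.
for name, gpu_memory_gb, batch_size in [
    ("PP-YOLOv2-ResNet50vd", 32, 12),
    ("NanoDet-Plus-m-320r", 24, 96),
    ("NanoDet-m-320r", 24, 192),
]:
    per_image = memory_per_image_gb(gpu_memory_gb, batch_size)
    print(f"{name}: {per_image:.3g} GB per image")
```

The three pairs give 2.67, 0.25, and 0.125, which agree with the numbers in the table.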
2. Image Classification
Evaluation notes:
- We first consulted ImageNet Benchmark (Image Classification) | Papers With Code
- Besides paperswithcode, the ImageNet leaderboard also draws on the experimental results of the open-source project rwightman/pytorch-image-models: Results - Pytorch Image Models
ImageNet Leaderboard (by 2021.06.24)
1. SwinTransformer, top1: 87.148
Transformer-based
Classification model: swin_large_patch4_window12_384
2. CaiT-M-48-448, top1: 86.484
Transformer-based
Classification model: cait_m48_448
3. NFNet-F6, top1: 86.296
Classification model: dm_nfnet_f6
ImageNet FPS Leaderboard (by 2023.04.14)
1.1* Slide-NAT-M - paper, FPS: 998, top1: 82.4%
LeapLabTHU/Slide-Transformer: the repo exists, but the code has not been released.
CNN-based ImageNet Model (by 2021.11.25)
1. MetaPseudoLabels (EfficientNet-L2), top1: 90.2%
Classification model: MetaPseudoLabels
*. VOLO: transformer-based
2. NFNet-F6+SAM, top1: 86.5%
Classification model: NFNet-F6+SAM
3. ResNeXt101, top1: 86.26
Classification model: Fix_ResNeXt101_32x48d_wsl_paddle (PaddleClas - EfficientNet and ResNeXt101_wsl series)
4. EfficientNetv2, top1: 85.49
Classification model: tf_efficientnetv2_l
3. Semantic Segmentation
1. DeepLabV3Plus + SDCNetAug, MIoU: 83.5