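Column link:
https://blog.csdn.net/qq_39707285/article/details/124005405
This column summarizes key topics in deep learning. It starts with the major dataset competitions and introduces the winning algorithms from each year, and it also covers important concepts such as loss functions, optimizers, classic algorithms, and optimization strategies such as Bag of Freebies (BoF).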
Column link:
https://blog.csdn.net/qq_39707285/category_11814303.html
This column introduces RNN, LSTM, Attention, and Transformer, along with their code implementations.
Column link:
https://blog.csdn.net/qq_39707285/category_12009356.html
This column covers the YOLO family in detail: the official YOLOv1, YOLOv2, YOLOv3, YOLOv4, Scaled-YOLOv4, and YOLOv7; YOLOv5; Meituan's YOLOv6; PaddlePaddle's PP-YOLO and PP-YOLOv2; and YOLOR, YOLOX, and YOLOS.
Column link:
https://blog.csdn.net/qq_39707285/category_12184436.html
This column covers a range of Visual Transformers in detail, including algorithms applied to classification, detection, and segmentation.
1. Transformers for Classification
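Table 1. Top-1 accuracy of Visual Transformers on the ImageNet-1k, CIFAR-10, and CIFAR-100 datasets. "1k only" means the model was trained on ImageNet-1k only; "21k pre-train" means it was pre-trained on ImageNet-21k and then fine-tuned on ImageNet-1k; "Distill" (marked with Υ) means the DeiT distillation training scheme was applied.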
Method | Type | Epochs | Batch Size | #Params (M) | FLOPs (G) | Training Scheme | Train Image Size | Test Image Size | ImageNet-1k Top-1 Acc. (1k only) | ImageNet-1k Top-1 Acc. (21k pre-train / DistillΥ) | CIFAR-10 Top-1 Acc. | CIFAR-100 Top-1 Acc. |
ViT-B/16↑ | OVT | 300 | 4096 | 86 | 743 | ViT | 224 | 384 | 77.9 | 83.97 | 98.1 | 87.1 |
ViT-L/16↑ | OVT | 300 | 4096 | 307 | 5172 | ViT | 224 | 384 | 76.5 | 85.15 | 97.9 | 86.4 |
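VT-ResNet18 | TEC | 90 | 256 | 11.7 | 1.569 | - | 224 | 224 | 76.8 | - | - | - |
VT-ResNet34 | TEC | 90 | 256 | 19.2 | 3.236 | - | 224 | 224 | 79.9 | - | - | - |
VT-ResNet50 | TEC | 90 | 256 | 21.4 | 3.412 | - | 224 | 224 | 80.6 | - | - | - |
VT-ResNet101 | TEC | 90 | 256 | 41.5 | 7.129 | - | 224 | 224 | 82.3 | - | - | - |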
BoTNet-S1-59-T2 | TEC | 350 | 4096 | 33.5 | 7.3 | - | 224 | 224 | 81.7 | - | - | - |
BoTNet-S1-110-T4 | TEC | 350 | 4096 | 54.7 | 10.9 | - | 224 | 224 | 82.8 | - | - | - |
BoTNet-S1-128-T5↑ | TEC | 350 | 4096 | 75.1 | 19.3 | - | 224 | 256 | 83.5 | - | - | - |
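DeiT-Ti | CET | 300 | 1024 | 5.7 | 1.3 | DeiT | 224 | 224 | 72.2 | 74.5Υ | - | - |
DeiT-S | CET | 300 | 1024 | 22.1 | 4.6 | DeiT | 224 | 224 | 79.8 | 81.2Υ | - | - |
DeiT-B | CET | 300 | 1024 | 86.6 | 17.6 | DeiT | 224 | 224 | 81.8 | 83.4Υ | 99.1 | 90.8 |
DeiT-B↑ | CET | 300 | 1024 | 86.6 | 52.8 | DeiT | 224 | 384 | 83.1 | 84.5Υ | 99.2 | 91.4 |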
ConViT-Ti | CET | 300 | 512 | 6 | 1 | DeiT | 224 | 224 | 73.1 | - | - | - |
ConViT-S | CET | 300 | 512 | 27 | 5.4 | DeiT | 224 | 224 | 81.3 | - | - | - |
ConViT-B | CET | 300 | 512 | 86 | 17 | DeiT | 224 | 224 | 82.4 | - | - | - |
LocalViT-T | CET | 300 | 1024 | 5.9 | 1.3 | DeiT | 224 | 224 | 74.8 | - | - | - |
LocalViT-S | CET | 300 | 1024 | 22.4 | 4.6 | DeiT | 224 | 224 | 80.8 | - | - | - |
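CeiT-T | CET | 300 | 1024 | 6.4 | 1.2 | DeiT | 224 | 224 | 76.4 | - | 98.5 | 88.4 |
CeiT-S | CET | 300 | 1024 | 24.2 | 4.5 | DeiT | 224 | 224 | 82 | - | 99 | 90.8 |
CeiT-T↑ | CET | 300 | 1024 | 6.4 | 3.6 | DeiT | 224 | 384 | 78.8 | - | 98.5 | 88 |
CeiT-S↑ | CET | 300 | 1024 | 24.2 | 12.9 | DeiT | 224 | 384 | 83.3 | - | 99.1 | 90.8 |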
ResT-Small | CET | 300 | 2048 | 13.66 | 1.9 | DeiT | 224 | 224 | 79.6 | - | - | - |
ResT-Base | CET | 300 | 2048 | 30.28 | 4.3 | DeiT | 224 | 224 | 81.6 | - | - | - |
ResT-Large | CET | 300 | 2048 | 51.63 | 7.9 | DeiT | 224 | 224 | 83.6 | - | - | - |
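ViTC-1GF | CET | 400 | 2048 | 4.6 | 1.1 | DeiT, PVT | 224 | 224 | 75.3 | - | - | - |
ViTC-4GF | CET | 400 | 2048 | 17.8 | 4 | DeiT, PVT | 224 | 224 | 81.4 | 81.2 | - | - |
ViTC-18GF | CET | 400 | 1024 | 81.6 | 17.7 | DeiT, PVT | 224 | 224 | 83 | 84.9 | - | - |
ViTC-36GF | CET | 400 | 512 | 167.8 | 35 | DeiT, PVT | 224 | 224 | 84.2 | 85.8 | - | - |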
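CoAtNet-0 | CET | 300/90 | 4096 | 25 | 4.2 | - | 224 | 224 | 81.6 | - | - | - |
CoAtNet-1 | CET | 300/90 | 4096 | 42 | 8.4 | - | 224 | 224 | 83.3 | - | - | - |
CoAtNet-2 | CET | 300/90 | 4096 | 75 | 15.7 | - | 224 | 224 | 84.1 | 87.1 | - | - |
CoAtNet-3 | CET | 300/90 | 4096 | 168 | 34.7 | - | 224 | 224 | 84.5 | 87.6 | - | - |
CoAtNet-4-E150↑ | CET | 300/90 | 4096 | 275 | 189.5 | - | 224 | 384 | - | 88.4 | - | - |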
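TNT-S | TET | 300 | 1024 | 23.8 | 5.2 | DeiT | 224 | 224 | 81.3 | - | - | - |
TNT-B | TET | 300 | 1024 | 65.6 | 14.1 | DeiT | 224 | 224 | 82.8 | - | - | - |
TNT-S↑ | TET | 300 | 1024 | 23.8 | - | DeiT | 224 | 384 | 83.1 | - | 98.7 | 90.1 |
TNT-B↑ | TET | 300 | 1024 | 65.6 | - | DeiT | 224 | 384 | 83.9 | - | 99.1 | 91.1 |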
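Swin-T | TET | 300/60 | 1024/4096 | 29 | 4.5 | DeiT | 224 | 224 | 81.3 | - | - | - |
Swin-S | TET | 300/60 | 1024/4096 | 50 | 8.7 | DeiT | 224 | 224 | 83 | - | - | - |
Swin-B | TET | 300/60 | 1024/4096 | 88 | 15.4 | DeiT | 224 | 224 | 83.3 | 85.2 | - | - |
Swin-B↑ | TET | 300/60 | 1024/4096 | 88 | 47 | DeiT | 224 | 384 | 84.2 | 86.0 | - | - |
Swin-L↑ | TET | 300/60 | 1024/4096 | 197 | 103.9 | DeiT | 224 | 384 | - | 86.4 | - | - |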
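VOLO-D1 | TET | 300 | 1024 | 27 | 6.8 | LV-ViT | 224 | 224 | 84.2 | - | - | - |
VOLO-D2 | TET | 300 | 1024 | 59 | 14.1 | LV-ViT | 224 | 224 | 85.2 | - | - | - |
VOLO-D3 | TET | 300 | 1024 | 86 | 20.6 | LV-ViT | 224 | 224 | 85.4 | - | - | - |
VOLO-D4 | TET | 300 | 1024 | 193 | 43.8 | LV-ViT | 224 | 224 | 85.7 | - | - | - |
VOLO-D5 | TET | 300 | 1024 | 296 | 69 | LV-ViT | 224 | 224 | 86.1 | - | - | - |
VOLO-D3↑ | TET | 300 | 1024 | 86 | 67.9 | LV-ViT | 224 | 448 | 86.3 | - | - | - |
VOLO-D4↑ | TET | 300 | 1024 | 193 | 197 | LV-ViT | 224 | 448 | 86.8 | - | - | - |
VOLO-D5↑ | TET | 300 | 1024 | 296 | 304 | LV-ViT | 224 | 448 | 87 | - | - | - |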
T2T-ViT-14 | TET | 310 | 1024 | 21.5 | 5.2 | - | 224 | 224 | 81.5 | - | 97.5 | 88.4 |
T2T-ViT-19 | TET | 310 | 1024 | 39.2 | 8.9 | - | 224 | 224 | 81.9 | - | 98.3 | 89 |
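PVT-Tiny | HT | 300 | 128 | 13.2 | 1.9 | DeiT | 224 | 224 | 75.1 | - | - | - |
PVT-Small | HT | 300 | 128 | 24.5 | 3.8 | DeiT | 224 | 224 | 79.8 | - | - | - |
PVT-Medium | HT | 300 | 128 | 44.1 | 6.7 | DeiT | 224 | 224 | 81.2 | - | - | - |
PVT-Large | HT | 300 | 128 | 61.4 | 9.8 | DeiT | 224 | 224 | 81.7 | - | - | - |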
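PiT-Ti | HT | 300 | 1024 | 4.9 | 0.7 | DeiT | 224 | 224 | 73 | 74.6Υ | - | - |
PiT-XS | HT | 300 | 1024 | 10.6 | 1.4 | DeiT | 224 | 224 | 78.1 | 79.1Υ | - | - |
PiT-S | HT | 300 | 1024 | 23.5 | 2.9 | DeiT | 224 | 224 | 80.9 | 81.9Υ | - | - |
PiT-B | HT | 300 | 1024 | 73.8 | 12.5 | DeiT | 224 | 224 | 82 | 84Υ | - | - |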
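CvT-13 | HT | 300 | 2048 | 20 | 4.5 | ViT, BiT | 224 | 224 | 81.6 | - | - | - |
CvT-21 | HT | 300 | 2048 | 32 | 7.1 | ViT, BiT | 224 | 224 | 82.5 | - | - | - |
CvT-13↑ | HT | 300 | 2048 | 20 | 16.3 | ViT, BiT | 224 | 384 | 83 | 83.3 | - | - |
CvT-21↑ | HT | 300 | 2048 | 32 | 24.9 | ViT, BiT | 224 | 384 | 83.3 | 84.9 | - | - |
CvT-W24↑ | HT | 300 | 2048 | 277 | 193.2 | ViT, BiT | 224 | 384 | - | 87.7 | - | - |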
DeepViT-S | DT | 300 | 256 | 27 | 6.2 | DeiT, ResNest | 224 | 224 | 82.3 | - | - | - |
DeepViT-L | DT | 300 | 256 | 55 | 12.5 | DeiT, ResNest | 224 | 224 | 83.1 | - | - | - |
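CaiT-XS-24 | DT | 400 | 1024 | 26.6 | 5.4 | DeiT | 224 | 224 | 81.8 | 82.0Υ | - | - |
CaiT-S-24 | DT | 400 | 1024 | 46.9 | 9.4 | DeiT | 224 | 224 | 82.7 | 83.5Υ | - | - |
CaiT-S-36 | DT | 400 | 1024 | 68.2 | 13.9 | DeiT | 224 | 224 | 83.3 | 84Υ | 99.2 | 92.2 |
CaiT-M-24 | DT | 400 | 1024 | 185.9 | 36 | DeiT | 224 | 224 | 83.4 | 84.7Υ | - | - |
CaiT-M-36 | DT | 400 | 1024 | 270.9 | 53.7 | DeiT | 224 | 224 | 83.8 | 85.1Υ | 99.3 | 93.3 |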
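DiversePatch-S12 | DT | 400 | 1024 | 22 | - | DeiT | 224 | 224 | 81.2 | - | - | - |
DiversePatch-S24 | DT | 400 | 1024 | 44 | - | DeiT | 224 | 224 | 82.2 | - | - | - |
DiversePatch-B12 | DT | 400 | 1024 | 86 | - | DeiT | 224 | 224 | 82.9 | - | - | - |
DiversePatch-B24 | DT | 400 | 1024 | 172 | - | DeiT | 224 | 224 | 83.3 | - | - | - |
DiversePatch-B12↑ | DT | 400 | 1024 | 86 | - | DeiT | 224 | 384 | 84.2 | - | - | - |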
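Refined-ViT-S | DT | 300 | 256 / 512 | 25 | 7.2 | DeiT | 224 | 224 | 83.6 | - | - | - |
Refined-ViT-M | DT | 300 | 256 / 512 | 55 | 13.5 | DeiT | 224 | 224 | 84.6 | - | - | - |
Refined-ViT-L | DT | 300 | 256 / 512 | 81 | 19.1 | DeiT | 224 | 224 | 84.9 | - | - | - |
Refined-ViT-M↑ | DT | 300 | 256 / 512 | 55 | 49.2 | DeiT | 224 | 384 | 85.6 | - | - | - |
Refined-ViT-L↑ | DT | 300 | 256 / 512 | 81 | 69.1 | DeiT | 224 | 384 | 85.7 | - | - | - |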
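CrossViT-9 | M | 300 | 4096 | 8.6 | 1.8 | DeiT | 224 | 224 | 73.9 | - | - | - |
CrossViT-15 | M | 300 | 4096 | 27.4 | 5.8 | DeiT | 224 | 224 | 81.5 | - | 99 | 90.77 |
CrossViT-18 | M | 300 | 4096 | 43.3 | 9.03 | DeiT | 224 | 224 | 82.5 | - | 99.11 | 91.36 |
CrossViT-18* | M | 300 | 4096 | 44.3 | 9.5 | DeiT | 224 | 224 | 82.8 | - | - | - |
CrossViT-15*↑ | M | 300 | 4096 | 28.5 | 21.4 | DeiT | 224 | 384 | 83.5 | - | - | - |
CrossViT-18*↑ | M | 300 | 4096 | 44.6 | 32.4 | DeiT | 224 | 384 | 83.9 | - | - | - |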
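LV-ViT-S | DAT | 300 | 1024 | 26 | 6.6 | LV-ViT | 224 | 224 | 83.3 | - | - | - |
LV-ViT-M | DAT | 300 | 1024 | 56 | 16 | LV-ViT | 224 | 224 | 84 | - | - | - |
LV-ViT-L | DAT | 300 | 1024 | 150 | 59 | LV-ViT | 288 | 288 | 85.3 | - | - | - |
LV-ViT-M↑ | DAT | 300 | 1024 | 56 | 42.2 | LV-ViT | 224 | 384 | 85.4 | - | - | - |
LV-ViT-L↑ | DAT | 300 | 1024 | 150 | 157.2 | LV-ViT | 288 | 448 | 85.9 | - | - | - |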
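Every accuracy figure in Table 1 is a Top-1 accuracy: the percentage of test images for which the class with the highest predicted score equals the ground-truth label. The snippet below is only a minimal sketch of that metric in plain Python. The function name `top1_accuracy` and the toy scores and labels are illustrative assumptions, not code or data from any of the listed papers.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return Top-1 accuracy in percent.

    scores[i][c] is the model's score for class c on sample i (hypothetical
    input layout); labels[i] is the ground-truth class index of sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the highest score (argmax).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example: 4 samples, 3 classes (made-up numbers for illustration only).
scores = [
    [0.1, 0.7, 0.2],  # predicted class 1
    [0.8, 0.1, 0.1],  # predicted class 0
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.2, 0.5, 0.3],  # predicted class 1
]
labels = [1, 0, 1, 1]

print(f"Top-1 accuracy: {top1_accuracy(scores, labels):.1f}%")  # Top-1 accuracy: 75.0%
```

In practice the scores would be the logits a model produces on the ImageNet-1k or CIFAR validation set; the metric itself does not depend on the model architecture.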