1. Environment setup
Just follow the steps from the official repository:
conda create --name streamyolo python=3.7
conda activate streamyolo
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
pip3 install yolox==0.3
You can download the project files ahead of time; cd into the project directory and add it to PYTHONPATH:
ADDPATH=$(pwd)
echo export PYTHONPATH=$PYTHONPATH:$ADDPATH >> ~/.bashrc
source ~/.bashrc
After the commands above finish, restart the environment:
conda activate streamyolo
Then install mmcv:
pip install mmcv-full==1.1.5 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.1/index.html
2. Problems encountered when running train.py
First pass the arguments into the run configuration, e.g. by copying them straight into the IDE's Parameters field (-f: experiment config file, -d: number of devices, -b: total batch size, -c: checkpoint to load, -o: occupy GPU memory in advance, --fp16: mixed-precision training):
-f
cfgs/m_s50_onex_dfp_tal_flip.py
-d
1
-b
4
-c
yolox_s.pth
-o
--fp16
Problem 1
ImportError: cfgs/m_s50_onex_dfp_tal_flip.py doesn't contains class named 'Exp'
The path is the problem: the -f argument needs an absolute path, i.e. replace cfgs/m_s50_onex_dfp_tal_flip.py with its absolute form.
Solution: set the -f argument to
/home/lingyun/models/StreamYOLO-main/cfgs/m_s50_onex_dfp_tal_flip.py
Problem 2
PermissionError: [Errno 13] Permission denied: '/data'
In the experiment config referenced above (cfgs/m_s50_onex_dfp_tal_flip.py), also replace the Exp class attribute self.output_dir with an absolute path. The directory has to be created beforehand.
Original:
self.output_dir = '/data/output/stream_yolo'
Replace with:
self.output_dir = '/home/lingyun/models/StreamYOLO-main/data/output/stream_yolo'
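Since the output directory has to exist before training starts, a quick standalone way to create it (a small sketch; the path is the same machine-specific one as above, so adjust it to your setup):

import os

output_dir = '/home/lingyun/models/StreamYOLO-main/data/output/stream_yolo'
os.makedirs(output_dir, exist_ok=True)   # like `mkdir -p`: no error if the directory already exists
print('output dir ready:', output_dir)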
Problem 3
FileNotFoundError: [Errno 2] No such file or directory: 'yolox_s.pth'
Similarly, replace the model path passed to -c with an absolute path.
Original:
yolox_s.pth
Replace with:
/home/lingyun/models/StreamYOLO-main/yolox_s.pth
Problem 4
FileNotFoundError: [Errno 2] No such file or directory: '/data/Argoverse-HD/annotations/train.json'
Solution
In the config file m_s50_onex_dfp_tal_flip.py, replace the path with an absolute path (around line 68).
Original:
data_dir='/data'
Replace with:
data_dir='/home/lingyun/models/StreamYOLO-main/data'
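For reference, the missing file in the error message is resolved by joining data_dir with the Argoverse-HD layout; a rough sketch of that path arithmetic (implied by the error, not the exact dataset code):

import os

data_dir = '/home/lingyun/models/StreamYOLO-main/data'   # the value set around line 68
json_file = 'train.json'                                  # exp.train_ann
ann_path = os.path.join(data_dir, 'Argoverse-HD', 'annotations', json_file)
print(ann_path, 'exists:', os.path.exists(ann_path))      # a wrong data_dir shows up here first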
Problem 5
Running the training command directly in a Linux terminal fails:
python tools/train.py -f cfgs/m_s50_onex_dfp_tal_flip.py -d 1 -b 4 -c yolox_s.pth -o --fp16
Solution
pip install opencv-python-headless
Problem 6
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/root/anaconda3/envs/streamyolo/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/root/anaconda3/envs/streamyolo/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/root/anaconda3/envs/streamyolo/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/root/anaconda3/envs/streamyolo/lib/python3.7/site-packages/yolox/data/datasets/datasets_wrapper.py", line 110, in wrapper
ret_val = getitem_fn(self, index)
File "/root/data/zjx/Code-subject/StreamYOLO-main/exps/data/tal_flip_mosaicdetection.py", line 255, in __getitem__
img, support_img, label, support_label, img_info, id_ = self._dataset.pull_item(idx)
File "/root/data/zjx/Code-subject/StreamYOLO-main/exps/dataset/tal_flip_one_future_argoversedataset.py", line 227, in pull_item
img = self.load_resized_img(index)
File "/root/data/zjx/Code-subject/StreamYOLO-main/exps/dataset/tal_flip_one_future_argoversedataset.py", line 180, in load_resized_img
img = self.load_image(index)
File "/root/data/zjx/Code-subject/StreamYOLO-main/exps/dataset/tal_flip_one_future_argoversedataset.py", line 196, in load_image
assert img is not None
AssertionError
Solution
Refer to the linked reference: do not use a custom dataset path; lay the dataset out exactly as the repository expects, i.e. the structure below (a quick sanity-check script follows the tree):
StreamYOLO
├── exps
├── tools
├── yolox
├── data
│   ├── Argoverse-1.1
│   │   ├── tracking
│   │   │   ├── train
│   │   │   ├── val
│   │   │   ├── test
│   ├── Argoverse-HD
│   │   ├── annotations
│   │   │   ├── test-meta.json
│   │   │   ├── train.json
│   │   │   ├── val.json
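A quick way to verify the on-disk layout matches this tree (a small sketch using the absolute data directory from Problem 4; adjust the path to your machine):

import os

data_dir = '/home/lingyun/models/StreamYOLO-main/data'
expected = [
    'Argoverse-1.1/tracking/train',
    'Argoverse-1.1/tracking/val',
    'Argoverse-1.1/tracking/test',
    'Argoverse-HD/annotations/train.json',
    'Argoverse-HD/annotations/val.json',
    'Argoverse-HD/annotations/test-meta.json',
]
for rel in expected:
    path = os.path.join(data_dir, rel)
    print('OK     ' if os.path.exists(path) else 'MISSING', path)

If any image file is still unreadable, the same "assert img is not None" will fire, so it is also worth spot-checking a few images with cv2.imread.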
Problem 7
FileNotFoundError: [Errno 2] No such file or directory: '/data/Argoverse-HD/annotations/val.json'
Exception ignored in: <function ONE_ARGOVERSEDataset.__del__ at 0x7f7766301710>
Traceback (most recent call last):
File "/root/data/zjx/Code-subject/StreamYOLO-main/exps/dataset/tal_flip_one_future_argoversedataset.py", line 55, in __del__
if self.imgs:
AttributeError: 'ONE_ARGOVERSEDataset' object has no attribute 'imgs'
Solution
Same as Problem 4: in the config file m_s50_onex_dfp_tal_flip.py, replace the path with an absolute path (around line 119).
Problem 8
RuntimeError: CUDA out of memory. Tried to allocate 28.00 MiB (GPU 0; 3.81 GiB total capacity; 2.44 GiB already allocated; 61.00 MiB free; 2.48 GiB reserved in total by PyTorch)
Solution
Reduce the batch size; it can go as low as 1.
3. Debug notes
1. args
Namespace(batch_size=4, cache=False, ckpt='/root/data/zjx/Code-subject/StreamYOLO-main/yolox_s.pth', devices=1, dist_backend='nccl', dist_url=None, exp_file='/root/data/zjx/Code-subject/StreamYOLO-main/cfgs/m_s50_onex_dfp_tal_flip.py', experiment_name=None, fp16=True, logger='tensorboard', machine_rank=0, name=None, num_machines=1, occupy=True, opts=[], resume=False, start_epoch=None)
2. optimizer
SGD (
Parameter Group 0
dampening: 0
lr: 0
momentum: 0.9
nesterov: True
weight_decay: 0
Parameter Group 1
dampening: 0
lr: 0
momentum: 0.9
nesterov: True
weight_decay: 0.0005
Parameter Group 2
dampening: 0
lr: 0
momentum: 0.9
nesterov: True
weight_decay: 0
)
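The three parameter groups follow the standard YOLOX optimizer recipe: BatchNorm weights and all biases get no weight decay, while conv/linear weights get weight_decay=0.0005 (lr shows 0 only because warmup_lr is 0; the scheduler ramps it up). A simplified sketch of that grouping (not the verbatim exp.get_optimizer code):

import torch
import torch.nn as nn

def build_sgd(model, lr, momentum=0.9, weight_decay=5e-4):
    bn_weights, conv_weights, biases = [], [], []
    for m in model.modules():
        if hasattr(m, 'bias') and isinstance(m.bias, nn.Parameter):
            biases.append(m.bias)                    # group 2: biases, no decay
        if isinstance(m, nn.BatchNorm2d):
            bn_weights.append(m.weight)              # group 0: BN weights, no decay
        elif hasattr(m, 'weight') and isinstance(m.weight, nn.Parameter):
            conv_weights.append(m.weight)            # group 1: conv weights, decayed
    optimizer = torch.optim.SGD(bn_weights, lr=lr, momentum=momentum, nesterov=True)
    optimizer.add_param_group({'params': conv_weights, 'weight_decay': weight_decay})
    optimizer.add_param_group({'params': biases})
    return optimizer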
3. ckpt
4. self.seq_dirs
5. self._classes
6. im_ann
7. annotations
[{'id': 0, 'image_id': 0, 'bbox': [898, 584, 251, 228], 'category_id': 2, 'area': 57228.0, 'iscrowd': False, 'ignore': False, 'track': 0}, {'id': 1, 'image_id': 0, 'bbox': [459, 425, 408, 362], 'category_id': 5, 'area': 147696.0, 'iscrowd': False, 'ignore': False, 'track': 1}, {'id': 2, 'image_id': 0, 'bbox': [416, 272, 27, 72], 'category_id': 6, 'area': 1944.0, 'iscrowd': False, 'ignore': False, 'track': 2}, {'id': 3, 'image_id': 0, 'bbox': [655, 283, 25, 71], 'category_id': 6, 'area': 1775.0, 'iscrowd': False, 'ignore': False, 'track': 3}, {'id': 4, 'image_id': 0, 'bbox': [211, 395, 20, 52], 'category_id': 6, 'area': 1040.0, 'iscrowd': False, 'ignore': False, 'track': 4}, {'id': 5, 'image_id': 0, 'bbox': [306, 422, 17, 39], 'category_id': 6, 'area': 663.0, 'iscrowd': False, 'ignore': False, 'track': 5}, {'id': 6, 'image_id': 0, 'bbox': [173, 558, 162, 75], 'category_id': 5, 'area': 12150.0, 'iscrowd': False, 'ignore': False, 'track': 6}, {'id': 7, 'image_id': 0, 'bbox': [368, 583, 63, 39], 'category_id': 5, 'area': 2457.0, 'iscrowd': False, 'ignore': False, 'track': 7}, {'id': 8, 'image_id': 0, 'bbox': [4, 597, 93, 40], 'category_id': 2, 'area': 3720.0, 'iscrowd': False, 'ignore': False, 'track': 8}, {'id': 9, 'image_id': 0, 'bbox': [81, 591, 94, 44], 'category_id': 2, 'area': 4136.0, 'iscrowd': False, 'ignore': False, 'track': 9}, {'id': 10, 'image_id': 0, 'bbox': [945, 424, 15, 37], 'category_id': 6, 'area': 555.0, 'iscrowd': False, 'ignore': False, 'track': 10}, {'id': 11, 'image_id': 0, 'bbox': [444, 542, 12, 25], 'category_id': 6, 'area': 300.0, 'iscrowd': False, 'ignore': False, 'track': 11}, {'id': 12, 'image_id': 0, 'bbox': [840, 431, 14, 36], 'category_id': 6, 'area': 504.0, 'iscrowd': False, 'ignore': False, 'track': 12}, {'id': 13, 'image_id': 0, 'bbox': [918, 597, 18, 13], 'category_id': 2, 'area': 234.0, 'iscrowd': False, 'ignore': False, 'track': 13}, {'id': 14, 'image_id': 0, 'bbox': [1149, 499.989847715736, 13, 36], 'category_id': 6, 'area': 468.0, 'iscrowd': False, 'ignore': False, 'track': 14}]
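These are COCO-style annotation records: bbox is [x, y, width, height] in pixels, and area is just width × height, which can be checked against the first entry:

ann = {'id': 0, 'image_id': 0, 'bbox': [898, 584, 251, 228], 'category_id': 2,
       'area': 57228.0, 'iscrowd': False, 'ignore': False, 'track': 0}
x, y, w, h = ann['bbox']
assert w * h == ann['area']                      # 251 * 228 == 57228
print('box corners:', (x, y), (x + w, y + h))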
8. obj
9. self.train_loader
10. self.prefetcher
11. inps, targets
12. expanded_strides
13. gt_matched_classes
14. pred_ious_this_matching
15. matched_gt_inds
16. dynamic_ks
17. fg_mask
4. Some entry points
1. Training entry point
train.py --- 117
trainer.train()
This jumps into the Trainer class in exps/train_utils/double_trainer.py.
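For context, the config file passed with -f is loaded through yolox's get_exp helper; a small sketch of that step (the exact wiring inside StreamYOLO's train.py may differ slightly):

from yolox.exp import get_exp

# load the Exp class defined in the -f config file; a wrong path here is exactly
# what raised "doesn't contains class named 'Exp'" in Problem 1
exp = get_exp('/home/lingyun/models/StreamYOLO-main/cfgs/m_s50_onex_dfp_tal_flip.py', None)
print(exp.exp_name, exp.max_epoch, exp.output_dir)

train.py then constructs the Trainer from this exp and the parsed arguments and calls trainer.train() at line 117.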
2. Building the YOLOX model
exps/train_utils/double_trainer.py --- 139
model = self.exp.get_model()
3. Loading the pretrained weights into the model
exps/train_utils/double_trainer.py --- 149
model = self.resume_train(model)
exps/train_utils/double_trainer.py --- 314
ckpt_file = self.args.ckpt # '/root/data/zjx/Code-subject/StreamYOLO-main/yolox_s.pth'
ckpt = torch.load(ckpt_file, map_location=self.device)["model"]
model = load_ckpt(model, ckpt)
For the corresponding debug output, see item 3 in the debug notes (Section 3) above.
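load_ckpt only copies checkpoint tensors whose shapes match the model, which is why the shape-mismatch warnings shown later in Section 5, item 3 appear when the YOLOX-S checkpoint (width 0.50, 32 stem channels) is loaded into this width-0.75 model (48 stem channels). A minimal sketch of that shape-tolerant loading (not the verbatim yolox implementation):

import torch

def load_matching_weights(model, ckpt_state_dict):
    # copy checkpoint tensors into the model, skipping keys whose shapes differ
    model_state = model.state_dict()
    kept = {}
    for name, tensor in ckpt_state_dict.items():
        if name in model_state and model_state[name].shape == tensor.shape:
            kept[name] = tensor
        else:
            print('skipping', name)                  # mismatched shape or unknown key
    model.load_state_dict(kept, strict=False)
    return model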
4. Building the train_loader
exps/train_utils/double_trainer.py --- 153
self.train_loader = self.exp.get_data_loader(
batch_size=self.args.batch_size,
is_distributed=self.is_distributed,
no_aug=self.no_aug,
cache_img=self.args.cache,
)
Control then passes into cfgs/m_s50_onex_dfp_tal_flip.py.
5. Feeding data into the model: the forward pass and its outputs
exps/train_utils/double_trainer.py --- starting at line 96
Specifically:
outputs = self.model(inps, targets)
6. Visualizing the input images
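No notes were recorded for this item. One simple option (a sketch only, assuming the loader produces float BGR tensors of shape (N, 3, H, W) that are resized/padded but not mean-std normalized, as in standard YOLOX preprocessing) is to dump a batch back to disk:

import cv2
import numpy as np

def save_batch(inps, prefix='debug_input'):
    imgs = inps.detach().cpu().numpy()
    for i, img in enumerate(imgs):
        img = np.clip(img.transpose(1, 2, 0), 0, 255).astype(np.uint8)   # CHW -> HWC
        cv2.imwrite(f'{prefix}_{i}.jpg', img)

Calling save_batch(inps) right after the prefetcher yields a batch writes debug_input_0.jpg, debug_input_1.jpg, ... into the working directory.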
5. Where the logs are printed
1.
exps/train_utils/double_trainer.py --- 134
2023-01-12 12:28:37 | INFO | exps.train_utils.double_trainer:134 - args: Namespace(batch_size=4, cache=False, ckpt='/root/data/zjx/Code-subject/StreamYOLO-main/yolox_s.pth', devices=1, dist_backend='nccl', dist_url=None, exp_file='/root/data/zjx/Code-subject/StreamYOLO-main/cfgs/m_s50_onex_dfp_tal_flip.py', experiment_name='m_s50_onex_dfp_tal_flip', fp16=True, logger='tensorboard', machine_rank=0, name=None, num_machines=1, occupy=True, opts=[], resume=False, start_epoch=None)
2023-01-12 12:28:59 | INFO | exps.train_utils.double_trainer:135 - exp value:
╒═══════════════════╤════════════════════════════╕
│ keys │ values │
╞═══════════════════╪════════════════════════════╡
│ seed │ None │
├───────────────────┼────────────────────────────┤
│ output_dir │ '/data/output/stream_yolo' │
├───────────────────┼────────────────────────────┤
│ print_interval │ 10 │
├───────────────────┼────────────────────────────┤
│ eval_interval │ 1 │
├───────────────────┼────────────────────────────┤
│ num_classes │ 8 │
├───────────────────┼────────────────────────────┤
│ depth │ 0.67 │
├───────────────────┼────────────────────────────┤
│ width │ 0.75 │
├───────────────────┼────────────────────────────┤
│ act │ 'silu' │
├───────────────────┼────────────────────────────┤
│ data_num_workers │ 6 │
├───────────────────┼────────────────────────────┤
│ input_size │ (600, 960) │
├───────────────────┼────────────────────────────┤
│ multiscale_range │ 5 │
├───────────────────┼────────────────────────────┤
│ data_dir │ None │
├───────────────────┼────────────────────────────┤
│ train_ann │ 'train.json' │
├───────────────────┼────────────────────────────┤
│ val_ann │ 'val.json' │
├───────────────────┼────────────────────────────┤
│ test_ann │ 'instances_test2017.json' │
├───────────────────┼────────────────────────────┤
│ mosaic_prob │ 1.0 │
├───────────────────┼────────────────────────────┤
│ mixup_prob │ 1.0 │
├───────────────────┼────────────────────────────┤
│ hsv_prob │ 1.0 │
├───────────────────┼────────────────────────────┤
│ flip_prob │ 0.5 │
├───────────────────┼────────────────────────────┤
│ degrees │ 10.0 │
├───────────────────┼────────────────────────────┤
│ translate │ 0.1 │
├───────────────────┼────────────────────────────┤
│ mosaic_scale │ (0.1, 2) │
├───────────────────┼────────────────────────────┤
│ enable_mixup │ True │
├───────────────────┼────────────────────────────┤
│ mixup_scale │ (0.5, 1.5) │
├───────────────────┼────────────────────────────┤
│ shear │ 2.0 │
├───────────────────┼────────────────────────────┤
│ warmup_epochs │ 1 │
├───────────────────┼────────────────────────────┤
│ max_epoch │ 15 │
├───────────────────┼────────────────────────────┤
│ warmup_lr │ 0 │
├───────────────────┼────────────────────────────┤
│ min_lr_ratio │ 0.05 │
├───────────────────┼────────────────────────────┤
│ basic_lr_per_img │ 1.5625e-05 │
├───────────────────┼────────────────────────────┤
│ scheduler │ 'yoloxwarmcos' │
├───────────────────┼────────────────────────────┤
│ no_aug_epochs │ 15 │
├───────────────────┼────────────────────────────┤
│ ema │ True │
├───────────────────┼────────────────────────────┤
│ weight_decay │ 0.0005 │
├───────────────────┼────────────────────────────┤
│ momentum │ 0.9 │
├───────────────────┼────────────────────────────┤
│ save_history_ckpt │ True │
├───────────────────┼────────────────────────────┤
│ exp_name │ 'm_s50_onex_dfp_tal_flip' │
├───────────────────┼────────────────────────────┤
│ test_size │ (600, 960) │
├───────────────────┼────────────────────────────┤
│ test_conf │ 0.01 │
├───────────────────┼────────────────────────────┤
│ nmsthre │ 0.65 │
├───────────────────┼────────────────────────────┤
│ random_size │ (50, 70) │
╘═══════════════════╧════════════════════════════╛
2.
exps/train_utils/double_trainer.py --- 312
loading checkpoint for fine tuning
3.
exps/train_utils/double_trainer.py --- 315
Multiple WARNING log lines, for example:
2023-01-13 12:14:36 | WARNING | yolox.utils.checkpoint:26 - Shape of backbone.backbone.stem.conv.conv.weight in checkpoint is torch.Size([32, 12, 3, 3]), while shape of backbone.backbone.stem.conv.conv.weight in model is torch.Size([48, 12, 3, 3]).
4.
exps/dataset/tal_flip_one_future_argoversedataset.py --- 36
Printed from a class inside the library being used, triggered from cfgs/m_s50_onex_dfp_tal_flip.py --- 67:
exps.dataset.tal_flip_one_future_argoversedataset:36 - loading annotations into memory...
exps.dataset.tal_flip_one_future_argoversedataset:36 - Done (t=50.09s)
pycocotools.coco:86 - creating index...
pycocotools.coco:86 - index created!
5.
exps/train_utils/double_trainer.py --- 159
exps.train_utils.double_trainer:159 - init prefetcher, this might take one minute or less...
6.
exps/train_utils/double_trainer.py --- 195
exps.train_utils.double_trainer:195 - Training start...
followed by the model structure (shown in Section 6).
7. Start of epoch 1 (because no_aug_epochs equals max_epoch in this config, mosaic augmentation is disabled and the extra L1 loss is enabled from the very first epoch, which is what the lines below report):
exps/train_utils/double_trainer.py --- 207
exps.train_utils.double_trainer:207 - ---> start train epoch1
exps.train_utils.double_trainer:210 - --->No mosaic aug now!
exps.train_utils.double_trainer:212 - --->Add additional L1 loss now!
6. Model network structure
1. Network structure
The model is built by exp.get_model() and its weights are then loaded from the pretrained checkpoint; the printed structure follows (a short sketch of the Focus space-to-depth step, which explains the 12-channel input of the very first conv, comes after the dump).
YOLOX(
(backbone): DFPPAFPN(
(backbone): CSPDarknet(
(stem): Focus(
(conv): BaseConv(
(conv): Conv2d(12, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(dark2): Sequential(
(0): BaseConv(
(conv): Conv2d(48, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(96, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(48, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(48, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
(dark3): Sequential(
(0): BaseConv(
(conv): Conv2d(96, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(3): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(4): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(5): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
(dark4): Sequential(
(0): BaseConv(
(conv): Conv2d(192, 384, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(3): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(4): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(5): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
(dark5): Sequential(
(0): BaseConv(
(conv): Conv2d(384, 768, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): SPPBottleneck(
(conv1): BaseConv(
(conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): ModuleList(
(0): MaxPool2d(kernel_size=5, stride=1, padding=2, dilation=1, ceil_mode=False)
(1): MaxPool2d(kernel_size=9, stride=1, padding=4, dilation=1, ceil_mode=False)
(2): MaxPool2d(kernel_size=13, stride=1, padding=6, dilation=1, ceil_mode=False)
)
(conv2): BaseConv(
(conv): Conv2d(1536, 768, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(768, 768, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
)
)
(lateral_conv0): BaseConv(
(conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_p4): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
(reduce_conv1): BaseConv(
(conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_p3): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
(bu_conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_n3): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
(bu_conv1): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(C3_n4): CSPLayer(
(conv1): BaseConv(
(conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv3): BaseConv(
(conv): Conv2d(768, 768, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(m): Sequential(
(0): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Bottleneck(
(conv1): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(conv2): BaseConv(
(conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
)
(jian2): BaseConv(
(conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(jian1): BaseConv(
(conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(jian0): BaseConv(
(conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(head): TALHead(
(cls_convs): ModuleList(
(0): Sequential(
(0): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Sequential(
(0): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Sequential(
(0): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
(reg_convs): ModuleList(
(0): Sequential(
(0): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(1): Sequential(
(0): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(2): Sequential(
(0): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
)
(cls_preds): ModuleList(
(0): Conv2d(192, 8, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(192, 8, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(192, 8, kernel_size=(1, 1), stride=(1, 1))
)
(reg_preds): ModuleList(
(0): Conv2d(192, 4, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(192, 4, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(192, 4, kernel_size=(1, 1), stride=(1, 1))
)
(obj_preds): ModuleList(
(0): Conv2d(192, 1, kernel_size=(1, 1), stride=(1, 1))
(1): Conv2d(192, 1, kernel_size=(1, 1), stride=(1, 1))
(2): Conv2d(192, 1, kernel_size=(1, 1), stride=(1, 1))
)
(stems): ModuleList(
(0): BaseConv(
(conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(1): BaseConv(
(conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
(2): BaseConv(
(conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True)
(act): SiLU(inplace=True)
)
)
(l1_loss): L1Loss()
(bcewithlog_loss): BCEWithLogitsLoss()
(iou_loss): IOUloss()
)
)
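One detail worth noting: the first convolution takes 12 input channels even though the frames have 3, because the Focus stem rearranges each 2×2 pixel neighborhood into the channel dimension before convolving. A minimal sketch of that space-to-depth step:

import torch

def focus_rearrange(x):
    # (N, 3, H, W) -> (N, 12, H/2, W/2): stack the four pixels of each 2x2 block as channels
    return torch.cat(
        (x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]),
        dim=1,
    )

x = torch.randn(1, 3, 600, 960)      # the exp's input_size
print(focus_rearrange(x).shape)      # torch.Size([1, 12, 300, 480])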
2. Contents of exp
╒═══════════════════╤═════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│ keys │ values │
╞═══════════════════╪═════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ seed │ None │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ output_dir │ '/data/output/stream_yolo' │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ print_interval │ 10 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ eval_interval │ 1 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ num_classes │ 8 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ depth │ 0.67 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ width │ 0.75 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ act │ 'silu' │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ data_num_workers │ 6 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ input_size │ (600, 960) │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ multiscale_range │ 5 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ data_dir │ None │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ train_ann │ 'train.json' │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ val_ann │ 'val.json' │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ test_ann │ 'instances_test2017.json' │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ mosaic_prob │ 1.0 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ mixup_prob │ 1.0 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ hsv_prob │ 1.0 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ flip_prob │ 0.5 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ degrees │ 10.0 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ translate │ 0.1 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ mosaic_scale │ (0.1, 2) │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ enable_mixup │ True │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ mixup_scale │ (0.5, 1.5) │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ shear │ 2.0 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ warmup_epochs │ 1 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ max_epoch │ 15 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ warmup_lr │ 0 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ min_lr_ratio │ 0.05 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ basic_lr_per_img │ 1.5625e-05 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ scheduler │ 'yoloxwarmcos' │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ no_aug_epochs │ 15 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ ema │ True │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ weight_decay │ 0.0005 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ momentum │ 0.9 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ save_history_ckpt │ True │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ exp_name │ 'm_s50_onex_dfp_tal_flip' │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ test_size │ (600, 960) │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ test_conf │ 0.01 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ nmsthre │ 0.65 │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ random_size │ (50, 70) │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ model │ YOLOX( │
│ │ (backbone): DFPPAFPN( │
│ │ (backbone): CSPDarknet( │
│ │ (stem): Focus( │
│ │ (conv): BaseConv( │
│ │ (conv): Conv2d(12, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (dark2): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(48, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 48, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 48, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(48, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(48, 48, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(48, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(48, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (dark3): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(96, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (2): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (3): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (4): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (5): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (dark4): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 384, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (2): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (3): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (4): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (5): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (dark5): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(384, 768, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): SPPBottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): ModuleList( │
│ │ (0): MaxPool2d(kernel_size=5, stride=1, padding=2, dilation=1, ceil_mode=False) │
│ │ (1): MaxPool2d(kernel_size=9, stride=1, padding=4, dilation=1, ceil_mode=False) │
│ │ (2): MaxPool2d(kernel_size=13, stride=1, padding=6, dilation=1, ceil_mode=False) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(1536, 768, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (2): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(768, 768, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (lateral_conv0): BaseConv( │
│ │ (conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (C3_p4): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (reduce_conv1): BaseConv( │
│ │ (conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (C3_p3): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (bu_conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (C3_n3): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (bu_conv1): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (C3_n4): CSPLayer( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv3): BaseConv( │
│ │ (conv): Conv2d(768, 768, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(768, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (m): Sequential( │
│ │ (0): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Bottleneck( │
│ │ (conv1): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (conv2): BaseConv( │
│ │ (conv): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (jian2): BaseConv( │
│ │ (conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(96, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (jian1): BaseConv( │
│ │ (conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (jian0): BaseConv( │
│ │ (conv): Conv2d(768, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(384, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (head): TALHead( │
│ │ (cls_convs): ModuleList( │
│ │ (0): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (2): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (reg_convs): ModuleList( │
│ │ (0): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (1): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (2): Sequential( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ ) │
│ │ (cls_preds): ModuleList( │
│ │ (0): Conv2d(192, 8, kernel_size=(1, 1), stride=(1, 1)) │
│ │ (1): Conv2d(192, 8, kernel_size=(1, 1), stride=(1, 1)) │
│ │ (2): Conv2d(192, 8, kernel_size=(1, 1), stride=(1, 1)) │
│ │ ) │
│ │ (reg_preds): ModuleList( │
│ │ (0): Conv2d(192, 4, kernel_size=(1, 1), stride=(1, 1)) │
│ │ (1): Conv2d(192, 4, kernel_size=(1, 1), stride=(1, 1)) │
│ │ (2): Conv2d(192, 4, kernel_size=(1, 1), stride=(1, 1)) │
│ │ ) │
│ │ (obj_preds): ModuleList( │
│ │ (0): Conv2d(192, 1, kernel_size=(1, 1), stride=(1, 1)) │
│ │ (1): Conv2d(192, 1, kernel_size=(1, 1), stride=(1, 1)) │
│ │ (2): Conv2d(192, 1, kernel_size=(1, 1), stride=(1, 1)) │
│ │ ) │
│ │ (stems): ModuleList( │
│ │ (0): BaseConv( │
│ │ (conv): Conv2d(192, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (1): BaseConv( │
│ │ (conv): Conv2d(384, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ (2): BaseConv( │
│ │ (conv): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) │
│ │ (bn): BatchNorm2d(192, eps=0.001, momentum=0.03, affine=True, track_running_stats=True) │
│ │ (act): SiLU(inplace=True) │
│ │ ) │
│ │ ) │
│ │ (l1_loss): L1Loss() │
│ │ (bcewithlog_loss): BCEWithLogitsLoss() │
│ │ (iou_loss): IOUloss() │
│ │ ) │
│ │ ) │
├───────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ optimizer │ SGD ( │
│ │ Parameter Group 0 │
│ │ dampening: 0 │
│ │ lr: 0 │
│ │ momentum: 0.9 │
│ │ nesterov: True │
│ │ weight_decay: 0 │
│ │ │
│ │ Parameter Group 1 │
│ │ dampening: 0 │
│ │ lr: 0 │
│ │ momentum: 0.9 │
│ │ nesterov: True │
│ │ weight_decay: 0.0005 │
│ │ │
│ │ Parameter Group 2 │
│ │ dampening: 0 │
│ │ lr: 0 │
│ │ momentum: 0.9 │
│ │ nesterov: True │
│ │ weight_decay: 0 │
│ │ ) │
╘═══════════════════╧═════════════════════════════════════════════════════════════════════════════════════════════════════════╛
六、The main training procedure of train.py
The whole training procedure is implemented in the Trainer class in exps/train_utils/double_trainer.py.
1、self.before_train()
Pre-training preparation: building the dataloader and the optimizer, loading the model, enabling PyTorch training accelerations, and so on; a hedged sketch of the typical setup follows.
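A minimal sketch of the kind of setup before_train() performs, following the YOLOX-style Trainer that double_trainer.py is adapted from; the attribute and method names here are assumptions, not the verbatim StreamYOLO code:
class Trainer:
    def before_train(self):
        # optimizer with the three parameter groups shown in the table above
        self.optimizer = self.exp.get_optimizer(self.args.batch_size)
        # build the model and load the pretrained checkpoint passed with -c (or resume)
        model = self.exp.get_model()
        model = self.resume_train(model)
        # streaming dataloader that yields (frame t, frame t+1) pairs with their labels
        self.train_loader = self.exp.get_data_loader(
            batch_size=self.args.batch_size,
            is_distributed=self.is_distributed,
            no_aug=False,
        )
        self.max_iter = len(self.train_loader)
        self.lr_scheduler = self.exp.get_lr_scheduler(
            self.exp.basic_lr_per_img * self.args.batch_size, self.max_iter
        )
        self.model = model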
2、self.train_in_epoch
1) self.before_epoch
Logs some information about the current epoch and applies a few per-epoch settings, such as attributes of the train_loader and whether a particular head of the model is enabled.
2) self.train_in_iter
The initial self.before_iter() does essentially nothing; the real work happens in self.train_one_iter(). When the following statement is executed,
outputs = self.model(inps, targets)
control jumps to the YOLOX class in exps/model/yolox.py and runs its forward pass.
When the backbone is executed,
fpn_outs = self.backbone(x, buffer=buffer, mode='off_pipe')
control jumps to the forward function of the DFPPAFPN class in exps/model/dfp_pafpn.py; the actual flow is defined in that class's off_forward function. The forward pipeline is:
self.backbone -----> CSPDarknet, which contains stem, dark2, dark3, dark4 and dark5
The remaining parts can be read off the network structure printed above.
The main flow is shown in the figure below.
The final output is
outputs = (pan_out2, pan_out1, pan_out0)
which contains the outputs of the three FPN levels. These are then fed to the head to produce the network predictions, after which the loss is computed. A hedged reading of how the two frames' features are fused at each level is sketched below.
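Sketch of the dual-frame aggregation suggested by the jian0/jian1/jian2 convs in the structure above: each 1x1 conv halves the channels of one frame's FPN feature, so concatenating the current frame with the buffered previous frame restores the original width (96+96=192, 192+192=384, 384+384=768). This is a reading of the printed structure, not the verbatim off_forward code:
import torch

def fuse_two_frames(jian, feat_t, feat_tm1):
    # reduce both frames' features with the shared 1x1 conv, then concatenate along channels
    return torch.cat([jian(feat_t), jian(feat_tm1)], dim=1)

jian2 = torch.nn.Conv2d(192, 96, kernel_size=1, bias=False)   # stands in for the jian2 BaseConv above
p3_t = torch.randn(1, 192, 76, 136)      # current-frame P3 feature (spatial size is arbitrary)
p3_tm1 = torch.randn(1, 192, 76, 136)    # buffered previous-frame P3 feature
pan_out2 = fuse_two_frames(jian2, p3_t, p3_tm1)   # shape (1, 192, 76, 136), fed to the head stem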
When self.head is executed,
loss, iou_loss, conf_loss, cls_loss, l1_loss, num_fg = self.head(
fpn_outs, targets, x
)
control jumps to the forward function of TALHead in exps/model/tal_head.py. The three FPN levels are processed one by one to obtain the prediction outputs; at each level the reg, obj (an objectness branch, similar to a centerness/quality score) and cls outputs are concatenated together:
output = torch.cat([reg_output, obj_output, cls_output], 1)
Grid coordinates are then built from these maps, and the loss is computed in the class's get_loss function. A hedged sketch of the grid construction and decoding follows.
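The grid construction follows the standard YOLOX decoding, which the TAL head presumably mirrors; a minimal sketch (names are illustrative, not the verbatim TALHead code):
import torch

def decode_level(output, stride):
    # output: (B, 5 + num_classes, H, W), the raw map from cat([reg, obj, cls], 1)
    b, ch, h, w = output.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w))
    grid = torch.stack((xs, ys), dim=2).view(1, -1, 2).type_as(output)   # (1, H*W, 2) cell coordinates
    out = output.permute(0, 2, 3, 1).reshape(b, h * w, ch)
    decoded = out.clone()
    decoded[..., :2] = (out[..., :2] + grid) * stride        # box center x, y in pixels
    decoded[..., 2:4] = torch.exp(out[..., 2:4]) * stride    # box width, height in pixels
    return decoded, grid

preds, grid = decode_level(torch.randn(4, 5 + 8, 76, 136), stride=8)   # 8 classes, stride-8 level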
The raw labels targets coming from the dataloader have the shape
{tuple:2} 0=Tensor:(4,120,5) 1=Tensor:(4,120,5)
The tuple has length 2 because the loss needs the ground truth of both the current frame t and the next frame t+1, and element 0 holds the gt of the next frame. Of the 5 channels, channel 0 is the class label and channels 1~4 are the bbox label. The 120 is most likely the fixed maximum number of padded labels per image (a max_labels-style setting in the data transform) so that label tensors of different images can be batched; see the sketch below.
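A hedged sketch of how such a fixed-size padded label tensor of shape (B, 120, 5) is typically consumed (YOLOX convention; that 120 equals the data transform's max_labels is an assumption, not verified in this repo):
import torch

labels = torch.zeros(4, 120, 5)            # (batch, max_labels, [class, cx, cy, w, h])
labels[0, :3] = torch.rand(3, 5) + 0.1     # pretend image 0 has 3 real objects
nlabel = (labels.sum(dim=2) > 0).sum(dim=1)   # rows that are all zero are padding
print(nlabel)                                 # tensor([3, 0, 0, 0])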
Positive samples are then determined in the self.get_assignments function.
fg_mask, is_in_boxes_and_center = self.get_in_boxes_info(
gt_bboxes_per_image,
expanded_strides,
x_shifts,
y_shifts,
total_num_anchors,
num_gt,
) # fg_mask is a bool mask marking the candidate positive locations; is_in_boxes_and_center is also bool and requires both geometric conditions w.r.t. the gt bbox to hold
From the grid coordinates and the labels, the positive-sample region is determined as a mask; only the predictions inside the masked region are taken to compute the loss, and the labels are cropped with the same mask. A hedged sketch of the two geometric tests is given below.
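A compact sketch of the two geometric tests behind fg_mask and is_in_boxes_and_center, following the YOLOX implementation that the TAL head is derived from; tensor names are illustrative:
import torch

def in_boxes_info(gt_xywh, centers, strides, radius=2.5):
    # gt_xywh: (num_gt, 4) cx, cy, w, h; centers: (A, 2) anchor-point centers in pixels; strides: (A,)
    cx, cy = centers[:, 0], centers[:, 1]
    gx, gy, gw, gh = gt_xywh[:, 0:1], gt_xywh[:, 1:2], gt_xywh[:, 2:3], gt_xywh[:, 3:4]

    # test 1: the anchor center falls inside the gt box
    in_box = ((cx > gx - gw / 2) & (cx < gx + gw / 2) &
              (cy > gy - gh / 2) & (cy < gy + gh / 2))            # (num_gt, A)

    # test 2: the anchor center falls inside a (radius * stride) square around the gt center
    in_ctr = ((cx > gx - radius * strides) & (cx < gx + radius * strides) &
              (cy > gy - radius * strides) & (cy < gy + radius * strides))

    fg_mask = in_box.any(dim=0) | in_ctr.any(dim=0)               # candidate positives, (A,)
    in_boxes_and_center = in_box[:, fg_mask] & in_ctr[:, fg_mask] # both tests hold, per gt
    return fg_mask, in_boxes_and_center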
Next, a cost matrix is computed:
cost = (
pair_wise_cls_loss
+ 3.0 * pair_wise_ious_loss
+ 100000.0 * (~is_in_boxes_and_center)
)
This cost is an intermediate quantity: it combines the pairwise classification loss and IoU loss between predictions and ground-truth boxes, and it is used purely to select the final positive-sample locations. It is consumed by self.dynamic_k_matching, whose processing goes as follows (see the sketch after this list):
1)For each ground-truth object, take the 10 largest of its computed IoUs, giving topk_ious;
2)From these IoUs derive how many anchors to keep for each object (at least 1) — in the YOLOX implementation this is the sum of the top-10 IoUs truncated to an integer — giving dynamic_ks;
3)For each object, take from the cost matrix the indices of its dynamic_ks[gt_idx] lowest-cost entries (along the second, anchor dimension of cost), giving pos_idx;
4)Set the corresponding positions of matching_matrix to 1 and all other positions to 0, i.e. a mask;
5)Count how many objects each anchor cell (the second dimension, one entry per position of the score map) is matched to, anchor_matching_gt; if an anchor is matched to more than one object, keep only the object with the smallest cost;
6)Sum matching_matrix (each anchor now matches at most one object) over the first dimension; positions with a nonzero sum become True, giving the positive-sample mask fg_mask_inboxes;
7)From the mask fg_mask_inboxes, compute the number of positive samples num_fg;
8)matched_gt_inds: the object index (along the first dimension) assigned to each positive anchor;
9)gt_matched_classes: the class label corresponding to each positive anchor;
10)pred_ious_this_matching: the IoUs of the finally kept positive samples.
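A sketch of dynamic_k_matching following the YOLOX SimOTA code that the list above describes; the TAL head presumably reuses this logic, but treat the exact signature as an assumption:
import torch

def dynamic_k_matching(cost, pair_wise_ious, gt_classes, num_gt, fg_mask):
    # cost / pair_wise_ious: (num_gt, A_candidate); fg_mask: (A_total,) bool, modified in place
    matching_matrix = torch.zeros_like(cost)

    # 1)-2) per-gt dynamic k = clamp(sum of top-10 IoUs, min=1)
    n_candidate_k = min(10, pair_wise_ious.size(1))
    topk_ious, _ = torch.topk(pair_wise_ious, n_candidate_k, dim=1)
    dynamic_ks = torch.clamp(topk_ious.sum(1).int(), min=1)

    # 3)-4) keep the dynamic_ks lowest-cost anchors for each gt
    for gt_idx in range(num_gt):
        _, pos_idx = torch.topk(cost[gt_idx], k=dynamic_ks[gt_idx].item(), largest=False)
        matching_matrix[gt_idx][pos_idx] = 1.0

    # 5) an anchor assigned to several gts keeps only the cheapest one
    anchor_matching_gt = matching_matrix.sum(0)
    if (anchor_matching_gt > 1).sum() > 0:
        _, cost_argmin = torch.min(cost[:, anchor_matching_gt > 1], dim=0)
        matching_matrix[:, anchor_matching_gt > 1] = 0.0
        matching_matrix[cost_argmin, anchor_matching_gt > 1] = 1.0

    # 6)-7) final positive mask and count
    fg_mask_inboxes = matching_matrix.sum(0) > 0.0
    num_fg = fg_mask_inboxes.sum().item()
    fg_mask[fg_mask.clone()] = fg_mask_inboxes   # shrink the candidate mask in place

    # 8)-10) which gt, which class, and the matched IoU for every positive anchor
    matched_gt_inds = matching_matrix[:, fg_mask_inboxes].argmax(0)
    gt_matched_classes = gt_classes[matched_gt_inds]
    pred_ious_this_matching = (matching_matrix * pair_wise_ious).sum(0)[fg_mask_inboxes]
    return num_fg, gt_matched_classes, pred_ious_this_matching, matched_gt_inds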
These form the return values of self.get_assignments. The training targets are then built:
cls_target = F.one_hot(
gt_matched_classes.to(torch.int64), self.num_classes
) * pred_ious_this_matching.unsqueeze(-1) # Tensor:(24,8) one-hot encoded class target, scaled by the matched IoU
obj_target = fg_mask.unsqueeze(-1) # Tensor:(11850,1) objectness target
reg_target = gt_bboxes_per_image[matched_gt_inds] # Tensor:(24,4) regression target
An L1 loss is additionally applied to the network's raw regression output, for which an L1 regression target is built:
if self.use_l1: # True: an extra L1 loss is computed between the raw network output and its target
l1_target = self.get_l1_target(
outputs.new_zeros((num_fg_img, 4)),
gt_bboxes_per_image[matched_gt_inds],
expanded_strides[0][fg_mask],
x_shifts=x_shifts[0][fg_mask],
y_shifts=y_shifts[0][fg_mask],
) # Tensor:(23,4) regression target used by the L1 loss
The regression loss is computed as follows:
First the loss weights are computed. These weights do not depend on the network predictions at all; they depend only on the gt of frame T and frame T+1 and on the number of positive samples.
Regression (IoU) loss:
loss_iou = (
iou_loss_weight * self.iou_loss(bbox_preds.view(-1, 4)[fg_masks], reg_targets)
).sum() / num_fg
Objectness loss (similar to a centerness/quality score):
loss_obj = (
self.bcewithlog_loss(obj_preds.view(-1, 1), obj_targets)
).sum() / num_fg
Classification loss:
loss_cls = (
self.bcewithlog_loss(
cls_preds.view(-1, self.num_classes)[fg_masks], cls_targets
)
).sum() / num_fg
The additional L1 loss:
if self.use_l1: # True
loss_l1 = (
l1_weight * self.l1_loss(origin_preds.view(-1, 4)[fg_masks], l1_targets)
).sum() / num_fg
Each of these losses is a scalar value.
The final total loss is
loss = reg_weight * loss_iou + loss_obj + loss_cls + loss_l1
The result is then returned, in turn, to
# yolox.py --- 36
loss, iou_loss, conf_loss, cls_loss, l1_loss, num_fg = self.head(
fpn_outs, targets, x
)
# double_trainer.py --- 124
outputs = self.model(inps, targets)
after which a series of update operations is performed, including zeroing the gradients, updating the learning rate, and so on. This completes self.train_one_iter(); a hedged sketch of a typical update step follows.
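A sketch of the per-iteration update referred to above, in the YOLOX mixed-precision style; the self.* names are illustrative assumptions, not the verbatim double_trainer.py code:
import torch

class Trainer:
    def train_one_iter(self):
        inps, targets = self.prefetcher.next()
        with torch.cuda.amp.autocast(enabled=self.amp_training):   # enabled by --fp16
            outputs = self.model(inps, targets)
        loss = outputs["total_loss"]

        self.optimizer.zero_grad()              # clear the old gradients
        self.scaler.scale(loss).backward()      # scaled backward pass for fp16
        self.scaler.step(self.optimizer)
        self.scaler.update()

        lr = self.lr_scheduler.update_lr(self.progress_in_iter + 1)   # warmup / cosine schedule
        for param_group in self.optimizer.param_groups:
            param_group["lr"] = lr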
self.after_iter
At the configured iteration interval, the size of the input data is changed (multi-scale training). train_in_iter then keeps repeating these steps.
3) self.after_epoch
Saves the model of the current epoch.
3、self.after_train
Prints the final logs.