Swin
Environment setup
Open the Anaconda Powershell Prompt!!!
Target environment: GTX 1660 Ti, CUDA 11.0, PyTorch 1.7.1
Note: run nvcc -V in cmd first to confirm the toolkit really is CUDA 11
Create the environment
conda create -n swtf python=3.8
Activate it
conda activate swtf
Install PyTorch
# 1660Ti
conda install pytorch=1.7.1 torchvision cudatoolkit=11.0 -c pytorch
# 3060
conda install pytorch=1.8.1 torchvision cudatoolkit=11.1 -c pytorch
Alternative: install PyTorch from a wheel (cd to the wheel's location first)
#pip install torch-1.7.0+cu110-cp38-cp38-win_amd64.whl
Install git to make downloading code easier
conda install git
Install mmcv
Make a working folder, then cd to it in the Anaconda Powershell Prompt before downloading
cd F:\Gpytorch\swintransformer
Download mmcv
git clone -b v1.3.1 https://github.com/open-mmlab/mmcv.git
Note: git prints "Turn off this advice by setting config variable advice.detachedHead to false". I didn't understand it at the time, but it is just git's advice about checking out a tag in detached-HEAD state, and the clone did succeed.
cd mmcv
pip install -r requirements.txt
(1) Find where cl.exe lives. Add its directory to the Path environment variable, e.g.
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\bin\Hostx64\x64
so that cl.exe can be run from a command window.
Note: the exact cl.exe path depends on your machine; for 64-bit builds pick the Hostx64\x64 host compiler.
Concretely:
Add the path above to the Path variable and move it to the very top.
Run the take-effect-immediately command set Path=c
Run cl to verify.
(2) Look up your GPU's compute capability at https://developer.nvidia.cn/zh-cn/cuda-gpus
The 1660 Ti is 7.5; the 3060 is 8.6.
Run the following in the PowerShell window (still inside the mmcv folder):
$env:TORCH_CUDA_ARCH_LIST="7.5"
$env:MMCV_WITH_OPS = 1
$env:MAX_JOBS = 4
# compile
python setup.py build_ext
# install
python setup.py develop
# check that it installed
pip show mmcv
bug1: RuntimeError: Error compiling objects for extension
Scrolling to the top of the error output, the underlying cause was:
subprocess.CalledProcessError: Command '['ninja', '-v', '-j', '4']' returned non-zero exit status 1.
Never truly solved... wiping the environment and redoing it worked. Maybe the CUDA_PATH environment variable needed changing?
Install mmdetection
Download mmdetection
git clone -b v2.11.0 https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e . # or "python setup.py develop"
Install apex
Download apex
git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install
Install swin-transformer
Download swin
# manual download URL:
# https://github.com/SwinTransformer/Swin-Transformer-Object-Detection
git clone https://github.com/SwinTransformer/Swin-Transformer-Object-Detection.git
cd Swin-Transformer-Object-Detection
python setup.py develop
Download the pretrained weights
# get mask_rcnn_swin_tiny_patch4_window7_1x.pth
Test the demo
# the command below is ONE line!!!!
python demo/image_demo.py demo/demo.jpg
configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py
mask_rcnn_swin_tiny_patch4_window7_1x.pth
Build the dataset
Install labelme 3.16.5; this version has fewer quirks
pip install labelme==3.16.5
Type labelme to open it, then annotate; search online for the details
Convert to COCO dataset format
Search online for the how-to; when done, the directory structure looks like:
└── data
    └── coco
        ├── annotations
        │   ├── instances_train2017.json
        │   └── instances_val2017.json
        ├── train2017
        └── val2017
Put data under the swin-transformer root directory.
Train swin-transformer
Create a python file named changemaskrcnn.py and put it in the root directory:
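A quick stdlib-only sanity check of that layout can be sketched like this (the temp directory and the category name 'pest' are just illustrative; point root at your real data/coco instead):

```python
import json
import tempfile
from pathlib import Path

# Build the expected data/coco skeleton in a temp dir, then verify it.
root = Path(tempfile.mkdtemp()) / 'data' / 'coco'
(root / 'annotations').mkdir(parents=True)
(root / 'train2017').mkdir()
(root / 'val2017').mkdir()
minimal = {'images': [], 'annotations': [],
           'categories': [{'id': 1, 'name': 'pest'}]}  # bare-bones COCO json
for split in ('train2017', 'val2017'):
    (root / 'annotations' / f'instances_{split}.json').write_text(json.dumps(minimal))

required = ['annotations/instances_train2017.json',
            'annotations/instances_val2017.json', 'train2017', 'val2017']
missing = [p for p in required if not (root / p).exists()]
print(missing)  # [] means the layout matches the tree above
```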
import torch
pretrained_weights = torch.load('mask_rcnn_swin_tiny_patch4_window7_1x.pth')  # pretrained checkpoint
num_class = 5  # number of classes
# bbox classification head: num_class + 1 (background)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_cls.weight'].resize_(num_class+1, 1024)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_cls.bias'].resize_(num_class+1)
# bbox regression head: 4 box deltas per class
pretrained_weights['state_dict']['roi_head.bbox_head.fc_reg.weight'].resize_(num_class*4, 1024)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_reg.bias'].resize_(num_class*4)
# mask head: one output channel per class
pretrained_weights['state_dict']['roi_head.mask_head.conv_logits.weight'].resize_(num_class, 256, 1, 1)
pretrained_weights['state_dict']['roi_head.mask_head.conv_logits.bias'].resize_(num_class)
torch.save(pretrained_weights, "mask_rcnn_swin_%d.pth" % num_class)
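As a sanity check, these are the shapes the resize_ calls above are aiming for, per the usual mmdet Mask R-CNN head conventions (class-specific box regression, 1024-d shared FC); verify them against your own checkpoint before training:

```python
num_class = 5
# Target parameter shapes after the in-place resize_ calls
expected_shapes = {
    'roi_head.bbox_head.fc_cls.weight': (num_class + 1, 1024),   # +1 background
    'roi_head.bbox_head.fc_cls.bias': (num_class + 1,),
    'roi_head.bbox_head.fc_reg.weight': (num_class * 4, 1024),   # 4 deltas/class
    'roi_head.bbox_head.fc_reg.bias': (num_class * 4,),
    'roi_head.mask_head.conv_logits.weight': (num_class, 256, 1, 1),
    'roi_head.mask_head.conv_logits.bias': (num_class,),
}
for name, shape in expected_shapes.items():
    print(name, shape)
```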
Modify the config files
1) Set num_class in changemaskrcnn.py and run it to produce the new weight file,
named mask_rcnn_swin_<num_classes>.pth
2) In configs\_base_\models\mask_rcnn_swin_fpn.py, change num_classes (two occurrences)
to your class count
3) In configs\_base_\default_runtime.py, change interval and load_from
The first interval → save a checkpoint every N epochs; the second interval → print a log line every N iterations
load_from → "mask_rcnn_swin_<num_classes>.pth"
4) In configs\swin\mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py, change max_epochs and lr_config
max_epochs → maximum number of epochs; lr_config → learning-rate schedule
5) In configs\_base_\datasets\coco_instance.py, change
samples_per_gpu=1, # batch size per GPU
workers_per_gpu=0, # dataloader worker processes per GPU (not the GPU count)
6) In mmdet\datasets\coco.py, change CLASSES
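Pulled together, the edits from steps 2) to 5) look roughly like this. This is a sketch from memory of the v2.11-era configs, not a drop-in file; key names such as EpochBasedRunnerAmp and the exact epoch numbers may differ in your copies, so check each file:

```python
# configs/_base_/models/mask_rcnn_swin_fpn.py -- both num_classes occurrences
num_classes = 5                              # your class count

# configs/_base_/default_runtime.py
checkpoint_config = dict(interval=1)         # save a checkpoint every N epochs
log_config = dict(interval=50,               # print a log line every N iterations
                  hooks=[dict(type='TextLoggerHook')])
load_from = 'mask_rcnn_swin_5.pth'           # file produced by changemaskrcnn.py

# configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py
runner = dict(type='EpochBasedRunnerAmp', max_epochs=36)
lr_config = dict(policy='step', step=[27, 33])   # learning-rate schedule

# configs/_base_/datasets/coco_instance.py
data = dict(samples_per_gpu=1,   # batch size per GPU
            workers_per_gpu=0)   # dataloader workers per GPU
```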
bug2: loss goes NaN
Try commenting out use_fp16=True in the config from step 4); it is near the last line.
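For reference, the tail of that swin config looks roughly like this in the repo (sketched from memory; the exact DistOptimizerHook arguments may differ in your copy), with the NaN workaround applied:

```python
# end of configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py
optimizer_config = dict(
    type="DistOptimizerHook",
    update_interval=1,
    grad_clip=None,
    coalesce=True,
    bucket_cap_mb=-1,
    # use_fp16=True,   # commented out: fp16 was producing NaN losses here
)
```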
Training command
python tools/train.py
configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py
bug3 → the biggest bug ever!!!
AssertionError: The num_classes (1) in Shared2FCBBoxHead of MMDataParallel...
Add a comma after the last class name!!! For example:
CLASSES = ('pest',)
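The trailing comma matters because in Python parentheses alone do not make a tuple, and mmdet derives the class count from CLASSES, so a bare string breaks it. A quick demonstration:

```python
not_a_tuple = ('pest')    # just a parenthesized string
one_tuple = ('pest',)     # the trailing comma makes a 1-element tuple

print(type(not_a_tuple).__name__)  # str
print(type(one_tuple).__name__)    # tuple
print(len(not_a_tuple))            # 4, the length of the string 'pest'!
print(len(one_tuple))              # 1
```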
bug3.1 → biggest bug no. 2!!
# I hit the following error while working on mask_rcnn101
File "/content/Swin-Transformer-Object-Detection/mmdet/datasets/coco.py", line 267, in _segm2json
if isinstance(segms[i]['counts'], bytes):
IndexError: list index out of range
Fix:
Path: …/mmdet/models/roi_heads/test_mixins.py
test_mixins.py was modified; the original code was kept in test_mixins0.py
Around line 118, change:
# supplement_mask = rois[..., -1] == 0
supplement_mask = rois.abs()[..., 1:].sum(dim=-1) == 0
Around line 141, change:
# supplement_mask = proposals[i][..., -1] == 0
supplement_mask = rois.abs()[..., 1:].sum(dim=-1) == 0
Around line 305, change:
# supplement_mask = det_bbox[..., -1] != 0
supplement_mask = det_bbox.abs().sum(dim=-1) != 0
bug4 → checkpoint will not save
Open checkpoint.py and comment out the line that errors
bug5 → validation runs out of GPU memory before finishing
This happens because evaluation runs after every epoch by default
Last line of configs/_base_/datasets/coco_instance.py:
evaluation = dict(metric=['bbox', 'segm'], interval=50)  # evaluate only every 50 epochs
Test command
python demo/image_demo.py (图片路径) configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco/latest.pth
bug6 → GPU memory blows up at test time → RuntimeError: CUDA out of memory
My 6 GB of VRAM genuinely is not enough, so run the test on the CPU instead
python demo/image_demo.py (图片路径) configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco/latest.pth --device cpu
Performance evaluation
python tools/test.py
configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco/latest.pth
--eval bbox segm
bug7 → GPU memory blows up during eval → RuntimeError: CUDA out of memory
https://blog.csdn.net/weixin_42362903/article/details/122750216 offers a solution
Following that idea, I moved the evaluation onto the CPU
Find the file in question and make the change
# when evaluating yolact
# a .cpu() was also added in mmdet/models/dense_heads/yolact_head.py, around line 854
mask_pred = F.interpolate(
mask_pred.unsqueeze(0).cpu(), (img_h, img_w),  # added .cpu()
mode='bilinear',
align_corners=False).squeeze(0) > 0.5
mask_pred = mask_pred.cpu().numpy().astype(np.uint8)
Evaluate at IoU 0.5
# mmdet/datasets/coco.py
Ctrl+F for the evaluate function and change iou_thrs=None → iou_thrs=[0.5]
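Why this works: when iou_thrs is None, mmdet falls back to the standard COCO sweep of ten thresholds, which (per the COCO convention) is built like this; passing [0.5] replaces the whole sweep with a single threshold:

```python
import numpy as np

# COCO's default IoU thresholds: 0.50 to 0.95 in steps of 0.05 (10 values)
iou_thrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1)
print(iou_thrs)        # 0.5, 0.55, ..., 0.95
print(len(iou_thrs))   # 10
```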
Log analysis
python tools/analysis_tools/analyze_logs.py plot_curve work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco/(your own log file, e.g. 20211226_093405.log.json)
bug8 → ModuleNotFoundError: No module named 'seaborn'
Just install it:
conda install seaborn
If it still misbehaves, write your own script; see this write-up:
mmdet-tools工具测试_知识在于分享的博客-CSDN博客
https://blog.csdn.net/baidu_40840693/article/details/120326501
Code:
# analyze_logs.py
import argparse
import json
from collections import defaultdict

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def cal_train_time(log_dicts, args):
    for i, log_dict in enumerate(log_dicts):
        print(f'{"-" * 5}Analyze train time of {args.json_logs[i]}{"-" * 5}')
        all_times = []
        for epoch in log_dict.keys():
            if args.include_outliers:
                all_times.append(log_dict[epoch]['time'])
            else:
                all_times.append(log_dict[epoch]['time'][1:])
        all_times = np.array(all_times)
        epoch_ave_time = all_times.mean(-1)
        slowest_epoch = epoch_ave_time.argmax()
        fastest_epoch = epoch_ave_time.argmin()
        std_over_epoch = epoch_ave_time.std()
        print(f'slowest epoch {slowest_epoch + 1}, '
              f'average time is {epoch_ave_time[slowest_epoch]:.4f}')
        print(f'fastest epoch {fastest_epoch + 1}, '
              f'average time is {epoch_ave_time[fastest_epoch]:.4f}')
        print(f'time std over epochs is {std_over_epoch:.4f}')
        print(f'average iter time: {np.mean(all_times):.4f} s/iter')
        print()


def plot_curve(log_dicts, args):
    if args.backend is not None:
        plt.switch_backend(args.backend)
    sns.set_style(args.style)
    # if legend is None, use {filename}_{key} as legend
    legend = args.legend
    if legend is None:
        legend = []
        for json_log in args.json_logs:
            for metric in args.keys:
                legend.append(f'{json_log}_{metric}')
    assert len(legend) == (len(args.json_logs) * len(args.keys))
    metrics = args.keys
    num_metrics = len(metrics)
    for i, log_dict in enumerate(log_dicts):
        epochs = list(log_dict.keys())
        for j, metric in enumerate(metrics):
            print(f'plot curve of {args.json_logs[i]}, metric is {metric}')
            if metric not in log_dict[epochs[0]]:
                raise KeyError(
                    f'{args.json_logs[i]} does not contain metric {metric}')
            if 'mAP' in metric:
                xs = np.arange(1, max(epochs) + 1)
                ys = []
                for epoch in epochs:
                    ys += log_dict[epoch][metric]
                ax = plt.gca()
                ax.set_xticks(xs)
                plt.xlabel('epoch')
                plt.plot(xs, ys, label=legend[i * num_metrics + j], marker='o')
            else:
                xs = []
                ys = []
                num_iters_per_epoch = log_dict[epochs[0]]['iter'][-1]
                for epoch in epochs:
                    iters = log_dict[epoch]['iter']
                    if log_dict[epoch]['mode'][-1] == 'val':
                        iters = iters[:-1]
                    xs.append(
                        np.array(iters) + (epoch - 1) * num_iters_per_epoch)
                    ys.append(np.array(log_dict[epoch][metric][:len(iters)]))
                xs = np.concatenate(xs)
                ys = np.concatenate(ys)
                plt.xlabel('iter')
                plt.plot(
                    xs, ys, label=legend[i * num_metrics + j], linewidth=0.5)
            plt.legend()
        if args.title is not None:
            plt.title(args.title)
    if args.out is None:
        plt.show()
    else:
        print(f'save curve to: {args.out}')
        plt.savefig(args.out)
        plt.cla()


def add_plot_parser(subparsers):
    parser_plt = subparsers.add_parser(
        'plot_curve', help='parser for plotting curves')
    parser_plt.add_argument(
        'json_logs',
        type=str,
        nargs='+',
        help='path of train log in json format')
    parser_plt.add_argument(
        '--keys',
        type=str,
        nargs='+',
        default=['bbox_mAP'],
        help='the metric that you want to plot')
    parser_plt.add_argument('--title', type=str, help='title of figure')
    parser_plt.add_argument(
        '--legend',
        type=str,
        nargs='+',
        default=None,
        help='legend of each plot')
    parser_plt.add_argument(
        '--backend', type=str, default=None, help='backend of plt')
    parser_plt.add_argument(
        '--style', type=str, default='dark', help='style of plt')
    parser_plt.add_argument('--out', type=str, default=None)


def add_time_parser(subparsers):
    parser_time = subparsers.add_parser(
        'cal_train_time',
        help='parser for computing the average time per training iteration')
    parser_time.add_argument(
        'json_logs',
        type=str,
        nargs='+',
        help='path of train log in json format')
    parser_time.add_argument(
        '--include-outliers',
        action='store_true',
        help='include the first value of every epoch when computing '
        'the average time')


def parse_args():
    parser = argparse.ArgumentParser(description='Analyze Json Log')
    # currently only support plot curve and calculate average train time
    subparsers = parser.add_subparsers(dest='task', help='task parser')
    add_plot_parser(subparsers)
    add_time_parser(subparsers)
    args = parser.parse_args()
    return args


def load_json_logs(json_logs):
    # load and convert json_logs to log_dict, key is epoch, value is a sub dict
    # keys of sub dict are different metrics, e.g. memory, bbox_mAP
    # value of sub dict is a list of corresponding values of all iterations
    log_dicts = [dict() for _ in json_logs]
    for json_log, log_dict in zip(json_logs, log_dicts):
        with open(json_log, 'r') as log_file:
            for line in log_file:
                log = json.loads(line.strip())
                # skip lines without `epoch` field
                if 'epoch' not in log:
                    continue
                epoch = log.pop('epoch')
                if epoch not in log_dict:
                    log_dict[epoch] = defaultdict(list)
                for k, v in log.items():
                    log_dict[epoch][k].append(v)
    return log_dicts


def main():
    args = parse_args()
    json_logs = args.json_logs
    for json_log in json_logs:
        assert json_log.endswith('.json')
    log_dicts = load_json_logs(json_logs)
    eval(args.task)(log_dicts, args)


if __name__ == '__main__':
    main()
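The load_json_logs function above expects one JSON object per line in the log file. A small stdlib-only sketch of that format and the parsing logic (the loss/iter values are made up):

```python
import json
from collections import defaultdict

# Lines in the format mmdet appends to <timestamp>.log.json (made-up values)
lines = [
    '{"env_info": "..."}',  # header line: no "epoch" key, so it is skipped
    '{"mode": "train", "epoch": 1, "iter": 50, "loss": 0.9}',
    '{"mode": "train", "epoch": 1, "iter": 100, "loss": 0.7}',
]

log_dict = {}
for line in lines:
    log = json.loads(line)
    if 'epoch' not in log:        # same skip rule as load_json_logs
        continue
    epoch = log.pop('epoch')
    log_dict.setdefault(epoch, defaultdict(list))
    for k, v in log.items():
        log_dict[epoch][k].append(v)

print(log_dict[1]['loss'])  # [0.9, 0.7]
print(log_dict[1]['iter'])  # [50, 100]
```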
Usage:
# view the curves
python analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --legend loss_cls loss_bbox
# save them to a file
python analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf