[ICCV Code Reproduction] Swin Transformer Image Classification Tutorial (Training on Your Own Dataset)

s石有八九

Article link: https://blog.csdn.net/weixin_62371528/article/details/137112837

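This article explains in detail how to do image classification with Swin Transformer. It covers environment setup (PyTorch and mmcv, plus the CUDA and PyTorch versions), the parameter changes in config.py, build.py, and utils.py, and the training and evaluation process. It also covers how to handle common errors such as TypeError: __init__() got an unexpected keyword argument 't_mul'.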

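Swin Transformer Image Classification Tutorial

I. Environment Setup
- 1. Official environment setup
- 2. Dataset structure
II. Modifying the Config and Related Files
III. Training
- 1. Train
- 2. Evaluation
IV. Common Errors
- 1. TypeError: __init__() got an unexpected keyword argument 't_mul'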

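I use the official code. There is also a well-regarded consolidated implementation by another author; choose whichever fits your needs (note that this tutorial does not apply if you go with that one): https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_classification/swin_transformer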

Paper: https://arxiv.org/pdf/2103.14030.pdf
GitHub: https://github.com/microsoft/Swin-Transformer/tree/main

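I. Environment Setup

1. Official environment setup

Install the base dependencies (PyTorch, mmcv, and so on) by following the official guide:
https://github.com/microsoft/Swin-Transformer/blob/main/get_started.md

The official guide recommends the NVIDIA PyTorch Docker image, nvcr>=21.05: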
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
Clone this repo:

git clone https://github.com/microsoft/Swin-Transformer.git
cd Swin-Transformer

Create a conda virtual environment and activate it:

conda create -n swin python=3.7 -y
conda activate swin

Install CUDA>=10.2 with cudnn>=7 following the official installation instructions
Install PyTorch>=1.8.0 and torchvision>=0.9.0 with CUDA>=10.2:

conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=10.2 -c pytorch

Install timm==0.4.12:

pip install timm==0.4.12

Install the other dependencies:

pip install opencv-python==4.4.0.46 termcolor==1.1.0 yacs==0.1.8 pyyaml scipy

Install the fused window process kernel for acceleration; it is activated by passing --fused_window_process to the run script:

cd kernels/window_process
python setup.py install #--user
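
After the installation steps, a quick check that every package can be found can save a failed training launch later. The following is a minimal sketch using only the standard library; the module list is my reading of the install commands above (opencv-python is imported as cv2 and pyyaml as yaml), and the function name find_missing is made up for this example.

```python
import importlib.util
import sys

# Import names of the packages installed in the steps above.
# Note: opencv-python is imported as cv2, and pyyaml as yaml.
REQUIRED_MODULES = ["torch", "torchvision", "timm", "cv2",
                    "termcolor", "yacs", "yaml", "scipy"]


def find_missing(modules):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated swin environment. It only checks that the packages are importable, not that their versions match the ones pinned above.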

2. Dataset structure

$ tree data
imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
└── val
    ├── class1
    │   ├── img4.jpeg
    │   ├── img5.jpeg
    │   └── ...
    ├── class2
    │   ├── img6.jpeg
    │   └── ...
    └── ...
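
If your own data is currently one folder per class with no train/val split, a script along these lines can produce the layout shown above. It is only a sketch under assumptions: the source layout src_dir/<class>/<image>, the 80/20 split, the list of image suffixes, and the name split_dataset are all choices made for this example, not part of the official repository.

```python
import random
import shutil
from pathlib import Path

# File extensions treated as images (an assumption; extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def split_dataset(src_dir, dst_dir, val_ratio=0.2, seed=0):
    """Copy src_dir/<class>/<image> into dst_dir/{train,val}/<class>/<image>.

    Returns a dict mapping class name -> (n_train, n_val).
    """
    src, dst = Path(src_dir), Path(dst_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir()
                        if f.suffix.lower() in IMAGE_SUFFIXES)
        rng.shuffle(images)
        n_val = int(len(images) * val_ratio)
        for i, image in enumerate(images):
            # The first n_val shuffled images go to val, the rest to train.
            split = "val" if i < n_val else "train"
            target = dst / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(image, target / image.name)
        counts[class_dir.name] = (len(images) - n_val, n_val)
    return counts
```

For example, split_dataset("raw_images", "dataset") would write dataset/train and dataset/val, where raw_images is a placeholder for wherever your unsplit data lives.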

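II. Modifying the Config and Related Files

1. Modify config.py

_C.DATA.DATA_PATH = 'dataset'
Root directory of the dataset. I set it to dataset and put the data there.

_C.DATA.DATASET = 'imagenet'
Dataset type. imagenet is the only type available here.

_C.MODEL.NUM_CLASSES: number of classes the model predicts. The default is 1000; change it to the number of classes in your dataset.

_C.SAVE_FREQ = 10: how often to save the model, in epochs (here, every 10 epochs).

_C.TRAIN.EPOCHS = 300
Train for 300 epochs.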
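
To pick the value for _C.MODEL.NUM_CLASSES without counting folders by hand, something like the sketch below can help. It assumes the train/val folder layout from the dataset structure section and that labels are derived from the class folder names, so both splits need the same folders; count_classes is a hypothetical helper name.

```python
from pathlib import Path


def count_classes(data_root):
    """Return the number of class folders under <data_root>/train.

    Raises ValueError if train and val do not contain the same class folders.
    """
    root = Path(data_root)
    train = {p.name for p in (root / "train").iterdir() if p.is_dir()}
    val = {p.name for p in (root / "val").iterdir() if p.is_dir()}
    if train != val:
        # Report the folders that appear in only one of the two splits.
        raise ValueError(f"train/val class mismatch: {sorted(train ^ val)}")
    return len(train)
```

Calling count_classes("dataset") on the data root configured above gives the number to put in _C.MODEL.NUM_CLASSES.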

2. Modify build.py

Find the mixup section and change nb_classes = 1000 to nb_classes = config.MODEL.NUM_CLASSES.
[Screenshot: build.py after the change]

3. Modify utils.py

Find the load_checkpoint function and insert the following right after checkpoint = torch.load(config.MODEL.RESUME, map_location='cpu'):

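    if checkpoint['model']['head.weight'].shape[0] == 1000:
        # ImageNet-1K head found: re-initialize it for the custom class count.
        in_features = checkpoint['model']['head.weight'].shape[1]  # 768 for Swin-T/S
        checkpoint['model']['head.weight'] = torch.nn.Parameter(
            torch.nn.init.xavier_uniform_(torch.empty(config.MODEL.NUM_CLASSES, in_features)))
        checkpoint['model']['head.bias'] = torch.nn.Parameter(torch.randn(config.MODEL.NUM_CLASSES))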

[Screenshot: utils.py after the change]
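
A different way to get the same effect, not the one used in this tutorial, is to drop every checkpoint entry whose shape does not match the model and let the model keep its freshly initialized head. The sketch below illustrates that idea on any mapping of names to objects with a .shape attribute; drop_mismatched is a hypothetical helper, and the assumption is that checkpoint['model'] and model.state_dict() are both such mappings.

```python
def drop_mismatched(checkpoint_state, model_state):
    """Keep only checkpoint entries whose name and shape match the model.

    Both arguments map parameter names to objects with a `.shape` attribute
    (for example torch tensors). Returns (kept_entries, dropped_names).
    """
    kept, dropped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            kept[name] = value
        else:
            # Missing in the model or different shape, e.g. a 1000-class head.
            dropped.append(name)
    return kept, dropped
```

With PyTorch this would be used as kept, dropped = drop_mismatched(checkpoint['model'], model.state_dict()), followed by model.load_state_dict(kept, strict=False). Printing dropped is a cheap way to confirm that only head.weight and head.bias were skipped.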

III. Training

1. Train

python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> --master_port 12345 main.py \
--cfg <config-file> --data-path <imagenet-path> [--batch-size <batch-size-per-gpu> --output <output-directory> --tag <job-tag>]

For example, to train Swin Transformer with 8 GPUs on a single node for 300 epochs, run:

Swin-T:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345  main.py \
--cfg configs/swin/swin_tiny_patch4_window7_224.yaml --data-path <imagenet-path> --batch-size 128

Swin-S:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345  main.py \
--cfg configs/swin/swin_small_patch4_window7_224.yaml --data-path <imagenet-path> --batch-size 128

Swin-B:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 12345  main.py \
--cfg configs/swin/swin_base_patch4_window7_224.yaml --data-path <imagenet-path> --batch-size 64 \
--accumulation-steps 2 [--use-checkpoint]
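
The three commands above use different --batch-size and --accumulation-steps values, but they work out to the same number of images per optimizer step. The arithmetic, as a small sketch (effective_batch_size is a name made up for this example):

```python
def effective_batch_size(num_gpus, batch_size_per_gpu, accumulation_steps=1):
    """Number of images that contribute to one optimizer step."""
    return num_gpus * batch_size_per_gpu * accumulation_steps


print(effective_batch_size(8, 128))    # Swin-T and Swin-S commands: 1024
print(effective_batch_size(8, 64, 2))  # Swin-B command: 1024
```

When training on fewer GPUs, raising --accumulation-steps is one way to stay near the same effective batch size; for instance 1 GPU with --batch-size 64 and --accumulation-steps 16 also gives 1024.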

2. Evaluation

python -m torch.distributed.launch --nproc_per_node 1 --master_port 12345 main.py --eval \
--cfg configs/swin/swin_base_patch4_window7_224.yaml --resume swin_base_patch4_window7_224.pth --data-path <imagenet-path>