PyTorch and CUDA must be installed beforehand.
Clone the mmpretrain source and install openmim:
git clone https://github.com/open-mmlab/mmpretrain
cd mmpretrain
pip install openmim
Install the mmpretrain library; note that this must be run inside the mmpretrain directory:
mim install -e ".[multimodal]"
Then start a Python session:
In[1]: import mmpretrain
In[2]: mmpretrain.__version__
In[3]: from mmpretrain import get_model, list_models, inference_model
In[4]: list_models(task='Image Classification', pattern='resnet18')  # list all image-classification models whose name contains resnet18
In[5]: list_models(task='Image Caption', pattern='blip')  # list all image-captioning models whose name contains blip
In[6]: model = get_model('resnet18_8xb16_cifar10')  # build the ResNet-18 model configured for CIFAR-10
In[7]: type(model)  # the type is ImageClassifier
In[8]: type(model.backbone)  # should be ResNet
In[9]: inference_model(model, 'demo/bird.jpeg', show=True)  # the model is untrained at this point, so the prediction is meaningless
# get_model called without extra arguments returns a model without pretrained weights
In[10]: inference_model('resnet18_8xb16_cifar10', 'demo/bird.jpeg', show=True)  # correct usage: pass the model name
Next, fine-tune ResNet on a dataset.
Prepare the dataset
# starting from the cloned mmpretrain directory
mkdir data
cd data
tar -xf ~/Downloads/cats_dogs_dataset.tar  # dataset downloaded beforehand from https://download.openmmlab.com/mmclassification/dataset/cats_dogs_dataset.tar
cd cats_dogs_dataset
cd ../../
# write the config file
# the configs/resnet directory contains ResNet-18 config files to use as a reference
# use the simplest one, resnet18_8xb32_in1k.py
vi configs/resnet/resnet18_8xb32_in1k.py
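This file combines the settings of four config files:
Model config: some initialization parameters; the backbone, a ResNet used for feature extraction; the neck, a pooling layer that turns the features the backbone extracts from each image into a 1-D vector; and the head, the classification head.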
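What the neck and the head do to the backbone output can be sketched in plain Python. This is a toy illustration with made-up numbers, not mmpretrain code; global_average_pool and linear_head are hypothetical names, and the real ResNet-18 backbone produces 512 channels rather than 2.

```python
def global_average_pool(feature_map):
    """Neck: average each channel's H x W grid down to a single number."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def linear_head(vector, weights, bias):
    """Head: one score per class, dot(weights[k], vector) + bias[k]."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


# Toy backbone output: 2 channels, each a 2x2 grid.
feature_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
vector = global_average_pool(feature_map)  # one value per channel
scores = linear_head(vector, weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
print(vector, scores)
```

The length of the pooled vector equals the number of backbone channels, which is why the head's in_channels has to match the backbone.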
The config can be modified as follows:
from mmengine import Config
cfg = Config.fromfile("./configs/resnet/resnet18_8xb32_in1k.py")
cfg.model
cfg.model.head.num_classes = 2  # modify the config
cfg.model
Dataset config:
# dataset settings
dataset_type = 'ImageNet'
data_preprocessor = dict(
num_classes=1000,
# RGB format normalization parameters
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
# convert image from BGR to RGB
to_rgb=True,
)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', scale=224),
dict(type='RandomFlip', prob=0.5, direction='horizontal'),
dict(type='PackInputs'),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='ResizeEdge', scale=256, edge='short'),
dict(type='CenterCrop', crop_size=224),
dict(type='PackInputs'),
]
train_dataloader = dict(
batch_size=32,
num_workers=5,
dataset=dict(
type=dataset_type,
data_root='data/imagenet',
ann_file='meta/train.txt',
data_prefix='train',
pipeline=train_pipeline),
sampler=dict(type='DefaultSampler', shuffle=True),
)
val_dataloader = dict(
batch_size=32,
num_workers=5,
dataset=dict(
type=dataset_type,
data_root='data/imagenet',
ann_file='meta/val.txt',
data_prefix='val',
pipeline=test_pipeline),
sampler=dict(type='DefaultSampler', shuffle=False),
)
val_evaluator = dict(type='Accuracy', topk=(1, 5))
# If you want standard test, please manually configure the test dataset
test_dataloader = val_dataloader
test_evaluator = val_evaluator
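Note: once the config file has been loaded, the links between intermediate variables are gone; for example, the type in train_dataloader is no longer tied to the variable dataset_type.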
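The same effect shows up in plain Python, which makes it easier to see why. The snippet below is an analogy using ordinary dicts, not the mmengine Config class: the dict stores the value the variable had when the dict was built, so rebinding the variable afterwards changes nothing.

```python
dataset_type = 'ImageNet'
train_dataloader = dict(dataset=dict(type=dataset_type))

# Rebinding the variable later does not reach into the dict already built.
dataset_type = 'CustomDataset'
print(train_dataloader['dataset']['type'])  # ImageNet

# The nested field itself has to be changed.
train_dataloader['dataset']['type'] = 'CustomDataset'
print(train_dataloader['dataset']['type'])  # CustomDataset
```

On a loaded config the equivalent step is to assign to the nested field directly, for example cfg.train_dataloader.dataset.type, rather than to cfg.dataset_type.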
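Schedule config:
Describes the training and testing procedure. optim_wrapper sets the optimizer parameters; param_scheduler is the parameter scheduler, which adjusts the learning rate and similar parameters as training progresses.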
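As a reference point, the schedule section of the ImageNet ResNet-18 config looks roughly like the fragment below. The values are recalled from configs/_base_/schedules/imagenet_bs256.py and should be checked locally. The function step_lr is not part of mmpretrain; it is a hypothetical helper that only mimics what a MultiStepLR schedule does to the learning rate.

```python
optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001))
param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[30, 60, 90], gamma=0.1)
train_cfg = dict(by_epoch=True, max_epochs=100, val_interval=1)
val_cfg = dict()
test_cfg = dict()


def step_lr(epoch, base_lr, milestones, gamma):
    """Hypothetical helper: multiply base_lr by gamma once per milestone reached."""
    reached = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** reached


# Print the learning rate in effect at a few epochs.
for epoch in (0, 30, 60, 90):
    lr = step_lr(epoch, optim_wrapper['optimizer']['lr'],
                 param_scheduler['milestones'], param_scheduler['gamma'])
    print(epoch, lr)
```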
Runtime config:
# defaults to use registries in mmpretrain
default_scope = 'mmpretrain'
# configure default hooks
default_hooks = dict(
# record the time of every iteration.
timer=dict(type='IterTimerHook'),
# print log every 100 iterations.
logger=dict(type='LoggerHook', interval=100),
# enable the parameter scheduler.
param_scheduler=dict(type='ParamSchedulerHook'),
# save checkpoint per epoch.
checkpoint=dict(type='CheckpointHook', interval=1),
# set sampler seed in distributed environment.
sampler_seed=dict(type='DistSamplerSeedHook'),
# validation results visualization, set True to enable it.
visualization=dict(type='VisualizationHook', enable=False),
)
# configure environment
env_cfg = dict(
# whether to enable cudnn benchmark
cudnn_benchmark=False,
# set multi process parameters
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
# set distributed parameters
dist_cfg=dict(backend='nccl'),
)
# set visualizer
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(type='UniversalVisualizer', vis_backends=vis_backends)
# set log level
log_level = 'INFO'
# load from which checkpoint
load_from = None
# whether to resume training from the loaded checkpoint
resume = False
# Defaults to use random seed and disable `deterministic`
randomness = dict(seed=None, deterministic=False)
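These are auxiliary, experiment-related features and environment settings; in most cases they need no changes. A few commonly used ones:
logger: logging config; interval=100 prints a log entry every 100 iterations.
checkpoint: checkpoint-saving config; max_keep_ckpts=5 keeps only the most recent five checkpoints, and save_best='auto' also saves the checkpoint with the best accuracy so far.
randomness: sets the random seed.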
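Put together, those three settings could look like the fragment below. Only the entries discussed here are shown; the remaining hooks in default_hooks stay as they are. The seed value 0 is an arbitrary choice for illustration.

```python
default_hooks = dict(
    # print a log entry every 100 iterations
    logger=dict(type='LoggerHook', interval=100),
    # save every epoch, keep the 5 most recent checkpoints,
    # and additionally keep the best-scoring one
    checkpoint=dict(
        type='CheckpointHook', interval=1, max_keep_ckpts=5, save_best='auto'),
)
# fix the seed so that runs are repeatable (0 is an arbitrary example value)
randomness = dict(seed=0, deterministic=False)
```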
Create a new directory under projects to hold the custom config file:
mkdir projects/cat_dog
cd projects/cat_dog
vi resnet18_finetune.py
Copy the contents of all four config files above into this new file.
Then load the pretrained weights
by adding init_cfg to the model config:
# model settings
model = dict(
type='ImageClassifier',
backbone=dict(
type='ResNet',
depth=18,
num_stages=4,
out_indices=(3, ),
style='pytorch'),
neck=dict(type='GlobalAveragePooling'),
head=dict(
type='LinearClsHead',
num_classes=2,  # two classes: cat and dog
in_channels=512,
loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
topk=(1, ),
),
init_cfg=dict(type='Pretrained', checkpoint='xxx'),  # xxx is the link to the pretrained checkpoint
)
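In the dataset config, set dataset_type = 'CustomDataset'; change data_root in train_dataloader and the other dataloaders to the new data paths, then delete the entries that no longer apply (ann_file and data_prefix). In the evaluation settings, change topk in val_evaluator to 1.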
# dataset settings
dataset_type = 'CustomDataset'
data_preprocessor = dict(
num_classes=1000,
# RGB format normalization parameters
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
# convert image from BGR to RGB
to_rgb=True,
)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='RandomResizedCrop', scale=224),
dict(type='RandomFlip', prob=0.5, direction='horizontal'),
dict(type='PackInputs'),
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='ResizeEdge', scale=256, edge='short'),
dict(type='CenterCrop', crop_size=224),
dict(type='PackInputs'),
]
train_dataloader = dict(
batch_size=32,
num_workers=5,
dataset=dict(
type=dataset_type,
data_root='../../data/cats_dogs_dataset/training_set',
pipeline=train_pipeline),
sampler=dict(type='DefaultSampler', shuffle=True),
)
val_dataloader = dict(
batch_size=32,
num_workers=5,
dataset=dict(
type=dataset_type,
data_root='../../data/cats_dogs_dataset/val_set/',
pipeline=test_pipeline),
sampler=dict(type='DefaultSampler', shuffle=False),
)
val_evaluator = dict(type='Accuracy', topk=1)
# If you want standard test, please manually configure the test dataset
test_dataloader = dict(
batch_size=32,
num_workers=5,
dataset=dict(
type=dataset_type,
data_root='../../data/cats_dogs_dataset/test_set/',
pipeline=test_pipeline),
sampler=dict(type='DefaultSampler', shuffle=False),
)
test_evaluator = val_evaluator
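When no ann_file is given, CustomDataset takes the classes from the subfolders of data_root, one folder per class, so it can help to check the folder layout before training. The function count_images_per_class below is a hypothetical helper, not part of mmpretrain, and the list of image extensions is an assumption.

```python
from pathlib import Path

IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp'}  # assumed extensions


def count_images_per_class(data_root):
    """Return {subfolder name: number of image files found under it}."""
    counts = {}
    for class_dir in sorted(Path(data_root).iterdir()):
        if not class_dir.is_dir():
            continue  # loose files directly under data_root are not classes
        counts[class_dir.name] = sum(
            1 for p in class_dir.rglob('*')
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES)
    return counts


if __name__ == '__main__':
    # Same relative path as data_root in train_dataloader above.
    root = Path('../../data/cats_dogs_dataset/training_set')
    if root.is_dir():
        print(count_images_per_class(root))
```

Run from projects/cat_dog, this should print one entry per class folder; an empty dict or an unexpected folder name usually means data_root points one level too high or too low.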
Finally, in the training config, change the number of epochs to 5; because pretrained weights are loaded, the model converges quickly.
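Assuming the schedule section was copied unchanged from the ImageNet base schedule, the epoch count lives in train_cfg, so the change amounts to one line. The by_epoch and val_interval values are kept as recalled from that base file.

```python
# schedule section of resnet18_finetune.py: train for 5 epochs,
# validating after every epoch
train_cfg = dict(by_epoch=True, max_epochs=5, val_interval=1)
```

If the copied param_scheduler still uses milestones far beyond epoch 5, none of them is reached, so the learning rate stays at its initial value for the whole run.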
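With the config file ready, training can begin.
The mim command makes it easy to start training from any location; pass it the config file and the output directory: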
mim train mmpretrain resnet18_finetune.py --work-dir=./exp
Likewise, pass the config file and the trained checkpoint to mim test mmpretrain to run testing:
mim test mmpretrain resnet18_finetune.py --checkpoint exp/epoch_5.pth
mim test mmpretrain resnet18_finetune.py --checkpoint exp/epoch_5.pth --out result.pkl
Adding --out result.pkl saves the test result of every sample to result.pkl.
The analysis tools in mmpretrain can then be applied to it, for example:
mim run mmpretrain analyze_results resnet18_finetune.py result.pkl --out-dir analyze
cd analyze
ls
tree ./
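This shows the correctly and incorrectly classified samples, along with the score for each image.
With the ImageClassificationInferencer API, prediction on a single image works like this: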
from mmpretrain import ImageClassificationInferencer
inferencer = ImageClassificationInferencer('./resnet18_finetune.py', pretrained='exp/epoch_5.pth')
inferencer("xxx.jpg", show=True)