Official documentation
getting_started.md:
How to test a model:
How to train a model: python tools/train.py ${CONFIG_FILE} [optional arguments]
Training (tools/train.py)
1. Model construction
Configuration parameters
model = dict(
    type='ChangeExtraSegmentor',
    backbone=dict(
        type='ChangeExtraResNet',
        depth=50,
        use_IN1=True,
        pretrained='/mnt/lustre/wujiang/.cache/torch/checkpoints/' \
        + 'resnet50_v1c-2cccc1ad.pth',
        base_channels=64,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        dilations=(1, 1, 2, 4),
        strides=(1, 2, 1, 1),
        deep_stem=True,
        norm_cfg=norm_cfg,
        norm_eval=False,
        contract_dilation=True,
        style='pytorch',
model = build_segmentor(cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
When cfg is a list, every element is built in turn: [build_from_cfg(cfg_, registry, default_args) for cfg_ in cfg]
See mmcv.utils.registry
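def build_from_cfg(cfg, registry, default_args=None):
    """Build a module from config dict.

    Args:
        cfg (dict): Config dict. It should at least contain the key "type".
        registry (:obj:`Registry`): The registry to search the type from.
        default_args (dict, optional): Default initialization arguments.

    Returns:
        object: The constructed object.
    """
    args = cfg.copy()  # work on a copy so the caller's cfg is not modified
    if default_args is not None:
        for name, value in default_args.items():
            args.setdefault(name, value)
    obj_type = args.pop('type')
    obj_cls = registry.get(obj_type)  # look up the actual network class by its name
    return obj_cls(**args)  # instantiate the network class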
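The registry lookup can be tried without installing mmcv. The following is a minimal sketch, not mmcv's actual code: `Registry` here is a toy stand-in, `ToyResNet` is a hypothetical class, and the decorator is used without parentheses for brevity. It only illustrates how the "type" string selects a class and how the remaining keys become constructor arguments.

```python
class Registry:
    """Toy stand-in for a registry: maps a string name to a class."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, cls):
        # Used as a decorator; stores the class under its own name.
        self._module_dict[cls.__name__] = cls
        return cls

    def get(self, key):
        return self._module_dict.get(key)


def build_from_cfg(cfg, registry, default_args=None):
    args = dict(cfg)  # copy so the caller's config is not mutated
    if default_args is not None:
        for name, value in default_args.items():
            args.setdefault(name, value)  # explicit cfg values win
    obj_type = args.pop('type')
    obj_cls = registry.get(obj_type)
    if obj_cls is None:
        raise KeyError(f'{obj_type} is not in the {registry.name} registry')
    return obj_cls(**args)


BACKBONES = Registry('backbone')


@BACKBONES.register_module
class ToyResNet:  # hypothetical class, for illustration only
    def __init__(self, depth, norm_eval=False):
        self.depth = depth
        self.norm_eval = norm_eval


cfg = dict(type='ToyResNet', depth=50)
backbone = build_from_cfg(cfg, BACKBONES, default_args=dict(norm_eval=True))
print(type(backbone).__name__, backbone.depth, backbone.norm_eval)
# prints: ToyResNet 50 True
```

Because the builder works on a copy, the same cfg dict can be reused to build several instances.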
2. Training
train_segmentor(model, datasets, cfg)
runner = IterBasedRunner(model=model, batch_processor=None, optimizer=optimizer, work_dir=cfg.work_dir, logger=logger, meta=meta)
runner.run(data_loaders, cfg.workflow, cfg.total_iters)
IterBasedRunner is defined in mmcv/runner/iter_based_runner.py:
while self.iter < self._max_iters:
    for i, flow in enumerate(workflow):
        self._inner_iter = 0
        mode, iters = flow
        iter_runner = getattr(self, mode)
        for _ in range(iters):
            iter_runner(iter_loaders[i], **kwargs)
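To see how the workflow list drives this loop, here is a small self-contained sketch. `ToyIterRunner` is a hypothetical miniature, not the mmcv class; it assumes that only training steps advance the global iteration counter and that loaders are cycled endlessly. It records which mode ran at which iteration.

```python
import itertools


class ToyIterRunner:
    """Hypothetical miniature of an iteration-based runner."""

    def __init__(self, max_iters):
        self.iter = 0
        self._max_iters = max_iters
        self.log = []

    def train(self, data_loader):
        batch = next(data_loader)
        self.log.append(('train', self.iter, batch))
        self.iter += 1  # assumption: only training advances the counter

    def val(self, data_loader):
        batch = next(data_loader)
        self.log.append(('val', self.iter, batch))

    def run(self, data_loaders, workflow):
        # One endlessly repeating iterator per entry in the workflow.
        iter_loaders = [itertools.cycle(loader) for loader in data_loaders]
        while self.iter < self._max_iters:
            for i, (mode, iters) in enumerate(workflow):
                iter_runner = getattr(self, mode)  # 'train' -> self.train
                for _ in range(iters):
                    if mode == 'train' and self.iter >= self._max_iters:
                        break
                    iter_runner(iter_loaders[i])


runner = ToyIterRunner(max_iters=4)
# Two training iterations, then one validation iteration, repeated.
runner.run([['a', 'b'], ['x']], [('train', 2), ('val', 1)])
modes = [mode for mode, _, _ in runner.log]
print(modes)
```

With a workflow of `[('train', 2), ('val', 1)]` the modes alternate in blocks until the training counter reaches `max_iters`.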
Walkthrough of a concrete example
args: config='configs/_rs_waterRGB/fcn_hr48_896x896_4k__ShCHY_dgl__oeg_edr2.py'
train.py
Model initialization
cfg = Config.fromfile(args.config)
build_segmentor(cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)
Here cfg defines all the parameters needed to build the model.
builder.py: return build_from_cfg(cfg, registry, default_args)
mmcv/utils/registry.py: build_from_cfg(cfg, registry, default_args=None)
obj_cls = registry.get(obj_type), where obj_type is EncoderDecoderMultiHead and obj_cls is mmseg.models.segmentors.encoder_decoder_multihead.EncoderDecoderMultiHead
When the model is constructed, the backbone, neck, head, auxiliary_head, and so on are built in order:
self.backbone = builder.build_backbone(backbone)  # calls build_backbone in builder.py, which calls build in builder.py and then build_from_cfg in mmcv/utils/registry.py
self.neck = builder.build_neck(neck)
self._init_decode_head(decode_head)
self._init_auxiliary_head(auxiliary_head)
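The nested construction above can be sketched in a few lines. Everything here is hypothetical (`ToySegmentor`, `ToyBackbone`, `ToyHead`, and a plain dict in place of the registries); the point is only that each sub-config is itself a dict with a "type" key, so the segmentor's constructor calls the same builder again. For a list of head configs this sketch returns a plain Python list, which may differ from how the library wraps lists.

```python
REGISTRY = {}  # name -> class; a plain dict standing in for the registries


def register(cls):
    REGISTRY[cls.__name__] = cls
    return cls


def build(cfg):
    """Build one module, or a list of modules when cfg is a list of dicts."""
    if isinstance(cfg, list):
        return [build(c) for c in cfg]
    args = dict(cfg)
    return REGISTRY[args.pop('type')](**args)


@register
class ToyBackbone:
    def __init__(self, widths):
        self.widths = widths


@register
class ToyHead:
    def __init__(self, in_channels, num_classes):
        self.in_channels = in_channels
        self.num_classes = num_classes


@register
class ToySegmentor:
    def __init__(self, backbone, decode_head, neck=None):
        # Sub-configs are built in order: backbone, neck, then head(s).
        self.backbone = build(backbone)
        self.neck = build(neck) if neck is not None else None
        self.decode_head = build(decode_head)


model_cfg = dict(
    type='ToySegmentor',
    backbone=dict(type='ToyBackbone', widths=(18, 36)),
    decode_head=[
        dict(type='ToyHead', in_channels=54, num_classes=2),
        dict(type='ToyHead', in_channels=54, num_classes=5),
    ],
)
model = build(model_cfg)
print(type(model.backbone).__name__, len(model.decode_head))
# prints: ToyBackbone 2
```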
The cfg passed in when constructing the model
**When the model is constructed, the backbone, neck, head, auxiliary_head, and so on are built in order**
class EncoderDecoderMultiHead(EncoderDecoder)
The backbone cfg is:
self.backbone = builder.build_backbone(backbone)
build(cfg, BACKBONES)
build_from_cfg(cfg, registry, default_args)
This calls HRNet's initializer. The HRNet class is defined in mmseg/models/backbones/hrnet.py; it builds HRNet's four stages and outputs a list of feature maps of different sizes.
self._init_decode_head(decode_head)
self.decode_head = builder.build_head(decode_head)
build(cfg, HEADS)
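Here cfg is a list of FCNHead configs; FCNHead is defined in mmseg/models/decode_heads/fcn_head.py
FCNHead first upsamples the feature list to a common resolution and concatenates it (x = self._transform_inputs(inputs)), then convolves it to the specified number of channels (output = self.convs(x)), and then two convolutions produce the output (output = self.cls_seg(output))
The head's forward function (forward_train) first computes the prediction and then returns the loss according to the configured loss
How is the head built, and how does it correspond to the cfg?
Heads differ greatly from task to task: their convolution layers may need to run in parallel, in series, be concatenated, and so on. The head's construction and forward function therefore have to be written and adapted by hand to match the config file.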
for i in range(self.num_heads):
    self.decode_head.append(builder.build_head(decode_head[i]))
build_from_cfg(cfg, HEADS, default_args)
# calls the FCNHead constructor
Build self.convs: self.convs = nn.Sequential(*convs), a stack of num_convs convolutions that takes the input channels to the output channels
Build self.conv_seg: self.conv_seg = nn.Conv2d(channels, num_classes, kernel_size=1), the convolution from the output channels to the prediction
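The shape bookkeeping of such a head can be traced without a deep learning framework. The sketch below is only an illustration: `fcn_head_shapes` is a hypothetical helper, and the channel counts (48, 96, 192, 384), the strides 4/8/16/32 for an 896x896 input, `channels=720`, and `num_classes=2` are assumed values, not read from the config file in these notes.

```python
def fcn_head_shapes(feature_shapes, channels, num_classes):
    """Trace (C, H, W) shapes through a resize-then-concat FCN-style head."""
    # 1. Resize every feature map to the spatial size of the first one.
    _, height, width = feature_shapes[0]
    resized = [(c, height, width) for c, _, _ in feature_shapes]
    # 2. Concatenate along the channel axis: channel counts add up.
    concat = (sum(c for c, _, _ in resized), height, width)
    # 3. The conv stack maps the concatenated channels to `channels`
    #    (assuming padding that preserves the spatial size).
    after_convs = (channels, height, width)
    # 4. The 1x1 conv_seg maps `channels` to one score map per class.
    logits = (num_classes, height, width)
    return concat, after_convs, logits


# Assumed multi-scale features for an 896x896 input.
features = [(48, 224, 224), (96, 112, 112), (192, 56, 56), (384, 28, 28)]
concat, after_convs, logits = fcn_head_shapes(features, channels=720, num_classes=2)
print(concat, after_convs, logits)
# prints: (720, 224, 224) (720, 224, 224) (2, 224, 224)
```

The logits come out at the resolution of the largest feature map, so they still need to be resized to the label size before the loss is computed.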
Forward pass
The backbone extracts features.
The head then calls self.decode_head[i].forward_train(x, img_metas, gt_seg, self.train_cfg) to extract further features and compute the loss
seg_logits = self.forward(inputs)
x = self._transform_inputs(inputs)  # resize the feature list, then concatenate
inputs = torch.cat(upsampled_inputs, dim=1)
output = self.convs(x)  # conv layers built from in_channels and channels; they map the in_channels-channel input to channels channels
output = self.cls_seg(output)  # map to the specified number of output channels
losses = self.losses(seg_logits, gt_semantic_seg, seg_weight_map=kargs['seg_weight_map'])
When computing the loss, self.loss_decode is initialized first: self.loss_decode = build_loss(loss_decode)  # builds the loss from the class in mmseg/models/losses/cross_entropy_loss.py. Then the losses() function in mmseg/models/decode_heads/decode_head.py is called, which uses loss_decode to compute the loss.
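As a rough illustration of what a pixel-wise cross-entropy loss computes, here is a pure-Python sketch. `pixel_cross_entropy` and `losses` are hypothetical helpers, not the library's functions; the ignore label 255 and the key name 'loss_seg' are assumptions, and this sketch averages over non-ignored pixels only, which may differ from the library's reduction.

```python
import math


def pixel_cross_entropy(logits, labels, ignore_index=255):
    """Mean softmax cross-entropy over pixels, skipping ignored labels.

    logits: list of per-pixel score lists, one score per class.
    labels: list of integer class ids, one per pixel.
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == ignore_index:
            continue
        # log-sum-exp with max subtraction for numerical stability
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[label]
        count += 1
    return total / count if count else 0.0


def losses(seg_logits, seg_label):
    # Returns a dict of named loss terms.
    return {'loss_seg': pixel_cross_entropy(seg_logits, seg_label)}


# Three pixels, two classes; the last pixel is ignored.
seg_logits = [[0.0, 0.0], [2.0, 0.0], [5.0, 1.0]]
seg_label = [0, 1, 255]
loss_dict = losses(seg_logits, seg_label)
print(loss_dict)
```

The first pixel contributes log(2) because both classes score equally; the second contributes log(1 + e^2) because the wrong class scores higher.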
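The main question is how the head is built and how it runs its forward pass.
The head turns the features extracted by the backbone into predictions and then computes the loss.
The model after initialization:
Building the dataset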
datasets = [build_dataset(cfg.data.train)] #<mmseg.datasets.dataset_wrappers.ConcatDataset
runner = IterBasedRunner(model=model, batch_processor=None, optimizer=optimizer, work_dir=cfg.work_dir, logger=logger, meta=meta)
Data
The dataloader returns a dict containing img, img_metas, gt_semantic_seg, and other fields. img and gt_semantic_seg are tensors, while img_metas is a dict holding the metadata recorded while the image was processed.
base.py train_step
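How a batch dict is consumed by a training step can be sketched as follows. `ToyModel` is hypothetical, nested lists stand in for tensors, and the loss is a made-up error rate; the sketch assumes that the batch keys are passed as keyword arguments and that only entries whose name contains "loss" are summed into the total.

```python
class ToyModel:
    """Hypothetical model showing how a batch dict is consumed."""

    def forward_train(self, img, img_metas, gt_semantic_seg, **kwargs):
        # Stand-in loss: fraction of pixels where the "prediction" (the
        # image value itself) differs from the label.
        wrong = total = 0
        for sample, label in zip(img, gt_semantic_seg):
            wrong += sum(p != g for p, g in zip(sample, label))
            total += len(sample)
        error = wrong / total
        return {'decode.loss_seg': error, 'decode.acc_seg': 1.0 - error}

    def train_step(self, data_batch):
        losses = self.forward_train(**data_batch)  # keys become keyword args
        # Only entries whose name contains "loss" contribute to the total.
        total = sum(v for k, v in losses.items() if 'loss' in k)
        log_vars = dict(losses, loss=total)
        return dict(loss=total, log_vars=log_vars,
                    num_samples=len(data_batch['img']))


batch = dict(
    img=[[0, 1, 1, 0], [1, 1, 0, 0]],
    img_metas=[dict(filename='a.png', ori_shape=(2, 2)),
               dict(filename='b.png', ori_shape=(2, 2))],
    gt_semantic_seg=[[0, 1, 0, 0], [1, 1, 0, 1]],
)
outputs = ToyModel().train_step(batch)
print(outputs['loss'], outputs['num_samples'])
# prints: 0.25 2
```

Extra fields in the batch are absorbed by `**kwargs` here, so adding a field to the dataloader output does not break the call.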