MMSegmentation Series Part 3: Basic Network Architectures and Pretrained Models

1. Common Settings

• We use distributed training with 4 GPUs by default.
• All PyTorch-style pretrained backbones on ImageNet are trained by ourselves, following the same procedure as in the paper. Our ResNet-style backbones are based on the ResNetV1c variant, where the 7x7 convolution in the input stem is replaced with three 3x3 convolutions.
• For consistency across different hardware, we report GPU memory as the maximum value of torch.cuda.max_memory_allocated() over all 4 GPUs, with torch.backends.cudnn.benchmark=False. Note that this value is usually smaller than the one shown by nvidia-smi.
• We report inference time as the total time of network forwarding and post-processing, excluding data loading time. Results are obtained with the script tools/benchmark.py, which computes the average time over 200 images with torch.backends.cudnn.benchmark=False.
• There are two inference modes in this framework:
  • slide mode: test_cfg will be like dict(mode='slide', crop_size=(769, 769), stride=(513, 513)). In this mode, multiple patches are cropped from the input image and passed into the network individually. The patch size and step are specified by crop_size and stride, and overlapping regions are merged by averaging.
  • whole mode: test_cfg will be like dict(mode='whole'). In this mode, the whole image is passed directly into the network. By default, we use slide inference for models trained on 769x769 inputs and whole inference for the rest.
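The averaging merge in slide mode can be sketched in a few lines. This is a simplified illustration, not MMSegmentation's actual implementation: `predict` is a hypothetical stand-in for the network forward pass, and the window-grid arithmetic is an assumption about how the patches are laid out.

```python
import math

import numpy as np


def slide_inference(image, predict, crop_size=(769, 769), stride=(513, 513),
                    num_classes=19):
    """Sketch of 'slide' test mode: crop overlapping patches, run the
    model on each, and average the logits where patches overlap.
    Assumes the image is at least crop_size in both dimensions."""
    h, w = image.shape[-2:]
    ch, cw = crop_size
    sh, sw = stride
    # accumulate logits and a per-pixel visit count, then average
    logits = np.zeros((num_classes, h, w))
    count = np.zeros((1, h, w))
    h_grids = max(math.ceil((h - ch) / sh), 0) + 1
    w_grids = max(math.ceil((w - cw) / sw), 0) + 1
    for i in range(h_grids):
        for j in range(w_grids):
            # clamp the last window so it ends exactly at the image border
            y1 = min(i * sh, h - ch)
            x1 = min(j * sw, w - cw)
            patch = image[..., y1:y1 + ch, x1:x1 + cw]
            logits[:, y1:y1 + ch, x1:x1 + cw] += predict(patch)
            count[:, y1:y1 + ch, x1:x1 + cw] += 1
    # pixels visited more than once are merged by averaging
    return logits / count
```

For a 1025x1025 image with the default crop and stride, this yields a 2x2 grid of windows whose overlaps are averaged away.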
• For input sizes of 8x+1 (e.g. 769), align_corners=True is adopted, following the traditional practice. Otherwise, for input sizes of 8x (e.g. 512, 1024), align_corners=False is adopted.
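The effect of align_corners comes from the source-coordinate mapping used by bilinear interpolation. The function below is a sketch of the two conventions (the same ones PyTorch's F.interpolate documents), written out for illustration rather than taken from the library:

```python
def src_coord(i, in_size, out_size, align_corners):
    """Source coordinate sampled for output index i when resizing
    a 1-D axis from in_size to out_size with bilinear interpolation."""
    if align_corners:
        # corner pixels align exactly: 0 -> 0, out_size-1 -> in_size-1
        return i * (in_size - 1) / (out_size - 1)
    # half-pixel-centers convention
    scale = in_size / out_size
    return (i + 0.5) * scale - 0.5
```

With an 8x+1 size such as 769, a stride-8 feature map has 97 pixels, and align_corners=True maps feature pixels onto exact output coordinates (output index 8 samples feature coordinate exactly 1.0), which is why 8x+1 inputs pair with align_corners=True.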

2. Baselines (the other models evolve from these baseline models)

Currently we support the following EncoderDecoder-type methods:

1. FCN

Please refer to FCN for details.

2. PSPNet

Please refer to PSPNet for details.

3. DeepLabV3

Please refer to DeepLabV3 for details.

4. PSANet

Please refer to PSANet for details.

5. DeepLabV3+

Please refer to DeepLabV3+ for details.

6. UPerNet

Please refer to UPerNet for details.

7. NonLocal Net

Please refer to NonLocal Net for details.

8. EncNet

Please refer to EncNet for details.

9. CCNet

Please refer to CCNet for details.

10. DANet

Please refer to DANet for details.

11. APCNet

Please refer to APCNet for details.

12. HRNet

Please refer to HRNet for details.

13. GCNet

Please refer to GCNet for details.

14. DMNet

Please refer to DMNet for details.

15. ANN

Please refer to ANN for details.

16. OCRNet

Please refer to OCRNet for details.

17. Fast-SCNN

Please refer to Fast-SCNN for details.

18. ResNeSt

Please refer to ResNeSt for details.

19. Semantic FPN

Please refer to Semantic FPN for details.

20. PointRend

Please refer to PointRend for details.

21. MobileNetV2

Please refer to MobileNetV2 for details.

22. MobileNetV3

Please refer to MobileNetV3 for details.

23. EMANet

Please refer to EMANet for details.

24. DNLNet

Please refer to DNLNet for details.

25. CGNet

Please refer to CGNet for details.

Mixed Precision (FP16) Training

Please refer to Mixed Precision (FP16) Training on BiSeNetV2 for details.

26. U-Net

Please refer to U-Net for details.

27. ViT

Please refer to ViT for details.

28. Swin

Please refer to Swin for details.

29. SETR

Please refer to SETR for details.

Speed benchmark

3. Model Statistics

[ALGORITHM] ANN (16 ckpts)

[ALGORITHM] APCNet (12 ckpts)

[BACKBONE] BEiT (2 ckpts)

[ALGORITHM] BiSeNetV1 (11 ckpts)

[ALGORITHM] BiSeNetV2 (4 ckpts)

[ALGORITHM] CCNet (16 ckpts)

[ALGORITHM] CGNet (2 ckpts)

[BACKBONE] ConvNeXt (6 ckpts)

[ALGORITHM] DANet (16 ckpts)

[ALGORITHM] DeepLabV3 (41 ckpts)

[ALGORITHM] DeepLabV3+ (42 ckpts)

[ALGORITHM] DMNet (12 ckpts)

[ALGORITHM] DNLNet (12 ckpts)

[ALGORITHM] DPT (1 ckpts)

[ALGORITHM] EMANet (4 ckpts)

[ALGORITHM] EncNet (12 ckpts)

[ALGORITHM] ERFNet (1 ckpts)

[ALGORITHM] FastFCN (12 ckpts)

[ALGORITHM] Fast-SCNN (1 ckpts)

[ALGORITHM] FCN (41 ckpts)

[ALGORITHM] GCNet (16 ckpts)

[BACKBONE] HRNet (37 ckpts)

[ALGORITHM] ICNet (12 ckpts)

[ALGORITHM] ISANet (16 ckpts)

[ALGORITHM] K-Net (7 ckpts)

[BACKBONE] MAE (1 ckpts)

[BACKBONE] MobileNetV2 (8 ckpts)

[BACKBONE] MobileNetV3 (4 ckpts)

[ALGORITHM] NonLocal Net (16 ckpts)

[ALGORITHM] OCRNet (24 ckpts)

[ALGORITHM] PointRend (4 ckpts)

[ALGORITHM] PSANet (16 ckpts)

[ALGORITHM] PSPNet (54 ckpts)

[BACKBONE] ResNeSt (8 ckpts)

[ALGORITHM] SegFormer (13 ckpts)

[ALGORITHM] Segmenter (5 ckpts)

[ALGORITHM] Semantic FPN (4 ckpts)

[ALGORITHM] SETR (7 ckpts)

[ALGORITHM] STDC (4 ckpts)

[BACKBONE] Swin Transformer (6 ckpts)

[BACKBONE] Twins (12 ckpts)

[ALGORITHM] UNet (25 ckpts)

[ALGORITHM] UPerNet (16 ckpts)

[BACKBONE] Vision Transformer (11 ckpts)

