UNet Network Architecture and Code Implementation

三七2024

Last modified 2024-05-17 17:02:23

Column: 《图像分割UNet硬核讲解》--深度学习麋了鹿. Tags: deep learning, machine learning, neural networks

First published 2024-05-17 17:01:30

Article link: https://blog.csdn.net/m0_72800308/article/details/138966867

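This column collects part of my notes from studying 深度学习麋了鹿's series 《图像分割UNet硬核讲解》 (a hard-core walkthrough of UNet image segmentation in which the UNet code is written by hand).

The series goes from dataset → network architecture → training → testing (code included).

This section contains my notes on the UNet network architecture and its code implementation.

In the previous section, we built the UNet dataset and implemented the code for it.

Let's begin this section.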

1. UNet network architecture

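① conv 3x3, ReLU: a convolution layer with a 3x3 kernel, followed by ReLU activation.
② copy and crop: copy the encoder feature map and crop it to the size of the decoder feature map.
③ max pool 2x2: a max pooling layer with a 2x2 window.
④ up-conv 2x2: a transposed convolution (deconvolution) with a 2x2 kernel.
⑤ conv 1x1: a convolution layer with a 1x1 kernel.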
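As a rough illustration, the five operations in the legend could map to PyTorch layers as shown here. This is a minimal sketch that assumes unpadded 3x3 convolutions, as in the original U-Net diagram, which is why the copied feature map has to be cropped. The channel counts, the input size, the two output classes, and the helper `center_crop` are all illustrative assumptions and are not part of the code in section 2 (that code uses padding=1, so no cropping is needed there).

```python
import torch
from torch import nn

# ① conv 3x3 + ReLU (no padding, so each conv trims 2 pixels from H and W)
conv_a = nn.Sequential(nn.Conv2d(1, 64, kernel_size=3), nn.ReLU())
conv_b = nn.Sequential(nn.Conv2d(64, 128, kernel_size=3), nn.ReLU())
# ③ max pool 2x2: halves H and W
pool = nn.MaxPool2d(kernel_size=2)
# ④ up-conv 2x2: transposed convolution, doubles H and W, halves channels here
up_conv = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
# ⑤ conv 1x1: maps channels to the number of classes (2 is an assumption)
conv1x1 = nn.Conv2d(128, 2, kernel_size=1)


def center_crop(feature_map, target_h, target_w):
    """② copy and crop: hypothetical helper that keeps the centre of a feature map."""
    _, _, h, w = feature_map.shape
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return feature_map[:, :, top:top + target_h, left:left + target_w]


x = torch.randn(1, 1, 68, 68)
a = conv_a(x)                        # 68 -> 66
p = pool(a)                          # 66 -> 33
b = conv_b(p)                        # 33 -> 31
u = up_conv(b)                       # 31 -> 62
skip = center_crop(a, u.shape[2], u.shape[3])  # 66 -> 62
merged = torch.cat((u, skip), dim=1)           # 64 + 64 = 128 channels
out = conv1x1(merged)
print(a.shape, p.shape, b.shape, u.shape, merged.shape, out.shape)
```

With padded convolutions the encoder and decoder maps already have matching sizes, so the crop step reduces to a plain copy.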

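Clearly, this structure first applies convolution and pooling to the image, and then upsamples the resulting feature maps, either by interpolation or by transposed convolution.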

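Suppose the input image is 512*512. The downsampling path turns it successively into feature maps of size 256*256, 128*128, 64*64, and 32*32. We then upsample the 32*32 feature map (or apply a transposed convolution to it) to obtain a 64*64 feature map, which is concatenated along the channel dimension with the earlier 64*64 feature map. Applying convolution and upsampling to the concatenated map yields a 128*128 feature map, and so on, until we obtain a 512*512 prediction with the same size as the input image.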
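The size progression described above can be traced with a few lines of PyTorch. This is only a sketch of the spatial bookkeeping: it assumes max pooling for downsampling and nearest-neighbor interpolation for upsampling, uses an arbitrary 2-channel input, and leaves out all convolutions, so the channel counts simply accumulate instead of following the real network.

```python
import torch
from torch.nn import functional as F

x = torch.randn(1, 2, 512, 512)  # 2 channels is an arbitrary choice for the demo

# Encoder: halve the spatial size four times, keeping every map for the skips
encoder_maps = [x]
for _ in range(4):
    encoder_maps.append(F.max_pool2d(encoder_maps[-1], kernel_size=2))
encoder_sizes = [m.shape[-1] for m in encoder_maps]
print(encoder_sizes)  # [512, 256, 128, 64, 32]

# Decoder: upsample by 2, then concatenate with the encoder map of the same size
out = encoder_maps[-1]
decoder_shapes = []
for skip in reversed(encoder_maps[:-1]):
    out = F.interpolate(out, scale_factor=2, mode='nearest')
    out = torch.cat((out, skip), dim=1)  # channels add up, H and W must match
    decoder_shapes.append(tuple(out.shape))
print(decoder_shapes)
```

The concatenation only works because each upsampled map has exactly the same height and width as the encoder map it is paired with.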

2. Code

import torch
from torch import nn
from torch.nn import functional as F


# Convolution block
class Conv_Block(nn.Module):
    '''Two 3x3 convolutions, each followed by BatchNorm, Dropout2d, and LeakyReLU.'''

    def __init__(self, in_channel, out_channel):
        super(Conv_Block, self).__init__()
        self.layer = nn.Sequential(
            nn.Conv2d(in_channel, out_channel, 3, 1, 1, padding_mode='reflect', bias=False),  # 3x3 convolution, stride 1, padding=1
            nn.BatchNorm2d(out_channel),
            nn.Dropout2d(0.3),  # during training, each whole channel is zeroed with probability 0.3
            nn.LeakyReLU(),

            nn.Conv2d(out_channel, out_channel, 3, 1, 1, padding_mode='reflect', bias=False),
            nn.BatchNorm2d(out_channel),
            nn.Dropout2d(0.3),
            nn.LeakyReLU()
        )

    def forward(self, x):
        return self.layer(x)


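# Downsampling: the original diagram uses max pooling, but a stride-2 convolution is used here instead, on the grounds that pooling discards more feature information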
class DownSample(nn.Module):
    def __init__(self, channel):
        super(DownSample, self).__init__()
        self.layer = nn.Sequential(
            nn.Conv2d(channel, channel, 3, 2, 1, padding_mode='reflect', bias=False),  # 3x3 convolution, stride 2, padding=1
            nn.BatchNorm2d(channel),
            nn.LeakyReLU()
        )

    def forward(self, x):
        return self.layer(x)


# Upsampling
class UpSample(nn.Module):
    def __init__(self, channel):
        super(UpSample, self).__init__()
        self.layer = nn.Conv2d(channel, channel // 2, 1, 1)  # 1x1 convolution, stride 1; halves the number of channels

    def forward(self, x, feature_map):
        # Nearest-neighbor interpolation doubles H and W
        up = F.interpolate(x, scale_factor=2, mode='nearest')
        out = self.layer(up)
        return torch.cat((out, feature_map), dim=1)  # concatenate along the channel dimension (NCHW)


class UNet(nn.Module):
    def __init__(self):
        super(UNet, self).__init__()
        self.c1 = Conv_Block(3, 64)
        self.d1 = DownSample(64)
        self.c2 = Conv_Block(64, 128)
        self.d2 = DownSample(128)
        self.c3 = Conv_Block(128, 256)
        self.d3 = DownSample(256)
        self.c4 = Conv_Block(256, 512)
        self.d4 = DownSample(512)
        self.c5 = Conv_Block(512, 1024)
        self.u1 = UpSample(1024)
        self.c6 = Conv_Block(1024, 512)
        self.u2 = UpSample(512)
        self.c7 = Conv_Block(512, 256)
        self.u3 = UpSample(256)
        self.c8 = Conv_Block(256, 128)
        self.u4 = UpSample(128)
        self.c9 = Conv_Block(128, 64)
        self.out = nn.Conv2d(64, 3, 3, 1, 1)  # 3 output channels (color image)
        self.Th = nn.Sigmoid()

    def forward(self, x):
        R1 = self.c1(x)
        R2 = self.c2(self.d1(R1))
        R3 = self.c3(self.d2(R2))
        R4 = self.c4(self.d3(R3))
        R5 = self.c5(self.d4(R4))
        e1 = self.c6(self.u1(R5, R4))
        e2 = self.c7(self.u2(e1, R3))
        e3 = self.c8(self.u3(e2, R2))
        e4 = self.c9(self.u4(e3, R1))

        return self.Th(self.out(e4))


if __name__ == '__main__':
    x = torch.randn(2, 3, 256, 256)
    net = UNet()
    print(net(x).shape)

Output:

torch.Size([2, 3, 256, 256])
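Two design choices in the listing above can be checked in isolation. The sketch below first compares max pooling with the stride-2 convolution used in DownSample, then follows the channel counts through one decoder step, mirroring UpSample(128) followed by Conv_Block(128, 64). The tensor sizes and variable names are illustrative assumptions, and the layers are re-created here rather than taken from a trained network.

```python
import torch
from torch import nn
from torch.nn import functional as F

x = torch.randn(1, 64, 32, 32)

# Option in the original diagram: parameter-free max pooling
pool = nn.MaxPool2d(kernel_size=2)
# Option used in DownSample: 3x3 convolution, stride 2, reflect padding
strided_conv = nn.Conv2d(64, 64, 3, 2, 1, padding_mode='reflect', bias=False)

pooled = pool(x)
convolved = strided_conv(x)
print(pooled.shape, convolved.shape)  # both halve H and W

# Pooling has nothing to learn; the convolution has 64 * 64 * 3 * 3 weights
pool_params = sum(p.numel() for p in pool.parameters())
conv_params = sum(p.numel() for p in strided_conv.parameters())
print(pool_params, conv_params)

# One decoder step: upsample, halve channels with a 1x1 conv, then concatenate
deep = torch.randn(1, 128, 16, 16)   # stands in for the deeper feature map
skip = torch.randn(1, 64, 32, 32)    # stands in for the matching encoder map
halve = nn.Conv2d(128, 64, 1, 1)     # 128 -> 64 channels
up = halve(F.interpolate(deep, scale_factor=2, mode='nearest'))
merged = torch.cat((up, skip), dim=1)  # 64 + 64 = 128 channels again
print(merged.shape)
```

The concatenation restores the channel count that the 1x1 convolution halved, which is why each Conv_Block in the decoder takes the same number of input channels as the UpSample before it.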