U-Net Model and Code Walkthrough

发呆的比目鱼

Last modified: 2023-04-24 10:37:21

First published: 2023-04-24 10:17:59

Original link: https://blog.csdn.net/weixin_40604528/article/details/121562771

Paper: http://www.arxiv.org/pdf/1505.04597.pdf
Code: https://github.com/jakeret/tf_unet

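U-Net is a U-shaped network proposed in 2015 that has since been widely used for image segmentation. It builds on FCN, and its U-shaped structure addresses FCN's difficulty in capturing contextual information and positional information at the same time.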

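Model

Backbone Structure

The left side is the feature-extraction network (encoder); the right side is the feature-fusion network (decoder).
High resolution -> encoding -> low resolution -> decoding -> high resolution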

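The first half is the encoder. Its job is feature extraction: it captures local features, much as an image-level classification network does, and produces abstract semantic features.
Each downsampling module consists of two 3x3 convolution layers (each followed by ReLU) plus one 2x2 max-pooling layer, and the input passes through four such modules.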
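The downsampling module described above can be sketched roughly as follows. This is a minimal illustration, not the code from the paper or from the repository linked above: the names `EncoderStage` and `encoder_shapes` are hypothetical, the channel widths 64/128/256/512 are borrowed from the code later in this post, and `padding=1` is an assumption made so that only the pooling changes the spatial size (the original paper uses unpadded convolutions).

```python
import torch
from torch import nn


class EncoderStage(nn.Module):
    """Two 3x3 conv + ReLU layers followed by 2x2 max pooling (hypothetical helper)."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.convs = nn.Sequential(
            # padding=1 is an assumption: it keeps height and width unchanged
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(kernel_size=2)  # halves height and width

    def forward(self, x):
        features = self.convs(x)  # kept for the skip connection to the decoder
        return features, self.pool(features)


def encoder_shapes(height=64, width=64):
    """Run four stages in a row and record the shape after each pooling step."""
    widths = [(3, 64), (64, 128), (128, 256), (256, 512)]
    stages = [EncoderStage(i, o) for i, o in widths]
    x = torch.randn(1, 3, height, width)
    shapes = []
    with torch.no_grad():
        for stage in stages:
            _, x = stage(x)
            shapes.append(tuple(x.shape))
    return shapes
```

Calling `encoder_shapes(64, 64)` shows the resolution halving at every stage while the channel count doubles, which is the "high resolution to low resolution" half of the U.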

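The second half uses the abstract features from the encoder to restore the original image size and finally produces the segmentation result (a mask image). It repeats one deconvolution (transposed convolution) layer, a feature concatenation (concat), and two 3x3 convolution layers (ReLU). This happens four times in total, mirroring the feature-extraction network. A final 1x1 convolution then reduces the number of channels to the required number, giving the target map.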
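One decoder step and the final 1x1 convolution might look like the sketch below. The names `DecoderStage` and `decode_once` are hypothetical, the 2x2 stride-2 transposed convolution and `padding=1` are assumptions chosen so the upsampled map lines up with the skip feature map, and the two-class output head is only an example value.

```python
import torch
from torch import nn


class DecoderStage(nn.Module):
    """Transposed conv -> concat with skip features -> two 3x3 conv + ReLU (hypothetical helper)."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        # kernel 2, stride 2 doubles height and width and reduces the channels
        self.up = nn.ConvTranspose2d(in_channels, out_channels, kernel_size=2, stride=2)
        self.convs = nn.Sequential(
            # after the concat the tensor has out_channels (skip) + out_channels (upsampled)
            nn.Conv2d(out_channels * 2, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)
        x = torch.cat((skip, x), dim=1)  # fuse along the channel dimension
        return self.convs(x)


def decode_once():
    """Apply one decoder stage and a 1x1 output head to random tensors."""
    stage = DecoderStage(128, 64)
    head = nn.Conv2d(64, 2, kernel_size=1)  # 1x1 conv: 64 channels -> 2 classes (example value)
    deep = torch.randn(1, 128, 16, 16)      # low-resolution decoder input
    skip = torch.randn(1, 64, 32, 32)       # matching encoder feature map
    with torch.no_grad():
        return tuple(head(stage(deep, skip)).shape)
```

The 1x1 head changes only the channel count, so the output keeps the 32x32 resolution of the skip feature map.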

FCN vs. U-Net: Feature Fusion Compared

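FCN fuses features by adding the corresponding pixel values of feature maps. Upsampling only the last feature map loses a lot of detail and blurs edges. With a skip structure, the prediction from the last layer (rich in global information) is combined by summation with predictions from shallower layers (richer in local detail), which recovers detail. U-Net, by contrast, concatenates the encoder feature map with the upsampled decoder feature map along the channel dimension (the concat step of the decoder).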
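The difference between the two fusion styles is easiest to see from tensor shapes. The snippet below is only an illustration with made-up tensors; `fuse_sum` and `fuse_concat` are hypothetical helper names, not functions from FCN or U-Net code.

```python
import torch


def fuse_sum(deep, shallow):
    """FCN-style fusion: element-wise addition, channel count unchanged."""
    return deep + shallow


def fuse_concat(deep, shallow):
    """U-Net-style fusion: concatenation along the channel dimension."""
    return torch.cat((deep, shallow), dim=1)


deep = torch.ones(1, 4, 8, 8)             # stands in for an upsampled deep feature map
shallow = torch.full((1, 4, 8, 8), 2.0)   # stands in for a shallow feature map

print(fuse_sum(deep, shallow).shape)      # torch.Size([1, 4, 8, 8])
print(fuse_concat(deep, shallow).shape)   # torch.Size([1, 8, 8, 8])
```

Summation requires both inputs to have the same number of channels and merges them into one value per position, while concatenation keeps both sets of features and leaves it to the following convolutions to decide how to combine them.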

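Upsampling

Three ways to upsample
(1) Bilinear interpolation: needs no learning, runs fast, and is simple to use
(2) Deconvolution: also called transposed convolution; zeros are padded around the border or between the input elements
(3) Unpooling: the coordinates of each element within its kernel are recorded during pooling and used as indices for unpooling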
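The three options above map onto standard PyTorch calls, as in the short sketch below. The 4x4 input and the layer sizes are arbitrary example values, and the transposed convolution uses randomly initialized weights, so only its output shape is meaningful here.

```python
import torch
from torch import nn
from torch.nn import functional as F

x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

with torch.no_grad():
    # (1) Bilinear interpolation: no learnable parameters
    up_bilinear = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)

    # (2) Transposed convolution: the kernel weights are learned during training
    deconv = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2)
    up_deconv = deconv(x)

    # (3) Unpooling: max pooling returns the index of each maximum,
    #     and MaxUnpool2d writes the values back at those positions (zeros elsewhere)
    pool = nn.MaxPool2d(kernel_size=2, return_indices=True)
    unpool = nn.MaxUnpool2d(kernel_size=2)
    pooled, indices = pool(x)
    restored = unpool(pooled, indices)

print(up_bilinear.shape)  # torch.Size([1, 1, 8, 8])
print(up_deconv.shape)    # torch.Size([1, 1, 8, 8])
print(restored.shape)     # torch.Size([1, 1, 4, 4])
```

Note that unpooling only undoes a pooling step it has indices for, so in this example it restores the 4x4 input size rather than producing an 8x8 map.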

Code

import torch
from torch import nn
from torch.nn import functional as F  # used for interpolation-based upsampling
 
class Conv_Block(nn.Module):
    def __init__(self,in_channel,out_channel):
        super(Conv_Block, self).__init__()
        self.layer=nn.Sequential(
            nn.Conv2d(in_channel,out_channel,3,1,1,padding_mode='reflect',bias=False),  # 3x3 convolution, stride 1, padding 1
            nn.BatchNorm2d(out_channel),
            nn.Dropout2d(0.3),
            nn.LeakyReLU(),
            nn.Conv2d(out_channel, out_channel, 3, 1, 1, padding_mode='reflect', bias=False),
            nn.BatchNorm2d(out_channel),
            nn.Dropout2d(0.3),
            nn.LeakyReLU()
        )
    def forward(self,x):
        return self.layer(x)
 
 
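class DownSample(nn.Module):  # downsampling (takes the place of pooling)
    def __init__(self,channel):
        super(DownSample, self).__init__()
        self.layer=nn.Sequential(  # sequential container
            nn.Conv2d(channel,channel,3,2,1,padding_mode='reflect',bias=False),  # stride-2 convolution instead of max pooling, which the author considers to discard too many features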
            nn.BatchNorm2d(channel),
            nn.LeakyReLU()
        )
    def forward(self,x):
        return self.layer(x)
 
 
class UpSample(nn.Module):  # upsampling
    def __init__(self,channel):
        super(UpSample, self).__init__()
        self.layer=nn.Conv2d(channel,channel//2,1,1)  # 1x1 convolution: only halves the channel count, no feature extraction
    def forward(self,x,feature_map):
        up=F.interpolate(x,scale_factor=2,mode='nearest')  # nearest-neighbor interpolation
        out=self.layer(up)
        return torch.cat((out,feature_map),dim=1)
 
 
class UNet(nn.Module):
    def __init__(self,num_classes):
        super(UNet, self).__init__()
        self.c1=Conv_Block(3,64)
        self.d1=DownSample(64)
        self.c2=Conv_Block(64,128)
        self.d2=DownSample(128)
        self.c3=Conv_Block(128,256)
        self.d3=DownSample(256)
        self.c4=Conv_Block(256,512)
        self.d4=DownSample(512)
        self.c5=Conv_Block(512,1024)
        self.u1=UpSample(1024)
        self.c6=Conv_Block(1024,512)
        self.u2 = UpSample(512)
        self.c7 = Conv_Block(512, 256)
        self.u3 = UpSample(256)
        self.c8 = Conv_Block(256, 128)
        self.u4 = UpSample(128)
        self.c9 = Conv_Block(128, 64)
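        self.out=nn.Conv2d(64,num_classes,3,1,1)  # output layer: a 3x3 convolution here (the structure described earlier uses 1x1) mapping 64 channels to num_classes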
 
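    def forward(self,x):
        # encoder: R1..R4 are kept for the skip connections, R5 is the bottleneck
        R1=self.c1(x)
        R2=self.c2(self.d1(R1))
        R3 = self.c3(self.d2(R2))
        R4 = self.c4(self.d3(R3))
        R5 = self.c5(self.d4(R4))
        # decoder: upsample, concatenate the matching encoder features, then convolve
        O1=self.c6(self.u1(R5,R4))
        O2 = self.c7(self.u2(O1, R3))
        O3 = self.c8(self.u3(O2, R2))
        O4 = self.c9(self.u4(O3, R1))
 
        return self.out(O4)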
 
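if __name__ == '__main__':
    x=torch.randn(2,3,256,256)
    net=UNet(num_classes=3)  # num_classes is a required argument; 3 is an arbitrary example value
    print(net(x).shape)  # torch.Size([2, 3, 256, 256])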
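
One practical consequence of this design is that the skip connections only line up for certain input sizes. The sketch below checks that with plain arithmetic, assuming the settings used in the code above: four stride-2 3x3 convolutions with padding 1 on the way down and four 2x upsampling steps on the way up. The helper names `conv_out` and `skip_sizes_match` are hypothetical, and passing this check is a necessary condition for the concatenations, not a guarantee that every layer accepts the input.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Standard convolution output-size formula along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def skip_sizes_match(size, depth=4):
    """Return True if every 2x-upsampled map has the same size as its skip feature map."""
    encoder_sizes = [size]
    for _ in range(depth):
        encoder_sizes.append(conv_out(encoder_sizes[-1]))
    current = encoder_sizes[-1]
    # walk back up: each upsampling doubles the size and must match the encoder map
    for skip in reversed(encoder_sizes[:-1]):
        current *= 2
        if current != skip:
            return False
    return True


print(skip_sizes_match(256))  # True
print(skip_sizes_match(250))  # False
```

For 250 the encoder sizes are 250, 125, 63, 32, 16, so the second upsampling produces 64 while the skip feature map is 63, and `torch.cat` would fail. Sizes divisible by 16, such as the 256 used in the test above, avoid this.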