U-Net Learning: Understanding Transposed Convolution (Formula Derivation)

U-Net Architecture

(U-Net architecture diagram: image omitted)

Implementation Code

import torch.nn as nn
import torch

# A simple wrapper around the common (Conv => BN => ReLU) * 2 block
class DoubleConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(DoubleConv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch), # BN layer added
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True)
        )

    def forward(self, input):
        return self.conv(input)

class Unet(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(Unet, self).__init__()
        self.conv1 = DoubleConv(in_ch, 64)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = DoubleConv(64, 128)
        self.pool2 = nn.MaxPool2d(2)
        self.conv3 = DoubleConv(128, 256)
        self.pool3 = nn.MaxPool2d(2)
        self.conv4 = DoubleConv(256, 512)
        self.pool4 = nn.MaxPool2d(2)
        self.conv5 = DoubleConv(512, 1024)
        # Transposed convolution; upsampling could also be used (keep kernel_size = stride; stride is the upsampling factor)
        self.up6 = nn.ConvTranspose2d(1024, 512, 2, stride=2)
        self.conv6 = DoubleConv(1024, 512)
        self.up7 = nn.ConvTranspose2d(512, 256, 2, stride=2)
        self.conv7 = DoubleConv(512, 256)
        self.up8 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.conv8 = DoubleConv(256, 128)
        self.up9 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.conv9 = DoubleConv(128, 64)
        self.conv10 = nn.Conv2d(64, out_ch, 1)

    def forward(self, x):
        c1 = self.conv1(x)
        p1 = self.pool1(c1)
        c2 = self.conv2(p1)
        p2 = self.pool2(c2)
        c3 = self.conv3(p2)
        p3 = self.pool3(c3)
        c4 = self.conv4(p3)
        p4 = self.pool4(c4)
        c5 = self.conv5(p4)
        up_6 = self.up6(c5)
        merge6 = torch.cat([up_6, c4], dim=1)
        c6 = self.conv6(merge6)
        up_7 = self.up7(c6)
        merge7 = torch.cat([up_7, c3], dim=1)
        c7 = self.conv7(merge7)
        up_8 = self.up8(c7)
        merge8 = torch.cat([up_8, c2], dim=1)
        c8 = self.conv8(merge8)
        up_9 = self.up9(c8)
        merge9 = torch.cat([up_9, c1], dim=1)
        c9 = self.conv9(merge9)
        c10 = self.conv10(c9)
        out = torch.sigmoid(c10)
        return out
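
A quick smoke test (a minimal sketch; the 3-channel 256x256 input is an arbitrary choice, picked as a multiple of 16 so the four 2x2 poolings divide evenly):

net = Unet(in_ch=3, out_ch=1)
x = torch.randn(1, 3, 256, 256)   # (N, C, H, W)
y = net(x)
print(y.shape)  # torch.Size([1, 1, 256, 256]) -- same spatial size as the input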

Understanding Transposed Convolution: nn.ConvTranspose2d

Function Prototype

torch.nn.ConvTranspose2d(in_channels, out_channels, 
	kernel_size, stride=1, padding=0, output_padding=0, 
	groups=1, bias=True, dilation=1, padding_mode='zeros', 
	device=None, dtype=None)

Description

Applies a 2D transposed convolution operator over an input image composed of several input planes (channels). This module can be seen as the gradient of Conv2d with respect to its input. It is also known as a fractionally-strided convolution or a deconvolution (although it is not an actual deconvolution operation). To understand transposed convolution, it helps to start from regular convolution.
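
The "gradient of Conv2d" statement can be checked numerically (a short sketch; all tensor sizes here are arbitrary choices):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 3, 8, 8, requires_grad=True)
w = torch.randn(6, 3, 2, 2)  # Conv2d weight layout: (out_ch, in_ch, kH, kW)

y = F.conv2d(x, w, stride=2)  # 8x8 -> 4x4
g = torch.randn_like(y)       # a stand-in for the upstream gradient dL/dy
y.backward(g)

# The gradient of conv2d w.r.t. its input is a transposed convolution of the
# upstream gradient with the same weight and hyper-parameters.
print(torch.allclose(x.grad, F.conv_transpose2d(g, w, stride=2), atol=1e-5))  # True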

The Regular Convolution Operation

Function prototype:

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, 
groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
  • Input: $(N, C_{in}, H_{in}, W_{in})$
  • Output: $(N, C_{out}, H_{out}, W_{out})$

where:
$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1}{stride[0]} + 1 \right\rfloor$$
Rewriting without the floor:
$$H_{out} = \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1}{stride[0]} + 1 - \alpha$$
where $\alpha$ is the fractional part removed by rounding down, $\alpha \in [0, 1)$, and $\alpha \times stride[0]$ is a non-negative integer.
Then:
$$H_{in} = (H_{out} - 1) \times stride[0] - 2 \times padding[0] + dilation[0] \times (kernel\_size[0] - 1) + 1 + \alpha \times stride[0]$$
For example, with $H_{in} = 5$, $kernel\_size = 2$, $stride = 2$, $padding = 0$, $dilation = 1$: $H_{out} = \lfloor 1.5 + 1 \rfloor = 2$, $\alpha = 0.5$, and indeed $(2 - 1) \times 2 + (2 - 1) + 1 + 0.5 \times 2 = 5$.
This formula recovers the input size from the output size of a regular convolution, and a transposed convolution is exactly the operation that maps a regular convolution's output back to an input of the original size.
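
A quick check of the output-size formula (a sketch; the layer and input sizes are arbitrary):

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)
x = torch.randn(1, 3, 12, 12)
# (12 + 2*1 - 1*(3-1) - 1)/2 + 1 = 6.5, rounded down to 6
print(conv(x).shape)  # torch.Size([1, 8, 6, 6])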

For nn.ConvTranspose2d:

  • Input: $(N, C_{in}, H_{in}, W_{in})$
  • Output: $(N, C_{out}, H_{out}, W_{out})$

where:
$$H_{out} = (H_{in} - 1) \times stride[0] - 2 \times padding[0] + dilation[0] \times (kernel\_size[0] - 1) + 1 + output\_padding[0]$$
$$W_{out} = (W_{in} - 1) \times stride[1] - 2 \times padding[1] + dilation[1] \times (kernel\_size[1] - 1) + 1 + output\_padding[1]$$

This simply replaces $\alpha \times stride[0]$ with $output\_padding$, which also explains where $output\_padding$ comes from: a regular convolution sometimes cannot use all of the input, leaving unused rows and columns at the border, e.g. input = 5x5, padding = 0, kernel_size = 2x2, stride = 2. Several input sizes then map to the same output size, so when the transposed convolution maps back, $output\_padding$ resolves this size ambiguity and lets us produce a feature map of the original size. Of course, if the convolution uses $stride[0] = 1$, then $\alpha = 0$ and there is no $output\_padding$ to worry about.
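
The 5x5 edge case above can be reproduced directly (a short sketch):

import torch
import torch.nn as nn

x = torch.randn(1, 1, 5, 5)
down = nn.Conv2d(1, 1, kernel_size=2, stride=2)   # 5x5 -> 2x2; the last row/column is never used
up = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2)                         # 2x2 -> 4x4, not 5x5
up_fix = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, output_padding=1)   # 2x2 -> 5x5

y = down(x)
print(y.shape)          # torch.Size([1, 1, 2, 2])
print(up(y).shape)      # torch.Size([1, 1, 4, 4])
print(up_fix(y).shape)  # torch.Size([1, 1, 5, 5])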

Comparing with the Convolution Formula: Rewriting the Transposed-Convolution Formula (with $dilation = 1$)

$$\begin{aligned} H_{out} &= \left\lfloor \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1}{stride[0]} + 1 \right\rfloor \\ &= \left\lfloor \frac{H_{in} + 2 \times padding[0] - kernel\_size[0]}{stride[0]} + 1 \right\rfloor \end{aligned}$$
$$\begin{aligned} H_{out} &= (H_{in} - 1) \times stride[0] - 2 \times padding[0] + dilation[0] \times (kernel\_size[0] - 1) + 1 + output\_padding[0] \\ &= H_{in} \times stride[0] - stride[0] - 2 \times padding[0] + kernel\_size[0] + output\_padding[0] \\ &= \left\lfloor \frac{H_{in} + (H_{in} - 1) \times (stride[0] - 1) + 2 \times (kernel\_size[0] - padding[0] - 1) - kernel\_size[0]}{1} + 1 \right\rfloor + output\_padding[0] \end{aligned}$$
From the last line, a transposed convolution is equivalent to a regular convolution over a new feature map: the term $H_{in} + (H_{in} - 1) \times (stride[0] - 1)$ corresponds to inserting $stride[0] - 1$ rows (columns) of zeros between every two adjacent rows (columns), and the new feature map is then convolved with the same $kernel\_size[0]$, with $stride_{new}[0] = 1$ and $padding_{new}[0] = kernel\_size[0] - padding[0] - 1$; finally $output\_padding[0]$ is added to get the result. This is in fact how the operation is carried out, as illustrated in the figure below.
(Figure: a transposed convolution realized as zero-insertion followed by a regular convolution; image omitted)

Steps of a Transposed Convolution

  • Step 1: transform the input feature map a (insert the zeros described above) to obtain a new feature map.
  • Step 2: derive the new convolution settings: the same kernel size, a stride of 1, and a padding of kernel_size - padding - 1.
  • Step 3: run a regular convolution with the new settings over the new feature map; the result is the result of the transposed convolution, i.e. exactly what we want.

In other words, the final result is still obtained through a regular convolution; the sketch below verifies this numerically.
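
A verification of the three steps (a short sketch; the sizes are arbitrary, and note that because Conv2d actually computes a cross-correlation, emulating ConvTranspose2d with F.conv2d additionally requires flipping the kernel spatially and swapping its channel axes):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, C_in, C_out, H, k, s, p = 1, 3, 2, 5, 3, 2, 1
x = torch.randn(N, C_in, H, H)
w = torch.randn(C_in, C_out, k, k)  # ConvTranspose2d weight layout: (in_ch, out_ch, kH, kW)

ref = F.conv_transpose2d(x, w, stride=s, padding=p)

# Step 1: insert (stride - 1) zeros between adjacent rows and columns.
H_new = H + (H - 1) * (s - 1)
x_new = torch.zeros(N, C_in, H_new, H_new)
x_new[:, :, ::s, ::s] = x

# Step 2: new settings: same kernel_size, stride_new = 1,
# padding_new = kernel_size - padding - 1; plus the kernel flip and
# channel-axis swap needed because Conv2d is a cross-correlation.
w_new = w.transpose(0, 1).flip(-1, -2)

# Step 3: a regular convolution over the new feature map.
out = F.conv2d(x_new, w_new, stride=1, padding=k - p - 1)
print(torch.allclose(ref, out, atol=1e-5))  # True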

Explaining the Transposed-Convolution Parameters in the U-Net Above

The DoubleConv used above is a two-layer convolution block, defined as:

self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch), # BN layer added
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True)
        )

Neither nn.BatchNorm2d nor nn.ReLU changes the tensor shape. From the convolution formula, substituting $stride[0] = 1$ and $dilation[0] = 1$ as used in the code above:

$$\begin{aligned} H_{out} &= \left\lfloor \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1}{stride[0]} + 1 \right\rfloor \\ &= \lfloor H_{in} + 2 \times padding[0] - (kernel\_size[0] - 1) \rfloor \\ &= H_{in} + 2 \times padding[0] - kernel\_size[0] + 1 \end{aligned}$$
So the result of the two convolution layers together is:
$$\begin{aligned} H_{out} &= (H_{in} + 2 \times padding_1[0] - kernel\_size_1[0] + 1) + 2 \times padding_2[0] - kernel\_size_2[0] + 1 \\ &= H_{in} + 2 \times (padding_1[0] + padding_2[0]) - kernel\_size_1[0] - kernel\_size_2[0] + 2 \end{aligned}$$
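
With $kernel\_size_1 = kernel\_size_2 = 3$ and $padding_1 = padding_2 = 1$ as in the code, this gives $H_{out} = H_{in}$: DoubleConv preserves the spatial size. A quick check (reusing the DoubleConv class defined earlier; the input size is arbitrary):

import torch

block = DoubleConv(3, 64)      # two 3x3 convolutions, each with padding 1
x = torch.randn(1, 3, 64, 64)
print(block(x).shape)          # torch.Size([1, 64, 64, 64]) -- spatial size unchanged
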
The pooling layer (whose shape transformation follows the same arithmetic as a convolution) has the prototype:

torch.nn.MaxPool2d(kernel_size, stride=None, padding=0,
dilation=1, return_indices=False,
ceil_mode=False)

where stride defaults to kernel_size when left as None. Substituting stride = kernel_size, padding = 0, dilation = 1, and writing the pooling kernel as $kernel\_size_3$:

$$\begin{aligned} H_{out} &= \left\lfloor \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size_3[0] - 1) - 1}{kernel\_size_3[0]} + 1 \right\rfloor \\ &= \left\lfloor \frac{H_{in} - kernel\_size_3[0]}{kernel\_size_3[0]} + 1 \right\rfloor \\ &= \left\lfloor \frac{H_{in}}{kernel\_size_3[0]} \right\rfloor \end{aligned}$$
So the overall result of DoubleConv followed by pooling is:
$$\begin{aligned} H_{out} &= \left\lfloor \frac{H_{in} + 2 \times (padding_1[0] + padding_2[0]) - kernel\_size_1[0] - kernel\_size_2[0] + 2}{kernel\_size_3[0]} \right\rfloor \\ &= \left\lfloor \frac{H_{in} + 2 \times 2 - 6 + 2}{2} \right\rfloor \\ &= \left\lfloor \frac{H_{in} - 2}{2} + 1 \right\rfloor \\ &= \left\lfloor \frac{H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1) - 1}{stride[0]} + 1 \right\rfloor \\ &= \left\lfloor \frac{H_{in} - kernel\_size[0]}{stride[0]} + 1 \right\rfloor \end{aligned}$$
Matching the last two lines gives $stride = 2$ and $kernel\_size = 2$, which is why the transposed convolutions in the U-Net above use kernel_size = 2 and stride = 2. Other combinations are possible as well; for convenience we set $padding = 0$ and $dilation = 1$ here. The channel counts are straightforward to compute, so they are not listed.
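
As a round-trip check (a short sketch with arbitrary sizes), the 2x2 max-pooling halves the spatial size, and a ConvTranspose2d with kernel_size = 2 and stride = 2 maps it back:

import torch
import torch.nn as nn

x = torch.randn(1, 64, 64, 64)
pool = nn.MaxPool2d(2)                        # 64x64 -> 32x32
up = nn.ConvTranspose2d(64, 64, 2, stride=2)  # 32x32 -> 64x64
print(pool(x).shape)      # torch.Size([1, 64, 32, 32])
print(up(pool(x)).shape)  # torch.Size([1, 64, 64, 64])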
