Unet structure diagram
Structure diagram:
Implementation code
import torch
import torch.nn as nn

# A simple wrapper around the commonly used pair of convolutions
class DoubleConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(DoubleConv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),  # BN layer added
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True)
        )

    def forward(self, input):
        return self.conv(input)

class Unet(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(Unet, self).__init__()
        self.conv1 = DoubleConv(in_ch, 64)
        self.pool1 = nn.MaxPool2d(2)
        self.conv2 = DoubleConv(64, 128)
        self.pool2 = nn.MaxPool2d(2)
        self.conv3 = DoubleConv(128, 256)
        self.pool3 = nn.MaxPool2d(2)
        self.conv4 = DoubleConv(256, 512)
        self.pool4 = nn.MaxPool2d(2)
        self.conv5 = DoubleConv(512, 1024)
        # Transposed convolution; upsampling also works (keep k = stride, where stride is the upsampling factor)
        self.up6 = nn.ConvTranspose2d(1024, 512, 2, stride=2)
        self.conv6 = DoubleConv(1024, 512)
        self.up7 = nn.ConvTranspose2d(512, 256, 2, stride=2)
        self.conv7 = DoubleConv(512, 256)
        self.up8 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.conv8 = DoubleConv(256, 128)
        self.up9 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.conv9 = DoubleConv(128, 64)
        self.conv10 = nn.Conv2d(64, out_ch, 1)

    def forward(self, x):
        c1 = self.conv1(x)
        p1 = self.pool1(c1)
        c2 = self.conv2(p1)
        p2 = self.pool2(c2)
        c3 = self.conv3(p2)
        p3 = self.pool3(c3)
        c4 = self.conv4(p3)
        p4 = self.pool4(c4)
        c5 = self.conv5(p4)
        up_6 = self.up6(c5)
        merge6 = torch.cat([up_6, c4], dim=1)  # skip connection: concatenate along channels
        c6 = self.conv6(merge6)
        up_7 = self.up7(c6)
        merge7 = torch.cat([up_7, c3], dim=1)
        c7 = self.conv7(merge7)
        up_8 = self.up8(c7)
        merge8 = torch.cat([up_8, c2], dim=1)
        c8 = self.conv8(merge8)
        up_9 = self.up9(c8)
        merge9 = torch.cat([up_9, c1], dim=1)
        c9 = self.conv9(merge9)
        c10 = self.conv10(c9)
        out = torch.sigmoid(c10)
        return out
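A minimal shape check of the network above (a sketch; the 1x3x256x256 input size is illustrative, chosen so each of the four poolings halves it cleanly):

net = Unet(in_ch=3, out_ch=1)
x = torch.randn(1, 3, 256, 256)  # N, C, H, W
y = net(x)
print(y.shape)  # torch.Size([1, 1, 256, 256]) -- same spatial size as the input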
Understanding transposed convolution: nn.ConvTranspose2d
Function prototype
torch.nn.ConvTranspose2d(in_channels, out_channels,
kernel_size, stride=1, padding=0, output_padding=0,
groups=1, bias=True, dilation=1, padding_mode='zeros',
device=None, dtype=None)
Description
Applies a 2D transposed convolution operator over an input image composed of several input planes (channels). This module can be seen as the gradient of Conv2d with respect to its input. It is also known as a fractionally-strided convolution or a deconvolution (although it is not an actual deconvolution operation). To understand transposed convolution, we need to start from regular convolution.
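A small autograd check of the "gradient of Conv2d" statement (a sketch; the sizes are illustrative, and 7x7 is chosen so the shapes round-trip exactly without output_padding):

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 7, 7, requires_grad=True)
w = torch.randn(1, 1, 3, 3)
y = F.conv2d(x, w, stride=2, padding=1)  # 7x7 -> 4x4
g = torch.randn_like(y)                  # stand-in for the upstream gradient
y.backward(g)

# The gradient w.r.t. the input is exactly a transposed convolution of g
gx = F.conv_transpose2d(g, w, stride=2, padding=1)
print(torch.allclose(x.grad, gx, atol=1e-5))  # True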
Regular convolution
Function prototype:
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1,
groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)
- Input: $(N, C_{in}, H_{in}, W_{in})$
- Output: $(N, C_{out}, H_{out}, W_{out})$
where:
H_{out} = \lfloor \frac{ H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1 ) - 1 }{stride[0]} + 1 \rfloor
Rearranging:
H_{out} = \frac{ H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1 ) - 1 }{stride[0]} + 1 - \alpha
where $\alpha$ is the fractional part discarded by the floor operation, $\alpha \in [0, 1)$, and $\alpha \times stride[0]$ is a non-negative integer.
Then:
H_{in} = ( H_{out} - 1 ) \times stride[0] - 2 \times padding[0] + dilation[0] \times (kernel\_size[0] - 1 ) + 1 + \alpha \times stride[0]
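A worked instance (values chosen to match the 5x5 example used below): with $H_{in} = 5$, $kernel\_size[0] = 2$, $stride[0] = 2$, $padding[0] = 0$, $dilation[0] = 1$, the forward formula gives $H_{out} = \lfloor 3/2 + 1 \rfloor = 2$, so $\alpha = 0.5$; substituting back, $H_{in} = (2 - 1) \times 2 - 0 + 1 \times (2 - 1) + 1 + 0.5 \times 2 = 5$.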
The formula above recovers a regular convolution's input size from its output size; likewise, transposed convolution recovers an input-shaped result from a convolution's output. For nn.ConvTranspose2d:
- Input: $(N, C_{in}, H_{in}, W_{in})$
- Output: $(N, C_{out}, H_{out}, W_{out})$
where:
H_{out} = ( H_{in} - 1 ) \times stride[0] - 2 \times padding[0] + dilation[0] \times (kernel\_size[0] - 1 ) + 1 + output\_padding[0]
W_{out} = ( W_{in} - 1 ) \times stride[1] - 2 \times padding[1] + dilation[1] \times (kernel\_size[1] - 1 ) + 1 + output\_padding[1]
This merely replaces $\alpha \times stride[0]$ with $output\_padding$, which also explains where $output\_padding$ comes from: a regular convolution sometimes cannot cover the whole input, leaving unused border pixels, e.g. input=5x5, padding=0, kernel_size=2x2, stride=2. Several input sizes then map to the same output size, and $output\_padding$ resolves this ambiguity when mapping back through the transposed convolution. Of course, if the convolution uses $stride[0] = 1$, there is no such ambiguity and no need for $output\_padding$ at all!
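A quick numerical check of this ambiguity (a sketch; the layer parameters mirror the 5x5 example above):

import torch
import torch.nn as nn

# 5x5 input with k=2, s=2: the last row/column is never covered
x = torch.randn(1, 1, 5, 5)
y = nn.Conv2d(1, 1, kernel_size=2, stride=2)(x)
print(y.shape)  # torch.Size([1, 1, 2, 2]) -- a 4x4 input gives the same shape

# Inverting without output_padding yields 4x4 ...
up0 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2)
print(up0(y).shape)  # torch.Size([1, 1, 4, 4])

# ... while output_padding=1 recovers the original 5x5
up1 = nn.ConvTranspose2d(1, 1, kernel_size=2, stride=2, output_padding=1)
print(up1(y).shape)  # torch.Size([1, 1, 5, 5])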
Comparing with the convolution formula by rearranging the transposed-convolution formula (with $dilation = 1$):
\begin{aligned} H_{out} & = \lfloor \frac{ H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1 ) - 1 }{stride[0]} + 1 \rfloor \\ & = \lfloor \frac{ H_{in} + 2 \times padding[0] - kernel\_size[0] }{ stride[0] } + 1 \rfloor \\ \end{aligned}
\begin{aligned} H_{out} & = ( H_{in} - 1 ) \times stride[0] - 2 \times padding[0] + dilation[0] \times (kernel\_size[0] - 1 ) + 1 + output\_padding[0] \\ & = H_{in} \times stride[0] - stride[0] - 2 \times padding[0] + kernel\_size[0] + output\_padding[0] \\ & = \lfloor \frac{ H_{in} + ( H_{in} - 1 ) \times ( stride[0] - 1 ) + 2 \times ( kernel\_size[0] - padding[0] - 1 ) - kernel\_size[0] }{ 1 } + 1 \rfloor + output\_padding[0] \\ \end{aligned}
From this last form, a transposed convolution is equivalent to a regular convolution applied to a new feature map: $H_{in} + ( H_{in} - 1 ) \times ( stride[0] - 1 )$ means inserting $stride[0] - 1$ rows (columns) of zeros between every two adjacent rows (columns), then convolving the new feature map with $kernel\_size[0]$ unchanged, $stride_{new}[0] = 1$, and $padding_{new}[0] = kernel\_size[0] - padding[0] - 1$, and finally adding $output\_padding[0]$ to obtain the result. This is in fact how it is computed in practice, as illustrated below.
Steps of the transposed-convolution operation
- Step 1: transform the input feature map a (insert zeros between its rows and columns) to obtain a new feature map.
- Step 2: derive the new kernel settings ($stride_{new} = 1$, $padding_{new} = kernel\_size - padding - 1$).
- Step 3: run a regular convolution with the new kernel settings on the new feature map; the result is the transposed-convolution output we want.
In other words, the result is ultimately still obtained through a regular convolution, as the sketch below demonstrates.
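A minimal numerical sketch of these three steps (single channel, stride 2, kernel 2, illustrative values; note the kernel must also be flipped, because PyTorch's convolution is actually cross-correlation):

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 3, 3)
w = torch.randn(1, 1, 2, 2)  # (in_ch, out_ch, kH, kW) layout for conv_transpose2d
stride, padding = 2, 0

# Reference result: PyTorch's transposed convolution
ref = F.conv_transpose2d(x, w, stride=stride, padding=padding)

# Step 1: insert (stride - 1) zeros between adjacent rows/columns
up = torch.zeros(1, 1, (3 - 1) * stride + 1, (3 - 1) * stride + 1)
up[:, :, ::stride, ::stride] = x

# Steps 2-3: regular convolution, stride 1, padding = k - p - 1, flipped kernel
w_flip = torch.flip(w, dims=[2, 3]).transpose(0, 1)  # flip and swap channel axes
out = F.conv2d(up, w_flip, stride=1, padding=2 - padding - 1)

print(torch.allclose(ref, out, atol=1e-6))  # True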
Explanation of the transposed-convolution parameters in the Unet above
The DoubleConv above is a two-layer convolution block, defined as follows:
self.conv = nn.Sequential(
    nn.Conv2d(in_ch, out_ch, 3, padding=1),
    nn.BatchNorm2d(out_ch),  # BN layer added
    nn.ReLU(inplace=True),
    nn.Conv2d(out_ch, out_ch, 3, padding=1),
    nn.BatchNorm2d(out_ch),
    nn.ReLU(inplace=True)
)
Here nn.BatchNorm2d and nn.ReLU do not change the tensor shape. From the convolution formula (with the defaults $stride[0] = 1$, $dilation[0] = 1$):
\begin{aligned} H_{out} & = \lfloor \frac{ H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1 ) - 1 }{stride[0]} + 1 \rfloor \\ & = \lfloor H_{in} + 2 \times padding[0] - (kernel\_size[0] - 1 ) \rfloor \\ & = H_{in} + 2 \times padding[0] - kernel\_size[0] + 1 \\ \end{aligned}
So the result of the two stacked convolutions is:
\begin{aligned} H_{out} & = ( H_{in} + 2 \times padding_1[0] - kernel\_size_1[0] + 1 ) + 2 \times padding_2[0] - kernel\_size_2[0] + 1 \\ & = H_{in} + 2 \times ( padding_1[0] + padding_2[0] ) - kernel\_size_1[0] - kernel\_size_2[0] + 2 \\ \end{aligned}
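Plugging in DoubleConv's actual values ($padding_1[0] = padding_2[0] = 1$, $kernel\_size_1[0] = kernel\_size_2[0] = 3$) gives $H_{out} = H_{in} + 4 - 6 + 2 = H_{in}$: DoubleConv preserves the spatial size.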
The pooling layer (its shape transformation is equivalent to that of a convolution) has the prototype:
torch.nn.MaxPool2d(kernel_size, stride=None, padding=0,
                   dilation=1, return_indices=False,
                   ceil_mode=False)
where stride defaults to kernel_size. With the defaults $padding = 0$ and $dilation = 1$, and writing $kernel\_size_3$ for the pooling kernel size:
\begin{aligned} H_{out} & = \lfloor \frac{ H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1 ) - 1 }{ kernel\_size[0] } + 1 \rfloor \\ & = \lfloor \frac{ H_{in} - kernel\_size_3[0] }{ kernel\_size_3[0] } + 1 \rfloor \\ & = \lfloor \frac{ H_{in} }{ kernel\_size_3[0] } \rfloor \\ \end{aligned}
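The final $\lfloor H_{in} / kernel\_size_3[0] \rfloor$ means odd sizes lose a row/column, which is easy to verify (a sketch; sizes are illustrative):

import torch
import torch.nn as nn

pool = nn.MaxPool2d(2)
print(pool(torch.randn(1, 1, 8, 8)).shape)  # torch.Size([1, 1, 4, 4])
print(pool(torch.randn(1, 1, 7, 7)).shape)  # torch.Size([1, 1, 3, 3]) -- floor(7/2)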
So for one DoubleConv block followed by its pooling layer, the result is:
\begin{aligned} H_{out} & = \lfloor \frac{ H_{in} + 2 \times ( padding_1[0] + padding_2[0] ) - kernel\_size_1[0] - kernel\_size_2[0] + 2 }{ kernel\_size_3[0] } \rfloor \\ & = \lfloor \frac{ H_{in} + 2 \times 2 - 6 + 2 }{ 2 } \rfloor \\ & = \lfloor \frac{ H_{in} - 2 }{ 2 } + 1 \rfloor \\ & = \lfloor \frac{ H_{in} + 2 \times padding[0] - dilation[0] \times (kernel\_size[0] - 1 ) - 1 }{stride[0]} + 1 \rfloor \\ & = \lfloor \frac{ H_{in} - kernel\_size[0] }{ stride[0] } + 1 \rfloor \\ \end{aligned}
So $stride = 2$ and $kernel\_size = 2$. Other combinations are possible as well; here, for simplicity, we took $padding = 0$ and $dilation = 1$. The channel counts are straightforward to compute, so they are not listed here.
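A quick check tying this together (a sketch reusing the DoubleConv class defined above; sizes are illustrative):

import torch
import torch.nn as nn

x = torch.randn(1, 64, 48, 48)
conv = DoubleConv(64, 128)                     # preserves spatial size: 48x48
pool = nn.MaxPool2d(2)                         # halves it: 24x24
up = nn.ConvTranspose2d(128, 64, 2, stride=2)  # doubles it back: 48x48

h = pool(conv(x))
print(h.shape)      # torch.Size([1, 128, 24, 24])
print(up(h).shape)  # torch.Size([1, 64, 48, 48])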