every blog every motto: Stay hungry, stay foolish.
0. Preface
This post covers the size of the output feature map of a convolution under padding=same and padding=valid, and the size of the feature map after pooling.
1. Main text
1. Convolution
Note: when the division is not exact, convolution rounds down!!!!
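For the basics of the convolution operation, see "Convolution and computing the feature map size after convolution" (reference 1).
Parameter definitions:
Input size: inputH
Kernel size: K
Stride: S
Padding: P
Output size: outputH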
1.1 padding=valid
No padding is applied, so P=0, and the size of the output feature map is:
$$outputH = \frac{inputH - K + 2 \times 0}{S} + 1$$
That is:
$$outputH = \frac{inputH - K}{S} + 1$$
When the division is not exact, the quotient is rounded down.
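As a quick check of the padding=valid formula, here is a minimal Python sketch. The function name conv_output_valid is hypothetical and chosen only for illustration; it is not part of Keras or TensorFlow.

```python
def conv_output_valid(input_h, k, s=1):
    """Output height of a convolution with padding='valid' (P = 0).

    Hypothetical helper for illustration: floor((input_h - k) / s) + 1.
    """
    if k > input_h:
        raise ValueError("kernel is larger than the input")
    # Integer division (//) rounds down, which matches the rule for convolution.
    return (input_h - k) // s + 1


print(conv_output_valid(416, 3, 1))  # 414
print(conv_output_valid(7, 3, 2))    # (7 - 3) // 2 + 1 = 3
print(conv_output_valid(8, 3, 2))    # (8 - 3) // 2 + 1 = 3, because 5 / 2 is rounded down
```

The last two calls show the rounding rule in action: inputs of 7 and 8 give the same output size once the stride no longer divides evenly.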
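1.2 padding=same
The amount of padding depends on the kernel size:
K=1, P=0;
K=3, P=1;
K=5, P=2;
and so on, that is, P=(K-1)/2 for odd K, so that with S=1 the output size equals the input size.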
$$outputH = \frac{inputH - K + 2P}{S} + 1$$
When the division is not exact, the quotient is rounded down.
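The padding=same rule can be checked the same way. The sketch below is only an illustration under the assumption of symmetric padding and an odd kernel size; same_padding and conv_output are hypothetical names, not library functions.

```python
import math


def same_padding(k):
    """Per-side padding that keeps the size unchanged at stride 1 (odd k only)."""
    if k % 2 == 0:
        raise ValueError("this sketch only handles odd kernel sizes")
    return (k - 1) // 2


def conv_output(input_h, k, s=1, p=0):
    """General formula: floor((input_h - k + 2p) / s) + 1."""
    return (input_h - k + 2 * p) // s + 1


for k in (1, 3, 5, 7):
    p = same_padding(k)
    # Prints the kernel size, its padding, and an output height that stays 416.
    print(k, p, conv_output(416, k, s=1, p=p))

# With a stride of 2 and an odd kernel, the result equals ceil(input_h / 2).
print(conv_output(7, 3, s=2, p=same_padding(3)))  # 4
print(math.ceil(7 / 2))                           # 4
```

Note that K=5 needs P=2 here; any larger padding would make the output bigger than the input at stride 1.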
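2. Pooling
Note: when the division is not exact, pooling rounds up!!!! This depends on the framework: Caffe rounds up, whereas Keras MaxPooling2D with its default padding='valid' rounds down. When the division is exact, as in the exercises below, both give the same result.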
Pooling uses no padding; the formula is:
$$outputH = \frac{inputH - K}{S} + 1$$
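To see how the rounding direction affects the pooling formula, here is a small sketch. The function pool_output and its round_up flag are hypothetical and exist only for this illustration; which direction applies in practice depends on the framework and its settings.

```python
def pool_output(input_h, k, s, round_up=False):
    """Output height of an unpadded pooling layer.

    round_up is a hypothetical flag for this sketch: True rounds
    (input_h - k) / s up, False rounds it down, before adding 1.
    """
    excess = input_h - k
    if round_up:
        quotient = -(-excess // s)  # integer ceiling division
    else:
        quotient = excess // s      # integer floor division
    return quotient + 1


print(pool_output(416, 2, 2))                # 208, the division is exact
print(pool_output(7, 2, 2, round_up=False))  # floor(2.5) + 1 = 3
print(pool_output(7, 2, 2, round_up=True))   # ceil(2.5) + 1 = 4
```

The two rounding modes only disagree when (inputH - K) is not a multiple of S, which is why even sizes such as 416 and 208 are unaffected.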
3. Exercises
Part of the VGG code, based on TensorFlow 1.x
3.1 Exercise 1
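# Assumed import; with tf.keras, import from tensorflow.keras.layers instead
from keras.layers import Input, Conv2D, MaxPooling2D
# Assumed input tensor, sized to match the shape comment below
img_input = Input(shape=(416, 416, 3))
# 416,416,3 -> 208,208,64
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
y = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
# Pool the output of the second convolution (y), not the first (x)
z = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(y)
f1 = z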
Note: in Keras, the default stride of Conv2D is 1; see reference 1 for the parameters.
Substituting into the formula:
First convolution:
$$XoutputH = \frac{416 - 3 + 2}{1} + 1 = 416$$
Second convolution:
$$YoutputH = \frac{416 - 3 + 2}{1} + 1 = 416$$
Pooling:
$$ZoutputH = \frac{416 - 2}{2} + 1 = 208$$
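So far, the feature map has gone from 416 * 416 * 3 ⇒ 208 * 208 * 64 (64 is the number of convolution kernels, i.e. the first argument of Conv2D; see reference 2 for details).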
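The three steps of this exercise can also be traced in plain Python, without building the Keras model. This is only a sketch of the arithmetic; conv_same_3x3 and max_pool_2x2 are hypothetical helper names, and shapes are written as (H, W, C).

```python
def conv_same_3x3(shape, filters, s=1):
    """3x3 convolution with padding='same' (P = 1); returns the new (H, W, C)."""
    h, w, _ = shape

    def out(n):
        return (n - 3 + 2 * 1) // s + 1

    # The channel count becomes the number of kernels (filters).
    return (out(h), out(w), filters)


def max_pool_2x2(shape, s=2):
    """2x2 max pooling without padding; the channel count is unchanged."""
    h, w, c = shape

    def out(n):
        return (n - 2) // s + 1

    return (out(h), out(w), c)


shape = (416, 416, 3)
shape = conv_same_3x3(shape, 64)
print(shape)  # (416, 416, 64)
shape = conv_same_3x3(shape, 64)
print(shape)  # (416, 416, 64)
shape = max_pool_2x2(shape)
print(shape)  # (208, 208, 64)
```

Only the pooling step changes the height and width, and only the first convolution changes the channel count.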
3.2 Exercise 2
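# 208,208,64 -> 104,104,128
# This block takes the pooled output of block 1 (f1) as its input
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(f1)
y = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
# Pool the output of the second convolution (y), not the first (x)
z = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(y)
f2 = z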
Note: in Keras, the default stride of Conv2D is 1; see reference 1 for the parameters.
Substituting into the formula:
First convolution:
$$XoutputH = \frac{208 - 3 + 2}{1} + 1 = 208$$
Second convolution:
$$YoutputH = \frac{208 - 3 + 2}{1} + 1 = 208$$
Pooling:
$$ZoutputH = \frac{208 - 2}{2} + 1 = 104$$
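So far, the feature map has gone from 208 * 208 * 64 ⇒ 104 * 104 * 128 (128 is the number of convolution kernels, i.e. the first argument of Conv2D; see reference 2 for details).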
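3.3 Summary
The examples above show that this VGG structure (3 * 3 kernels with padding=same; 2 * 2 pooling with strides=(2,2)) halves the height and width of the feature map in each block. The number of output channels is determined by the number of convolution kernels!!!!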
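The halving pattern can be checked by chaining the block arithmetic. The sketch below is illustrative only: vgg_block_output and its round_up flag are hypothetical, and running five blocks in a row is an assumption, since only the first two blocks are shown above.

```python
def vgg_block_output(input_h, num_convs=2, round_up=False):
    """Height after one block: num_convs 3x3 'same' convolutions, then 2x2 pooling with stride 2."""
    h = input_h
    for _ in range(num_convs):
        # 3x3 kernel, P = 1, S = 1: the size is unchanged.
        h = (h - 3 + 2 * 1) // 1 + 1
    excess = h - 2
    # round_up selects ceiling division for pooling, otherwise floor division.
    quotient = -(-excess // 2) if round_up else excess // 2
    return quotient + 1


h = 416
sizes = [h]
for _ in range(5):  # assumed number of blocks
    h = vgg_block_output(h)
    sizes.append(h)
print(sizes)  # [416, 208, 104, 52, 26, 13]

# For an odd input the rounding direction of the pooling layer matters.
print(vgg_block_output(13), vgg_block_output(13, round_up=True))  # 6 7
```

Even sizes are halved exactly, so the rounding rule only becomes visible once an odd size such as 13 is reached.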
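Final note:
When the division is not exact, convolution rounds down and pooling rounds up (pooling behaviour varies by framework, so check the defaults of the one you use).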
References
[1] https://blog.csdn.net/econe_wei/article/details/94649003
[2] https://blog.csdn.net/weixin_39190382/article/details/105692853
[3] https://blog.csdn.net/weixin_38705903/article/details/89073938?depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-1&utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-1
[4] https://blog.csdn.net/bohuihuan8714/article/details/89894124?depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-3&utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-3
[5] https://blog.csdn.net/AugustMe/article/details/92096724?depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-5&utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-5
[6] https://keras-cn.readthedocs.io/en/latest/layers/pooling_layer/#maxpooling2d
[7] https://keras-cn.readthedocs.io/en/latest/layers/convolutional_layer/