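ResNet
I. Introduction to Residual Networks
Preface
ResNet (Residual Neural Network) was proposed by four Chinese researchers led by Kaiming He. The link to the ResNet paper is given below.
Paper link: http://xxx.itp.ac.cn/pdf/1512.03385.pdf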
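1. Idea
The output of an earlier layer skips over several layers and is fed directly into the input of a later layer. As a result, part of each later feature map is contributed linearly by one of the layers before it.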
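The skip connection described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the Keras code used later: `residual_layer` and `toy_branch` are hypothetical names, and `toy_branch` merely stands in for the stacked convolution layers.

```python
def residual_layer(x, f):
    """Return f(x) + x element by element (the skip connection)."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("f must preserve the length of x so the two can be added")
    return [a + b for a, b in zip(fx, x)]


def toy_branch(x):
    # Hypothetical stand-in for the skipped layers: scales every value by 0.5.
    return [0.5 * v for v in x]


x = [1.0, -2.0, 4.0]
y = residual_layer(x, toy_branch)
print(y)  # [1.5, -3.0, 6.0]

# If the branch outputs zeros, the layer passes its input through unchanged.
identity_out = residual_layer(x, lambda v: [0.0] * len(v))
print(identity_out)  # [1.0, -2.0, 4.0]
```

The input `x` appears in the output with a coefficient of 1, which is the linear contribution from the earlier layer.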
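2. Purpose
It addresses the problem that, as a network gets deeper, training becomes less efficient and accuracy can no longer be improved effectively.
3. Structure diagram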
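II. ResNet Network Structure
1. Basic modules
ResNet has two basic modules: the Identity Block and the Convolution Block.
1.1 Identity Block
When the input and output dimensions are the same, blocks can be chained and the shortcut is added directly to the output. This block is used to deepen the network.
1.2 Convolution Block
When the input and output dimensions differ, blocks cannot be chained directly. The Convolution Block is used instead: it changes the dimensions of the shortcut so that the two tensors match exactly, position by position, and can then be added. This also deepens the network.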
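1.3 Points to note
It is important to distinguish stages from layers. Each stage contains many layers, and a stage is a consecutive run of feature maps with the same spatial size: the number of channels may differ, but the height and width must be the same. For example, if a 224×224×3 image becomes a 224×224×5 feature map after a convolution, the two belong to the same stage, because the height and width are unchanged and only the number of channels has changed.
ResNet fuses feature maps within the same stage. The two basic modules, Identity Block and Convolution Block, cover the two cases that arise. Because the two feature maps are added, their spatial size and number of channels must both match. If they already match, they can be added directly; otherwise the number of channels must be changed first, and this is done with a 1×1 convolution. Note that in the code below, conv_block also applies its strides argument to the first 1×1 convolution and to the shortcut, so it can change the spatial size as well as the number of channels.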
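The rule for picking a block can be written as a small shape check. The sketch below is an illustration under assumptions: `choose_block` is a hypothetical helper, and shapes are given as (height, width, channels) tuples.

```python
def choose_block(input_shape, output_shape):
    """Pick the block type from (height, width, channels) shapes."""
    if len(input_shape) != 3 or len(output_shape) != 3:
        raise ValueError("shapes must be (height, width, channels)")
    if tuple(input_shape) == tuple(output_shape):
        # Shapes already match: the shortcut can be added directly.
        return "identity_block"
    # Shapes differ: the shortcut needs a 1x1 convolution before the addition.
    return "conv_block"


same = choose_block((56, 56, 256), (56, 56, 256))
wider = choose_block((56, 56, 64), (56, 56, 256))
print(same)   # identity_block
print(wider)  # conv_block
```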
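1.4 Introduction to 1×1 convolution
In a 1×1 convolution, the input has size H×W×D and each filter has size 1×1×D, so a single filter produces an output of size H×W×1. Stacking n such filters gives an output of size H×W×n, which changes the number of channels of the feature map, as the figure below illustrates.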
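A 1×1 convolution is the same as applying one small matrix multiplication at every pixel, which a NumPy sketch makes concrete. The function name `conv1x1` and the random test data are assumptions made for illustration; bias and activation are left out.

```python
import numpy as np


def conv1x1(feature_map, filters):
    """Apply n 1x1 filters to an (H, W, D) feature map.

    feature_map: array of shape (H, W, D)
    filters:     array of shape (D, n), one column per 1x1xD filter
    returns:     array of shape (H, W, n)
    """
    return np.einsum("hwd,dn->hwn", feature_map, filters)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 3))  # H=4, W=4, D=3
w = rng.standard_normal((3, 8))     # n=8 filters
y = conv1x1(x, w)
print(y.shape)  # (4, 4, 8)
```

The height and width are untouched; only the channel dimension goes from D to n.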
III. ResNet Code Walkthrough
1. Code
1.1 Overall architecture
# Assumption: KL is the Keras layers module (e.g. `import keras.layers as KL`) and
# BatchNorm is a batch-normalization layer class defined elsewhere (not shown here).
def resnet_graph(input_image, architecture, stage5=False):
    """Build the ResNet backbone and return the feature map at the end of each stage."""
    assert architecture in ["resnet50", "resnet101"]
    # Stage 1
    x = KL.ZeroPadding2D((3, 3))(input_image)  # zero-pad 3 pixels on every side
    x = KL.Conv2D(64, (7, 7), strides=(2, 2), name='conv1', use_bias=True)(x)  # 7x7 conv, stride 2: shrinks the image quickly
    x = BatchNorm(axis=3, name='bn_conv1')(x)
    x = KL.Activation('relu')(x)  # ReLU layer
    C1 = x = KL.MaxPooling2D((3, 3), strides=(2, 2), padding="same")(x)  # max pooling
    # Stage 2
    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    C2 = x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')
    # Stage 3
    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
    C3 = x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')
    # Stage 4
    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    # Number of identity blocks after the conv_block in stage 4
    block_count = {"resnet50": 5, "resnet101": 22}[architecture]
    for i in range(block_count):
        # chr(98 + i) yields the block letters 'b', 'c', 'd', ...
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block=chr(98 + i))
    C4 = x
    # Stage 5 (optional)
    if stage5:
        x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
        x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
        C5 = x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')
    else:
        C5 = None
    return [C1, C2, C3, C4, C5]
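To see what `resnet_graph` returns, it helps to trace the shape of each output for a given input size. The helper below is a sketch under assumptions: `trace_shapes` is a hypothetical name, the input is square, and the arithmetic uses the standard Keras output-size rules for "valid" padding (the Conv2D default) and "same" padding.

```python
def trace_shapes(size=224, stage5=False):
    """Return the (height, width, channels) of C1..C5 for a square input."""
    shapes = {}
    # Stage 1: pad 3 pixels per side, then a 7x7 conv with stride 2, no extra padding.
    size = (size + 2 * 3 - 7) // 2 + 1
    # 3x3 max pooling, stride 2, padding="same": output is ceil(size / 2).
    size = -(-size // 2)
    shapes["C1"] = (size, size, 64)
    # Stage 2: conv_block is called with strides=(1, 1), so the size is unchanged.
    shapes["C2"] = (size, size, 256)
    # Stages 3 to 5: conv_block uses its default strides=(2, 2) on a 1x1 conv.
    for name, channels in (("C3", 512), ("C4", 1024), ("C5", 2048)):
        if name == "C5" and not stage5:
            shapes[name] = None
            continue
        size = (size - 1) // 2 + 1
        shapes[name] = (size, size, channels)
    return shapes


shapes = trace_shapes(224, stage5=True)
for name, shape in shapes.items():
    print(name, shape)
# C1 (56, 56, 64)
# C2 (56, 56, 256)
# C3 (28, 28, 512)
# C4 (14, 14, 1024)
# C5 (7, 7, 2048)
```

Each stage from 3 onward halves the height and width while the number of channels doubles.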
1.2 Convolution block code
def conv_block(input_tensor, kernel_size, filters, stage, block,
               strides=(2, 2), use_bias=True):
    """Residual block whose shortcut passes through a 1x1 convolution."""
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    # Main path, 1x1 conv: applies the stride
    x = KL.Conv2D(nb_filter1, (1, 1), strides=strides,
                  name=conv_name_base + '2a', use_bias=use_bias)(input_tensor)
    x = BatchNorm(axis=3, name=bn_name_base + '2a')(x)
    x = KL.Activation('relu')(x)
    # Main path, kernel_size x kernel_size conv: 'same' padding keeps the spatial size
    x = KL.Conv2D(nb_filter2, (kernel_size, kernel_size), padding='same',
                  name=conv_name_base + '2b', use_bias=use_bias)(x)
    x = BatchNorm(axis=3, name=bn_name_base + '2b')(x)
    x = KL.Activation('relu')(x)
    # Main path, 1x1 conv: sets the output channel count to nb_filter3
    x = KL.Conv2D(nb_filter3, (1, 1), name=conv_name_base +
                  '2c', use_bias=use_bias)(x)
    x = BatchNorm(axis=3, name=bn_name_base + '2c')(x)
    # Shortcut path: 1x1 conv with the same stride and nb_filter3 channels,
    # so that its shape matches the main path
    shortcut = KL.Conv2D(nb_filter3, (1, 1), strides=strides,
                         name=conv_name_base + '1', use_bias=use_bias)(input_tensor)
    shortcut = BatchNorm(axis=3, name=bn_name_base + '1')(shortcut)
    # Add the two paths element-wise, then apply ReLU
    x = KL.Add()([x, shortcut])
    x = KL.Activation('relu', name='res' + str(stage) + block + '_out')(x)
    return x
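The layer names built from `stage` and `block` follow a fixed pattern, which is useful when looking up layers or loading weights by name. The sketch below reproduces only the string logic; `conv_block_layer_names` is a hypothetical helper, not part of the code above.

```python
def conv_block_layer_names(stage, block):
    """List the (Conv2D, BatchNorm) layer names that conv_block assigns."""
    conv_name_base = "res" + str(stage) + block + "_branch"
    bn_name_base = "bn" + str(stage) + block + "_branch"
    suffixes = ["2a", "2b", "2c", "1"]  # three main-path layers, then the shortcut
    return [(conv_name_base + s, bn_name_base + s) for s in suffixes]


names = conv_block_layer_names(3, "a")
print(names[0])   # ('res3a_branch2a', 'bn3a_branch2a')
print(names[-1])  # ('res3a_branch1', 'bn3a_branch1')

# Stage 4 identity blocks take their letters from chr(98 + i).
stage4_letters = [chr(98 + i) for i in range(5)]  # 5 identity blocks for resnet50
print(stage4_letters)  # ['b', 'c', 'd', 'e', 'f']
```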
1.3 Identity block code
def identity_block(input_tensor, kernel_size, filters, stage, block,
                   use_bias=True):
    """Residual block whose shortcut is the unchanged input tensor."""
    nb_filter1, nb_filter2, nb_filter3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
    # Main path, 1x1 conv (no stride, so the spatial size is kept)
    x = KL.Conv2D(nb_filter1, (1, 1), name=conv_name_base + '2a',
                  use_bias=use_bias)(input_tensor)
    x = BatchNorm(axis=3, name=bn_name_base + '2a')(x)
    x = KL.Activation('relu')(x)
    # Main path, kernel_size x kernel_size conv with 'same' padding
    x = KL.Conv2D(nb_filter2, (kernel_size, kernel_size), padding='same',
                  name=conv_name_base + '2b', use_bias=use_bias)(x)
    x = BatchNorm(axis=3, name=bn_name_base + '2b')(x)
    x = KL.Activation('relu')(x)
    # Main path, 1x1 conv: nb_filter3 must equal the input channel count
    x = KL.Conv2D(nb_filter3, (1, 1), name=conv_name_base + '2c',
                  use_bias=use_bias)(x)
    x = BatchNorm(axis=3, name=bn_name_base + '2c')(x)
    # Add the input directly (no convolution on the shortcut), then apply ReLU
    x = KL.Add()([x, input_tensor])
    x = KL.Activation('relu', name='res' + str(stage) + block + '_out')(x)
    return x
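Because `identity_block` adds its input directly to the main-path output, the last entry of `filters` has to equal the number of input channels. The sketch below walks the stage configuration used in `resnet_graph` and checks this. `STAGES` and `check_shortcuts` are hypothetical names, and the stage 4 count is the ResNet50 value.

```python
STAGES = [
    # (stage, filters, number of identity blocks after the conv_block)
    (2, [64, 64, 256], 2),
    (3, [128, 128, 512], 3),
    (4, [256, 256, 1024], 5),  # 5 for resnet50, 22 for resnet101
    (5, [512, 512, 2048], 2),
]


def check_shortcuts(input_channels=64):
    """Walk the stages and confirm every identity shortcut can be added."""
    channels = input_channels  # channels after Stage 1 (conv1 has 64 filters)
    report = []
    for stage, filters, n_identity in STAGES:
        # conv_block: the 1x1 shortcut conv projects `channels` to filters[2].
        report.append((stage, "conv_block", channels, filters[2]))
        channels = filters[2]
        for _ in range(n_identity):
            # identity_block: Add() needs input channels == filters[2].
            if channels != filters[2]:
                raise ValueError("stage %d: cannot add %d and %d channels"
                                 % (stage, channels, filters[2]))
            report.append((stage, "identity_block", channels, filters[2]))
    return report


report = check_shortcuts()
print(len(report))  # 16
print(report[0])    # (2, 'conv_block', 64, 256)
print(report[-1])   # (5, 'identity_block', 2048, 2048)
```

Every stage opens with a `conv_block` precisely because that is where the channel count changes; the identity blocks that follow then see matching shapes.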
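2. Discussion
ResNet50 and ResNet101 are the most commonly used variants; the number in the name is the layer count. The network is a plain stack of convolution, activation, pooling, and fully connected layers, with the Identity Block and Convolution Block modules introduced along the way. The code is fairly easy to follow.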
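The layer counts in the names can be reproduced from the block structure in `resnet_graph`. The sketch below follows the usual counting convention, which is an assumption not spelled out in the text above: one layer for `conv1`, three main-path convolutions per block, and one final fully connected classification layer (which `resnet_graph` itself does not build). Shortcut convolutions are not counted. The names `IDENTITY_BLOCKS_STAGE4` and `count_weighted_layers` are hypothetical.

```python
# Identity blocks in stage 4, as in the block_count dictionary of resnet_graph.
IDENTITY_BLOCKS_STAGE4 = {"resnet50": 5, "resnet101": 22}


def count_weighted_layers(architecture):
    """Count conv1 + main-path convs in all blocks + the final FC layer."""
    # Blocks per stage (stages 2 to 5): one conv_block plus the identity blocks.
    blocks = [3, 4, 1 + IDENTITY_BLOCKS_STAGE4[architecture], 3]
    convs_in_blocks = 3 * sum(blocks)  # three main-path convolutions per block
    return 1 + convs_in_blocks + 1     # conv1 + blocks + fully connected layer


n50 = count_weighted_layers("resnet50")
n101 = count_weighted_layers("resnet101")
print(n50)   # 50
print(n101)  # 101
```

The two variants differ only in stage 4, where ResNet101 has 17 more identity blocks, giving 51 more layers.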