ResNet Network Structure and a Walkthrough of the Main Code

Latest recommended article published on 2024-05-10 19:16:49

fxfviolet

Tags: Deep Learning, TensorFlow, ResNet

Article link: https://blog.csdn.net/fxfviolet/article/details/81557329

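Having studied the ResNet convolutional neural network, I summarize here my understanding of its structure and main code.

ResNet (Residual Neural Network) uses residual learning units (Residual Units) to train a neural network 152 layers deep, and achieved a top-5 error rate of 3.57% in the ILSVRC 2015 competition. What sets ResNet apart from other convolutional neural networks is its residual structure: the original input can be passed directly to later layers. In other words, a portion of the information from an earlier layer skips the matrix multiplication and nonlinear transformation and goes straight to the next layer. This approach addresses the difficulty of training extremely deep networks.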

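1 The ResNet Residual Learning Module

The residual learning module of ResNet is shown in the figure below. The idea is as follows: suppose the input to a segment of the network is x and the desired output is H(x). If the input x is passed directly to the output as an initial result, the target left to learn becomes F(x) = H(x) - x. This is a ResNet residual learning unit. ResNet effectively changes the learning target: instead of learning the complete output H(x), the layers learn only the difference between output and input, H(x) - x. Traditional convolutional or fully connected layers lose or degrade some information as they pass it along. ResNet mitigates this to some extent: by routing the input around the layers directly to the output, it preserves the information, and the network only needs to learn the part where input and output differ, which simplifies the learning target and reduces its difficulty.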
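
The relationship H(x) = F(x) + x can be illustrated with a minimal pure-Python sketch. The names residual_unit, zero_residual, and small_residual are hypothetical and do not appear in the original code; a real unit applies convolutions to tensors rather than arithmetic to lists of floats. The sketch only shows that when the residual branch outputs zero, the unit reduces to an identity mapping.

```python
def residual_unit(x, residual_fn):
    """Return H(x) = F(x) + x for a list of floats x (illustrative only)."""
    fx = residual_fn(x)                       # F(x): the part the layers must learn
    return [f + xi for f, xi in zip(fx, x)]   # the shortcut adds the input back


def zero_residual(x):
    """A residual branch that has learned nothing: F(x) = 0."""
    return [0.0 for _ in x]


def small_residual(x):
    """A residual branch that learns a small correction: F(x) = 0.1 * x."""
    return [0.1 * xi for xi in x]


x = [1.0, -2.0, 3.0]
identity_out = residual_unit(x, zero_residual)     # equals x: an identity mapping
corrected_out = residual_unit(x, small_residual)   # x plus a small correction
```

With a zero residual the input passes through unchanged, which is why adding more such units should not, in principle, make the network harder to fit than a shallower one.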

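The core of the ResNet code is the residual learning module function bottleneck, shown below.

1) Among the parameters of bottleneck, inputs is the input, depth is the number of output channels, depth_bottleneck is the number of output channels of the first and second layers of the residual branch, and stride is the stride.

2) First, slim.utils.last_dimension retrieves the last dimension of the input, i.e. the number of input channels. Next, slim.batch_norm applies Batch Normalization to the input, with ReLU as a pre-activation. Then the shortcut is defined, i.e. the side branch that carries the input x. If the number of input channels depth_in equals the number of output channels depth, subsample downsamples inputs spatially with the given stride. If they differ, a [1x1] convolution with the given stride changes the channel count so that the output has depth channels. Then the residual is defined. It has 3 layers: a [1x1] convolution, then a [3x3] convolution, both with depth_bottleneck output channels, and finally a [1x1] convolution with depth output channels, which gives the final residual.

3) Finally, shortcut (the input x) and residual (F(x)) are added to produce output (H(x)). slim.utils.collect_named_outputs adds the result to the collection and returns output as the function result.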

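import tensorflow as tf  # TensorFlow 1.x, which still ships tf.contrib

slim = tf.contrib.slim

# subsample and conv2d_same are helper functions assumed to be defined elsewhere in the same file.
# The decorator is required so that slim.arg_scope in resnet_v2 can set outputs_collections.
@slim.add_arg_scope
def bottleneck(inputs, depth, depth_bottleneck, stride, outputs_collections = None, scope = None):
    with tf.variable_scope(scope, 'bottleneck_v2', [inputs]) as sc:
        # Number of input channels (last dimension of a rank-4 tensor).
        depth_in = slim.utils.last_dimension(inputs.get_shape(), min_rank = 4)
        # Pre-activation: Batch Normalization followed by ReLU.
        preact = slim.batch_norm(inputs, activation_fn = tf.nn.relu, scope = 'preact')

        # Shortcut branch: subsample if the channel counts match, otherwise a 1x1 convolution.
        if depth == depth_in:
            shortcut = subsample(inputs, stride, 'shortcut')
        else:
            shortcut = slim.conv2d(preact, depth, [1, 1], stride = stride, normalizer_fn = None, activation_fn = None, scope = 'shortcut')

        # Residual branch: 1x1 -> 3x3 (carries the stride) -> 1x1.
        residual = slim.conv2d(preact, depth_bottleneck, [1, 1], stride = 1, scope = 'conv1')
        residual = conv2d_same(residual, depth_bottleneck, 3, stride, scope = 'conv2')
        residual = slim.conv2d(residual, depth, [1, 1], stride = 1, normalizer_fn = None, activation_fn = None, scope = 'conv3')

        # H(x) = x + F(x)
        output = shortcut + residual

        return slim.utils.collect_named_outputs(outputs_collections, sc.name, output)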
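
To make the channel and stride bookkeeping of a bottleneck unit concrete, here is a rough shape-only sketch in plain Python. It does not build any TensorFlow ops. The helper names same_out_size and bottleneck_shapes are hypothetical, and the sketch assumes that both subsample and conv2d_same produce a spatial size of ceil(size / stride); the 56x56 input sizes are also assumed for illustration.

```python
def same_out_size(size, stride):
    """Spatial size after a 'SAME'-style op with the given stride: ceil(size / stride)."""
    return (size + stride - 1) // stride


def bottleneck_shapes(in_shape, depth, depth_bottleneck, stride):
    """Trace (height, width, channels) through one bottleneck unit.

    Assumption: subsample and conv2d_same both yield ceil(size / stride).
    """
    h, w, depth_in = in_shape
    out_h, out_w = same_out_size(h, stride), same_out_size(w, stride)

    # Shortcut branch: subsampling if channel counts already match, else a 1x1 convolution.
    shortcut_kind = 'subsample' if depth == depth_in else 'conv1x1'
    shortcut = (out_h, out_w, depth)

    # Residual branch: 1x1 (stride 1) -> 3x3 (carries the stride) -> 1x1 (stride 1).
    conv1 = (h, w, depth_bottleneck)
    conv2 = (out_h, out_w, depth_bottleneck)
    conv3 = (out_h, out_w, depth)

    assert shortcut == conv3, "both branches must match before the addition"
    return {'shortcut_kind': shortcut_kind, 'conv1': conv1, 'conv2': conv2, 'output': conv3}


# A unit with 64 input channels: the shortcut needs a 1x1 convolution to reach 256 channels.
first_unit = bottleneck_shapes((56, 56, 64), depth=256, depth_bottleneck=64, stride=1)
# A unit whose channels already match: stride 2 halves the spatial size on both branches.
last_unit = bottleneck_shapes((56, 56, 256), depth=256, depth_bottleneck=64, stride=2)
```

The point of the sketch is that the two branches must agree in height, width, and channel count, otherwise the final addition would fail.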

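2 Configuration of the 152-Layer ResNet

The 152-layer ResNet has 4 residual learning Blocks, containing 3, 8, 36, and 3 units respectively, for a total of (3+8+36+3) x 3 + 2 = 152 layers. The code for the blocks list built from these Blocks is shown below.

Take the first Block as an example. 'block1' is the name (or scope) of the Block, bottleneck is the residual learning unit described above, and [(256, 64, 1)] * 2 + [(256, 64, 2)] is the Block's argument list, in which each element corresponds to one bottleneck residual learning unit: the first two elements are (256, 64, 1) and the last is (256, 64, 2). Each element is a triple (depth, depth_bottleneck, stride). For example, the element (256, 64, 2) builds a bottleneck unit with three convolutional layers: the first and second layers have 64 output channels (depth_bottleneck), the first layer has stride 1, the second layer has stride 2 (stride), and the third layer has 256 output channels (depth) with stride 1. This Block therefore contains 3 bottleneck residual learning units. The other three Blocks contain 8, 36, and 3 residual learning units respectively.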

# Block is assumed to be a named tuple (scope, unit_fn, args) defined elsewhere in the same file.
def resnet_v2_152(inputs, num_classes = None, global_pool = True, reuse = None, scope = 'resnet_v2_152'):
    # Each tuple is (depth, depth_bottleneck, stride) for one bottleneck unit.
    blocks = [
        Block('block1', bottleneck, [(256, 64, 1)] * 2 + [(256, 64, 2)]),
        Block('block2', bottleneck, [(512, 128, 1)] * 7 + [(512, 128, 2)]),
        Block('block3', bottleneck, [(1024, 256, 1)] * 35 + [(1024, 256, 2)]),
        Block('block4', bottleneck, [(2048, 512, 1)] * 3)]
    return resnet_v2(inputs, blocks, num_classes, global_pool, include_root_block = True, reuse = reuse, scope = scope)
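
The layer count can be checked with a small standalone sketch. The Block named tuple here is a hypothetical stand-in with the fields (scope, unit_fn, args), and placeholder_unit is a dummy that is never called; neither is taken from the original file.

```python
import collections

# Hypothetical stand-in for the Block type: a scope name, a unit function,
# and a list of (depth, depth_bottleneck, stride) tuples.
Block = collections.namedtuple('Block', ['scope', 'unit_fn', 'args'])


def placeholder_unit(*args, **kwargs):
    """Stand-in for bottleneck; never called in this sketch."""
    raise NotImplementedError


blocks = [
    Block('block1', placeholder_unit, [(256, 64, 1)] * 2 + [(256, 64, 2)]),
    Block('block2', placeholder_unit, [(512, 128, 1)] * 7 + [(512, 128, 2)]),
    Block('block3', placeholder_unit, [(1024, 256, 1)] * 35 + [(1024, 256, 2)]),
    Block('block4', placeholder_unit, [(2048, 512, 1)] * 3),
]

units_per_block = [len(block.args) for block in blocks]

# Each bottleneck unit has 3 convolutional layers; the remaining 2 layers are
# the root 7x7 convolution and the final logits layer.
total_layers = sum(units_per_block) * 3 + 2
```

Note that the list multiplication [(256, 64, 1)] * 2 repeats the same tuple, which is safe here because tuples are immutable.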

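3 Stacking the Residual Learning Modules

The function stack_blocks_dense stacks the Blocks. Among its parameters, net is the input and blocks is the list of 4 residual learning Blocks defined above. Two nested loops stack the network Block by Block and unit by unit. The outer loop opens a tf.variable_scope for each Block, and the inner loop opens one for each unit, so that residual learning units are named in the form block1/unit_1. In the inner loop, each entry of block.args is unpacked into (depth, depth_bottleneck, stride), and unit_fn is called to create and connect the residual learning units in order. Once all units in all Blocks have been stacked, net is returned as the output.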

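# The decorator is required so that slim.arg_scope in resnet_v2 can set outputs_collections.
@slim.add_arg_scope
def stack_blocks_dense(net, blocks, outputs_collections = None):
    for block in blocks:
        with tf.variable_scope(block.scope, 'block', [net]) as sc:
            for i, unit in enumerate(block.args):
                with tf.variable_scope('unit_%d' % (i + 1), values = [net]):
                    unit_depth, unit_depth_bottleneck, unit_stride = unit
                    # The keyword must match the bottleneck parameter name depth_bottleneck.
                    net = block.unit_fn(net, depth = unit_depth,
                                        depth_bottleneck = unit_depth_bottleneck,
                                        stride = unit_stride)
            # Collect the output once per Block, under the Block's scope name.
            net = slim.utils.collect_named_outputs(outputs_collections, sc.name, net)
    return net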
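
The two nested loops can be mimicked without TensorFlow to see which scope names are generated and how the spatial size and channel depth evolve. This is only a sketch: trace_stack and same_out_size are hypothetical names, each unit is assumed to reduce the spatial size to ceil(size / stride), and the starting size of 56 assumes a 224x224 image that the root block has already reduced by a factor of 4.

```python
def same_out_size(size, stride):
    """Spatial size after a strided 'SAME'-style op: ceil(size / stride)."""
    return (size + stride - 1) // stride


def trace_stack(size, blocks):
    """Mimic the two nested loops: return (scope_name, spatial_size, depth) per unit."""
    trace = []
    for block_scope, unit_args in blocks:                       # outer loop: Blocks
        for i, (depth, depth_bottleneck, stride) in enumerate(unit_args):  # inner loop: units
            size = same_out_size(size, stride)
            trace.append(('%s/unit_%d' % (block_scope, i + 1), size, depth))
    return trace


blocks = [
    ('block1', [(256, 64, 1)] * 2 + [(256, 64, 2)]),
    ('block2', [(512, 128, 1)] * 7 + [(512, 128, 2)]),
    ('block3', [(1024, 256, 1)] * 35 + [(1024, 256, 2)]),
    ('block4', [(2048, 512, 1)] * 3),
]

# Assumed starting point: 56x56 feature maps coming out of the root block.
trace = trace_stack(56, blocks)
```

Under these assumptions the last unit of each of the first three Blocks halves the spatial size, so the final feature map is 7x7 with 2048 channels.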

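4 The ResNet Main Function

The main function builds the complete ResNet from the residual learning Blocks defined above. Among its parameters, inputs is the input, blocks is the list of Block objects, num_classes is the number of output classes, and global_pool indicates whether to add global average pooling at the end. If include_root_block is True, the function first creates a [7x7] convolution with 64 output channels and stride 2, followed by a [3x3] max pooling with stride 2; at this point the height and width of the image are each reduced to 1/4 of the input. It then stacks the residual learning units with stack_blocks_dense, applies a final Batch Normalization with ReLU, and uses tf.reduce_mean to implement global average pooling. Depending on the number of classes, it adds a [1x1] convolution with num_classes output channels. Finally, Softmax produces the classification result.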

def resnet_v2(inputs, blocks, num_classes = None, global_pool = True, include_root_block = True, reuse = None, scope = None):
    with tf.variable_scope(scope, 'resnet_v2', [inputs], reuse = reuse) as sc:
        end_points_collection = sc.original_name_scope + '_end_points'

        # Route the outputs of every conv2d, bottleneck unit, and Block into one collection.
        with slim.arg_scope([slim.conv2d, bottleneck, stack_blocks_dense], outputs_collections = end_points_collection):
            net = inputs
            if include_root_block:
                # Root block: 7x7 convolution, 64 channels, stride 2, no normalization or activation.
                with slim.arg_scope([slim.conv2d], activation_fn = None, normalizer_fn = None):
                    net = conv2d_same(net, 64, 7, stride = 2, scope = 'conv1')
                net = slim.max_pool2d(net, [3, 3], stride = 2, scope = 'pool1')
            net = stack_blocks_dense(net, blocks)
            net = slim.batch_norm(net, activation_fn = tf.nn.relu, scope = 'postnorm')

            if global_pool:
                # Global average pooling over height and width.
                net = tf.reduce_mean(net, [1, 2], name = 'pool5', keep_dims = True)

            if num_classes is not None:
                net = slim.conv2d(net, num_classes, [1, 1], activation_fn = None, normalizer_fn = None, scope = 'logits')

            # Build end_points unconditionally so the return works even when num_classes is None.
            end_points = slim.utils.convert_collection_to_dict(end_points_collection)

            if num_classes is not None:
                end_points['predictions'] = slim.softmax(net, scope = 'predictions')

            return net, end_points
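
The last two steps, global average pooling and Softmax, can be sketched in plain Python on a toy feature map. The function names global_average_pool and softmax below are hypothetical and operate on nested lists rather than tensors. As a simplifying assumption, the sketch treats the pooled channels directly as class scores and omits the [1x1] logits convolution that the real network places between the two steps.

```python
import math


def global_average_pool(feature_map):
    """Average a [height][width][channels] nested list over height and width."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)                              # subtract the max to avoid overflow
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2x2 feature map with 3 channels.
feature_map = [
    [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]],
    [[2.0, 2.0, 2.0], [2.0, 2.0, 6.0]],
]
pooled = global_average_pool(feature_map)   # one value per channel
probabilities = softmax(pooled)             # non-negative values that sum to 1
```

Averaging over height and width collapses each channel to a single number, which is what tf.reduce_mean over axes [1, 2] does for a batch of feature maps.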

References:

1. 《TensorFlow实战》
