Implementing Google Inception Net in TensorFlow, and the Ideas Behind It

1. Google Inception Net first appeared in the ILSVRC 2014 competition, where it took first place by a clear margin. In that competition it was known as Inception V1. Its biggest strength is that it keeps both computation and parameter count under control while achieving excellent classification performance: a top-5 error rate of 6.67%, less than half of AlexNet's. Inception V1 is 22 layers deep, deeper than AlexNet's 8 layers and VGGNet's 19, yet it needs only about 1.5 billion floating-point operations and roughly 5 million parameters (about 1/12 of AlexNet's 60 million), while reaching accuracy far beyond AlexNet's. An impressive result.

2. Besides the deeper model and stronger expressive power, Inception V1 achieves good results with few parameters for two more reasons:

(1) It removes the final fully connected layer and replaces it with a global average pooling layer (which reduces the feature map to 1x1). Fully connected layers account for roughly 90% of the parameters in AlexNet and VGGNet and are prone to overfitting; removing them makes the model train faster and overfit less (a minimal sketch follows this list).

(2) The carefully designed Inception Module in Inception V1 improves parameter efficiency; its structure is shown in Figure 1. It borrows from the ideas of Network in Network (NIN), but goes a step further than NIN by adding branch networks.
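To make point (1) concrete, here is a minimal sketch (not from the original post) comparing the parameter cost of a fully connected head with a global-average-pooling head; the 8 x 8 x 2048 feature-map size matches what Inception V3 produces, and the scope names are made up:

import tensorflow as tf

slim = tf.contrib.slim

features = tf.random_uniform((32, 8, 8, 2048))  # a hypothetical final feature map

# fully connected head: flatten, then dense -> 8*8*2048*1000 ≈ 131M weights
fc_logits = slim.fully_connected(slim.flatten(features), 1000,
                                 activation_fn=None, scope='fc_head')

# global-average-pooling head: average each 8x8 map down to 1x1 (zero
# parameters), then a 1x1 conv as the classifier -> 2048*1000 ≈ 2M weights
gap = slim.avg_pool2d(features, [8, 8], scope='gap')  # output 1 x 1 x 2048
gap_logits = slim.conv2d(gap, 1000, [1, 1], activation_fn=None,
                         normalizer_fn=None, scope='gap_head')
gap_logits = tf.squeeze(gap_logits, [1, 2])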

The basic Inception Module has 4 branches. The first branch is a 1x1 convolution, an important structure first proposed in NIN. The 1x1 convolution is an excellent building block: it organizes information across channels, improves the network's expressive power, and can reduce or increase the number of output channels. As Figure 1 shows, all 4 branches use 1x1 convolutions to keep the cost down. The second branch applies a 1x1 convolution and then a 3x3 convolution, i.e., two successive feature transformations; the third branch applies a 1x1 and then a 5x5 convolution; the last branch applies a 3x3 max pooling followed directly by a 1x1 convolution. The outputs of the four branches are merged at the end by an aggregation (concatenation) operation. An Inception Module thus contains convolution kernels of 3 different sizes plus one max pooling, which makes the network adaptable to different scales.

Figure 1: Structure of the Inception Module
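Since the figure itself is not reproduced here, below is a minimal TF-slim sketch of the four-branch structure (the channel counts are illustrative, not the exact GoogLeNet values):

import tensorflow as tf

slim = tf.contrib.slim

def inception_module_sketch(net):
  with slim.arg_scope([slim.conv2d, slim.max_pool2d], stride=1, padding='SAME'):
    with tf.variable_scope('Branch_0'):             # branch 1: plain 1x1 conv
      branch_0 = slim.conv2d(net, 64, [1, 1])
    with tf.variable_scope('Branch_1'):             # branch 2: 1x1 reduction, then 3x3
      branch_1 = slim.conv2d(net, 48, [1, 1])
      branch_1 = slim.conv2d(branch_1, 64, [3, 3])
    with tf.variable_scope('Branch_2'):             # branch 3: 1x1 reduction, then 5x5
      branch_2 = slim.conv2d(net, 48, [1, 1])
      branch_2 = slim.conv2d(branch_2, 64, [5, 5])
    with tf.variable_scope('Branch_3'):             # branch 4: 3x3 max pool, then 1x1
      branch_3 = slim.max_pool2d(net, [3, 3])
      branch_3 = slim.conv2d(branch_3, 32, [1, 1])
    # aggregate: concatenate along the channel axis, 64+64+64+32 = 224 channels
    return tf.concat([branch_0, branch_1, branch_2, branch_3], 3)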

The main goal of Inception Net is to find an optimal sparse structural unit. Its design rests on the Hebbian principle: neurons that fire together wire together. As Figure 2 shows, the conclusion that highly correlated nodes should be connected together is essentially the Hebbian principle restated from a neural-network perspective. A 1x1 convolution naturally connects these highly correlated features that sit at the same spatial position but in different channels, which is why 1x1 convolutions are used so frequently throughout Inception Net.

Figure 2: Connecting highly correlated nodes to form a sparse network

The full network stacks many Inception Modules, and we want the later modules to capture higher-order, more abstract features. So the spatial concentration of the convolutions in later Inception Modules should gradually decrease, letting them capture features over larger areas; accordingly, 3x3 and 5x5 kernels take up a larger share in the later modules. Auxiliary classifier nodes are also attached at intermediate layers, which both adds extra back-propagated gradient signal to the network and provides additional regularization (a schematic sketch of how their loss is used follows below).
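How an auxiliary classifier contributes during training can be sketched schematically (this snippet is not part of the original post; the 0.3 weight is the value used in the GoogLeNet paper, and the stand-in tensors are hypothetical):

import tensorflow as tf

num_classes = 1000
labels = tf.one_hot(tf.zeros([32], dtype=tf.int32), num_classes)  # dummy labels
main_logits = tf.random_uniform((32, num_classes))  # stand-in for the main head's output
aux_logits = tf.random_uniform((32, num_classes))   # stand-in for the auxiliary head's output

main_loss = tf.losses.softmax_cross_entropy(labels, main_logits)
aux_loss = tf.losses.softmax_cross_entropy(labels, aux_logits)
# the down-weighted auxiliary loss injects extra gradient into the middle
# of the network and acts as a regularizer
total_loss = main_loss + 0.3 * aux_loss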

3. Beyond the Inception V1 described above, Google Inception Net is a whole family, which also includes:

Inception V2: top-5 error rate 4.8%;

Inception V3: top-5 error rate 3.5%;

Inception V4: top-5 error rate 3.08%.

Inception V2 learned from VGGNet, replacing the large 5x5 convolution with two 3x3 convolutions (a quick arithmetic check follows below), and it introduced the now-famous Batch Normalization (BN) method. BN is a very effective regularization method that can speed up the training of large convolutional networks many times over, while also substantially improving classification accuracy after convergence. There are other smaller adjustments as well, which we won't go into here.
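A quick arithmetic check of the saving, assuming C input and C output channels (C = 192 is just an example):

# two stacked 3x3 convolutions cover the same 5x5 receptive field as one
# 5x5 convolution, with 2*(3*3) = 18 instead of 5*5 = 25 weights per
# channel pair (~28% fewer), plus an extra ReLU in between
C = 192
params_5x5 = 5 * 5 * C * C            # one 5x5 convolution
params_two_3x3 = 2 * (3 * 3 * C * C)  # two stacked 3x3 convolutions
print(params_5x5, params_two_3x3)     # 921600 vs 663552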

Inception V3 makes two main changes. First, it introduces the idea of Factorization into small convolutions: a larger 2-D convolution is split into two smaller 1-D convolutions, for example 7x7 into 1x7 and 7x1, or 3x3 into 1x3 and 3x1, as shown in Figure 3. This saves a large number of parameters, speeds up computation, reduces overfitting, and adds one extra layer of non-linearity, expanding the model's expressive power. Second, Inception V3 refines the structure of the Inception Module: there are now three variants, for 35x35, 17x17, and 8x8 feature maps, and Inception V3 even uses branches inside branches, as shown in Figure 4. Inception V4 mainly combines Inception with Microsoft's ResNet, which will be covered later.

Figure 3: Splitting a 3x3 convolution into a 1x3 convolution and a 3x1 convolution

Figure 4: The three Inception Module structures in Inception V3
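As a minimal TF-slim sketch of this factorization (the 192-channel count and scope names are illustrative, not from the paper):

import tensorflow as tf

slim = tf.contrib.slim

x = tf.random_uniform((1, 17, 17, 192))  # a hypothetical 17x17 feature map

# direct 7x7 convolution: 7*7*192*192 ≈ 1.81M weights
y_full = slim.conv2d(x, 192, [7, 7], scope='conv_7x7')

# factorized version: (1*7 + 7*1)*192*192 ≈ 0.52M weights, the same receptive
# field, plus an extra non-linearity between the two convolutions
y_fact = slim.conv2d(x, 192, [1, 7], scope='conv_1x7')
y_fact = slim.conv2d(y_fact, 192, [7, 1], scope='conv_7x1')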

4. Next we implement Inception V3; its overall network structure is shown in Table 6-1.

An important principle in designing Inception Net is that the image size should shrink steadily while the feature information grows more abstract: each Inception module group simplifies the spatial structure and converts spatial information into higher-order abstract feature information, i.e., it turns spatial dimensions into channel dimensions, which also reduces computation. An Inception Module combines a fairly simple feature abstraction (branch 1), two more complex feature abstractions (branches 2 and 3), and a structure-simplifying pooling layer (branch 4); with these four different degrees of feature abstraction and transformation it selectively preserves higher-order features at different levels, enriching the network's expressive power as much as possible.

from datetime import datetime
import math 
import time
import tensorflow as tf

slim = tf.contrib.slim
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)
# tf.truncated_normal_initializer produces a truncated normal distribution
######## Define a function that supplies default arguments for ops used throughout the network ########
def inception_v3_arg_scope(weight_decay=0.00004,  
                           stddev=0.1, 
                           batch_norm_var_collection='moving_vars'):
  batch_norm_params = {
      'decay': 0.9997,  
      'epsilon': 0.001,
      'updates_collections': tf.GraphKeys.UPDATE_OPS,
      'variables_collections': {
          'beta': None,
          'gamma': None,
          'moving_mean': [batch_norm_var_collection],
          'moving_variance': [batch_norm_var_collection],
      }
  }
# Automatically give certain default values to these functions' arguments
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      weights_regularizer=slim.l2_regularizer(weight_decay)):
    with slim.arg_scope(
        [slim.conv2d],
        weights_initializer=trunc_normal(stddev), 
        activation_fn=tf.nn.relu, 
        normalizer_fn=slim.batch_norm, 
        normalizer_params=batch_norm_params) as sc: 
      return sc 
######## Define the function that builds the convolutional part of Inception V3 ########
def inception_v3_base(inputs, scope=None):
    
  end_points = {} 
  with tf.variable_scope(scope, 'InceptionV3', [inputs]):
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], 
                        stride=1, padding='VALID'):
      # Now define the Inception V3 network proper. First come the plain (non-Inception-Module) convolution layers; input image size is 299 x 299 x 3
      net = slim.conv2d(inputs, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3') # output 149 x 149 x 32
      net = slim.conv2d(net, 32, [3, 3], scope='Conv2d_2a_3x3') # output 147 x 147 x 32
      net = slim.conv2d(net, 64, [3, 3], padding='SAME', scope='Conv2d_2b_3x3') # output 147 x 147 x 64
      net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_3a_3x3') # output 73 x 73 x 64
      net = slim.conv2d(net, 80, [1, 1], scope='Conv2d_3b_1x1') # output 73 x 73 x 80
      net = slim.conv2d(net, 192, [3, 3], scope='Conv2d_4a_3x3') # output 71 x 71 x 192
      net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_5a_3x3') # output 35 x 35 x 192
# The 5 convolution layers and 2 pooling layers above compress the spatial size of the image and abstract its features
# Next come three consecutive groups of Inception modules
      # The first module group
    with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], 
                        stride=1, padding='SAME'): 
        # First Inception Module of group one: Mixed_5b, with 4 branches
        with tf.variable_scope('Mixed_5b'): 
            with tf.variable_scope('Branch_0'): 
                branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') # output 35*35*64
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1') # output 35*35*48
                branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5') # output 35*35*64
            with tf.variable_scope('Branch_2'): 
                branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') # output 35*35*64
                branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3') # output 35*35*96
                branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3') # output 35*35*96
            with tf.variable_scope('Branch_3'):  
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3') # output 35*35*192
                branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1') # output 35*35*32
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # merge the four branch outputs along the third (output-channel) dimension: 64+64+96+32 = 256 channels, output 35*35*256
        # Second Inception Module of group one: Mixed_5c, 4 branches; the only difference is that the 4th branch ends with 64 output channels
        with tf.variable_scope('Mixed_5c'):   
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') # output 35*35*64
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0b_1x1') # output 35*35*48
                branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv_1_0c_5x5') # output 35*35*64
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') # output 35*35*64
                branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3') # output 35*35*96
                branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3') # output 35*35*96
            with tf.variable_scope('Branch_3'):
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3') # output 35*35*192
                branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1') # output 35*35*64
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # merge the four branch outputs along the output-channel dimension: 64+64+96+64 = 288 channels, output 35*35*288
        # Third Inception Module of group one: Mixed_5d
        with tf.variable_scope('Mixed_5d'):
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
            with tf.variable_scope('Branch_3'):
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # merge the four branch outputs along the output-channel dimension: 64+64+96+64 = 288 channels, output 35*35*288
        # The second Inception module group
        # First Inception Module of group two: Mixed_6a, with 3 branches; input is 35*35*288
        with tf.variable_scope('Mixed_6a'): 
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 384, [3, 3], stride=2,
                                       padding='VALID', scope='Conv2d_1a_1x1') # output 17*17*384
            with tf.variable_scope('Branch_1'): 
                branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1') # output 35*35*64
                branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_0b_3x3') # output 35*35*96
                branch_1 = slim.conv2d(branch_1, 96, [3, 3], stride=2,
                                       padding='VALID', scope='Conv2d_1a_1x1') # output 17*17*96
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                           scope='MaxPool_1a_3x3')
            net = tf.concat([branch_0, branch_1, branch_2], 3) # output 17 x 17 x 768
        # Second Inception Module of group two: Mixed_6b, 4 branches
        with tf.variable_scope('Mixed_6b'):  
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1') # output 17*17*192
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1') # output 17*17*128
                branch_1 = slim.conv2d(branch_1, 128, [1, 7],
                                       scope='Conv2d_0b_1x7') # output 17*17*128
                branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1') # output 17*17*192
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1') # output 17*17*128
                branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0b_7x1')
                branch_2 = slim.conv2d(branch_2, 128, [1, 7], scope='Conv2d_0c_1x7')
                branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0d_7x1')
                branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
            with tf.variable_scope('Branch_3'):  
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3') # output 17*17*768
                branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1') # output 17*17*192
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
        # Third Inception Module of group two: Mixed_6c; the branch channel count rises to 160
        with tf.variable_scope('Mixed_6c'):
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
            with tf.variable_scope('Branch_3'):
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)       
        # Fourth Inception Module of group two: Mixed_6d
        with tf.variable_scope('Mixed_6d'):
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
            with tf.variable_scope('Branch_3'):
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
        # Fifth Inception Module of group two: Mixed_6e
        with tf.variable_scope('Mixed_6e'):
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0b_7x1')
                branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0c_1x7')
                branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0d_7x1')
                branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
            with tf.variable_scope('Branch_3'):
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            end_points['Mixed_6e'] = net # store Mixed_6e in end_points for the Auxiliary Classifier
        # The third module group contains 3 Inception Modules
        # First Inception Module of group three: Mixed_7a, with 3 branches
        with tf.variable_scope('Mixed_7a'):  
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1') # output 17*17*192
                branch_0 = slim.conv2d(branch_0, 320, [3, 3], stride=2,
                                       padding='VALID', scope='Conv2d_1a_3x3')
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                branch_1 = slim.conv2d(branch_1, 192, [3, 3], stride=2,
                                       padding='VALID', scope='Conv2d_1a_3x3') # output 8*8*192
            with tf.variable_scope('Branch_2'):  
                branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID',
                                           scope='MaxPool_1a_3x3') # output 8*8*768
            net = tf.concat([branch_0, branch_1, branch_2], 3)
        # Second Inception Module of group three: Mixed_7b; its biggest novelty is branches within branches, 4 branches in total
        with tf.variable_scope('Mixed_7b'): 
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1') # output 8*8*320
            with tf.variable_scope('Branch_1'): 
                branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1') # output 8*8*384
                branch_1 = tf.concat([
                    slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                    slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0b_3x1')], 3)    
            with tf.variable_scope('Branch_2'): 
                branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                branch_2 = slim.conv2d(
                    branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                branch_2 = tf.concat([
                    slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                    slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)
            with tf.variable_scope('Branch_3'):
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                branch_3 = slim.conv2d(
                    branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
        # Third Inception Module of group three: Mixed_7c
        with tf.variable_scope('Mixed_7c'):
            with tf.variable_scope('Branch_0'):
                branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
            with tf.variable_scope('Branch_1'):
                branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                branch_1 = tf.concat([
                    slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                    slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0c_3x1')], 3)
            with tf.variable_scope('Branch_2'):
                branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                branch_2 = slim.conv2d(
                    branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                branch_2 = tf.concat([
                    slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                    slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)
            with tf.variable_scope('Branch_3'):
                branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                branch_3 = slim.conv2d(
                    branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
            net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
        return net, end_points
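As a quick sanity check (not in the original post), we can build the base network on a dummy image, using the inception_v3_arg_scope and inception_v3_base defined above, and inspect the final feature map, which should come out as 8 x 8 x 2048 (320 + 384*2 + 384*2 + 192 = 2048 channels from Mixed_7c):

images = tf.random_uniform((1, 299, 299, 3))  # dummy input
with slim.arg_scope(inception_v3_arg_scope()):
    feats, _ = inception_v3_base(images)
print(feats.get_shape())  # expected (1, 8, 8, 2048)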


######## Global average pooling, Softmax, and Auxiliary Logits (the auxiliary classifier attached to Mixed_6e) ########
def inception_v3(inputs,
                 num_classes=1000, 
                 is_training=True, 
                 dropout_keep_prob=0.8,
                 prediction_fn=slim.softmax,
                 spatial_squeeze=True, 
                 reuse=None,
                 scope='InceptionV3'): 
  with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes], 
                         reuse=reuse) as scope:   
    with slim.arg_scope([slim.batch_norm, slim.dropout], 
                        is_training=is_training):
      net, end_points = inception_v3_base(inputs, scope=scope)
      with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                          stride=1, padding='SAME'): 
        aux_logits = end_points['Mixed_6e']
        with tf.variable_scope('AuxLogits'):
          aux_logits = slim.avg_pool2d(
              aux_logits, [5, 5], stride=3, padding='VALID',
              scope='AvgPool_1a_5x5')   
          aux_logits = slim.conv2d(aux_logits, 128, [1, 1], 
                                   scope='Conv2d_1b_1x1')
          aux_logits = slim.conv2d(
              aux_logits, 768, [5,5],
              weights_initializer=trunc_normal(0.01), 
              padding='VALID', scope='Conv2d_2a_5x5')      
          aux_logits = slim.conv2d(
              aux_logits, num_classes, [1, 1], activation_fn=None,
              normalizer_fn=None, weights_initializer=trunc_normal(0.001),
              scope='Conv2d_2b_1x1') 
          if spatial_squeeze: 
            aux_logits = tf.squeeze(aux_logits, [1, 2], name='SpatialSqueeze')
          end_points['AuxLogits'] = aux_logits
      # Handle the normal classification prediction logic
      with tf.variable_scope('Logits'):
        net = slim.avg_pool2d(net, [8, 8], padding='VALID',
                              scope='AvgPool_1a_8x8')
        net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
        end_points['PreLogits'] = net
        logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                             normalizer_fn=None, scope='Conv2d_1c_1x1')
        if spatial_squeeze:
          logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
      end_points['Logits'] = logits
      end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
  return logits, end_points
######## Measure the network's per-batch computation time ########
def time_tensorflow_run(session, target, info_string):
  num_steps_burn_in = 10
  total_duration = 0.0
  total_duration_squared = 0.0 
  for i in range(num_batches + num_steps_burn_in):
    start_time = time.time()
    _ = session.run(target) 
    duration = time.time() - start_time
    if i >= num_steps_burn_in:
      if not i % 10:
        print ('%s: step %d, duration = %.3f' %
             (datetime.now(), i - num_steps_burn_in, duration))
      total_duration += duration  
      total_duration_squared += duration * duration
  mn = total_duration / num_batches 
  vr = total_duration_squared / num_batches - mn * mn
  sd = math.sqrt(vr) 
  print ('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %
         (datetime.now(), info_string, num_batches, mn, sd))
# Benchmark forward-pass performance
batch_size = 32 
height, width = 299, 299 
inputs = tf.random_uniform((batch_size, height, width, 3))
with slim.arg_scope(inception_v3_arg_scope()):
  logits, end_points = inception_v3(inputs, is_training=False)
init = tf.global_variables_initializer() 
sess = tf.Session()
sess.run(init)
num_batches=100
time_tensorflow_run(sess, logits, "Forward")

Although the 299x299 input image is 78% larger (in pixel count) than VGGNet's 224x224, the forward pass is actually faster than VGGNet's. This is mainly thanks to the smaller parameter count: Inception V3 has only about 25 million parameters, still less than half of AlexNet's 60 million, and far fewer than VGGNet's roughly 140 million. The whole network performs about 5 billion floating-point operations, considerably more than Inception V1's 1.5 billion, but still modest compared with VGGNet. The moderate computational cost makes Inception V3 very practical: it can easily be deployed on ordinary servers to provide fast responses, or even ported to mobile phones for real-time image recognition.

2018-11-23 21:41:24.823122: step 0, duration = 3.020
2018-11-23 21:41:58.355705: step 10, duration = 3.119
2018-11-23 21:42:30.775815: step 20, duration = 3.145
2018-11-23 21:43:03.010340: step 30, duration = 3.050
2018-11-23 21:43:35.451022: step 40, duration = 3.020
2018-11-23 21:44:06.066204: step 50, duration = 3.015
2018-11-23 21:44:38.213616: step 60, duration = 3.261
2018-11-23 21:45:10.214312: step 70, duration = 3.231
2018-11-23 21:45:40.544145: step 80, duration = 3.060
2018-11-23 21:46:11.332853: step 90, duration = 3.070
2018-11-23 21:46:38.824652: Forward across 100 steps, 3.170 +/- 0.180 sec / batch
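To check the ~25M parameter figure quoted above, one can sum the sizes of all trainable variables right after the benchmark graph has been built (a small sketch, not part of the original post):

# count the trainable parameters of the graph built above
total_params = 0
for v in tf.trainable_variables():
    n = 1
    for dim in v.get_shape().as_list():
        n *= dim
    total_params += n
print('trainable parameters: %d' % total_params)  # roughly 25 million, per the figure above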

I ran this on a CPU, which is fairly slow; for reasons of space, backward (training) performance is not benchmarked here. Inception V3 contains many CNN design ideas and tricks worth borrowing:

(1) Factorization into small convolutions is very effective: it reduces the parameter count, mitigates overfitting, and increases the network's non-linear expressive power.

(2) From input to output, a convolutional network should gradually shrink the image size and gradually increase the number of output channels, i.e., simplify the spatial structure while converting spatial information into higher-order abstract feature information.

(3) The Inception Module's approach of using multiple branches to extract higher-order features at different levels of abstraction is very effective and enriches the network's expressive power.

References:

黄文坚 (Huang Wenjian), TensorFlow实战 (TensorFlow in Action)