Image Classification (AlexNet, VGG, GoogLeNet V1-V3)

1. Preliminaries

① The convolution operation (a convolution kernel is really a volume: a 64*64*3 input passed through 100 kernels of 3*3 gives 64*64*100, and passing that through a single 3*3 kernel gives 64*64*1; with padding=1 and stride=1 the spatial size is unchanged by the convolution)

The point I want to make is that a kernel's depth defaults to the number of channels of its input.

② The pooling operation (downsampling / dimensionality reduction); both operations are shape-checked in the sketch below.
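A minimal shape check of the example above, assuming TensorFlow 1.x with tf.contrib.slim (the same style as the Inception V3 source quoted later in these notes):

import tensorflow as tf
import tensorflow.contrib.slim as slim

# A dummy batch holding one 64*64 RGB image.
images = tf.placeholder(tf.float32, [1, 64, 64, 3])

# 100 kernels of 3*3: each kernel is really 3*3*3 (depth = input channels),
# and each kernel produces one output channel. SAME padding with stride 1
# is the p=1, s=1 case, so the spatial size stays 64*64.
net = slim.conv2d(images, 100, [3, 3], stride=1, padding='SAME')  # -> 1 x 64 x 64 x 100
# One 3*3 kernel (now of depth 100) collapses the result to a single channel.
net = slim.conv2d(net, 1, [3, 3], stride=1, padding='SAME')       # -> 1 x 64 x 64 x 1
# Pooling shrinks only the spatial dimensions; channels are untouched.
pooled = slim.max_pool2d(net, [2, 2], stride=2)                   # -> 1 x 32 x 32 x 1

print(net.get_shape())     # (1, 64, 64, 1)
print(pooled.get_shape())  # (1, 32, 32, 1)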

2. AlexNet

① 5 convolutional layers + 3 fully connected layers (each of the 5 convolutions is followed by an activation)

② ReLU non-linear activation

③ Max pooling (after layers 1, 2, and 5)

④ Dropout regularization (on Fc1 and Fc2)

Speeds up training, prevents overfitting, and improves the model's generalization.

⑤ LRN (local response normalization, used after layers 1 and 2)

An example: k, α, β are hyperparameters; take i=10, N=96, n=5 (the neighborhood size). Let a be the feature at position (x, y) in the output of kernel i=10 (i.e., channel 10); its normalized value is b = a / (k + α · Σ_j a_j²)^β, where j runs over the same (x, y) position in channels max(0, i-n/2) through min(N-1, i+n/2), here channels 8, 9, 10, 11, 12.
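A minimal numpy sketch of the formula (the values k=2, n=5, α=1e-4, β=0.75 are the AlexNet paper's hyperparameters, assumed here for concreteness):

import numpy as np

def lrn(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    # a: feature map of shape (H, W, N); each channel i is normalized by the
    # sum of squares over channels [i - n//2, i + n//2], clipped to [0, N-1].
    H, W, N = a.shape
    b = np.empty_like(a)
    for i in range(N):
        lo, hi = max(0, i - n // 2), min(N - 1, i + n // 2)
        denom = (k + alpha * np.sum(a[:, :, lo:hi + 1] ** 2, axis=2)) ** beta
        b[:, :, i] = a[:, :, i] / denom
    return b

a = np.random.rand(55, 55, 96).astype(np.float32)  # N=96 channels, as in conv1
out = lrn(a)  # channel i=10 is normalized using only channels 8..12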

Stage             | Input     | Kernels                     | Conv output         | Activation | Pool (3*3, s=2) | LRN
Conv1             | 224*224*3 | 48 of 11*11, s=4 (2 groups) | 55*55*48 (per GPU)  | ReLU       | 27*27*48        | yes
Conv2             | 27*27*48  | 128 of 5*5, s=1, p=2        | 27*27*128 (per GPU) | ReLU       | 13*13*128       | yes
Conv3 (cross-GPU) | 13*13*256 | 192 of 3*3, s=1, p=1        | 13*13*192 (per GPU) | ReLU       | -               | -
Conv4             | 13*13*192 | 192 of 3*3, s=1, p=1        | 13*13*192 (per GPU) | ReLU       | -               | -
Conv5             | 13*13*192 | 128 of 3*3, s=1, p=1        | 13*13*128 (per GPU) | ReLU       | 6*6*128         | -
Fc1               | 6*6*128*2 | output: 4096                |                     |            |                 |
Fc2               | 4096      | output: 4096                |                     |            |                 |
Fc3               | 4096      | output: 1000                |                     |            |                 |
SoftMax           | 1000      | output: 1000                |                     |            |                 |

(Per-GPU numbers: AlexNet was split across two GPUs, so each row lists one group's kernels; Conv3 is the layer where the two GPUs exchange feature maps.)
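The table can be reproduced as a single-tower TF-Slim sketch (an assumed reconstruction, not the original two-GPU code: the two groups are merged, so kernel counts are doubled to 96/256/384/384/256, and the input is taken as 227*227 so the Conv1 arithmetic comes out to 55):

import tensorflow as tf
import tensorflow.contrib.slim as slim

def alexnet_sketch(images, num_classes=1000, is_training=True):
    # images: [batch, 227, 227, 3]
    net = slim.conv2d(images, 96, [11, 11], stride=4, padding='VALID', scope='conv1')  # 55*55*96
    net = tf.nn.local_response_normalization(net, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75)
    net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool1')   # 27*27*96
    net = slim.conv2d(net, 256, [5, 5], scope='conv2')            # 27*27*256 (p=2 via SAME)
    net = tf.nn.local_response_normalization(net, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75)
    net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool2')   # 13*13*256
    net = slim.conv2d(net, 384, [3, 3], scope='conv3')            # 13*13*384
    net = slim.conv2d(net, 384, [3, 3], scope='conv4')            # 13*13*384
    net = slim.conv2d(net, 256, [3, 3], scope='conv5')            # 13*13*256
    net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool5')   # 6*6*256
    net = slim.flatten(net)
    net = slim.fully_connected(net, 4096, scope='fc1')
    net = slim.dropout(net, 0.5, is_training=is_training)         # dropout on Fc1
    net = slim.fully_connected(net, 4096, scope='fc2')
    net = slim.dropout(net, 0.5, is_training=is_training)         # dropout on Fc2
    return slim.fully_connected(net, num_classes, activation_fn=None, scope='fc3')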

3. VGG (kernel factorization: 5*5 -> two 3*3, 7*7 -> three 3*3)

① VGG's main idea is kernel factorization: every large kernel is decomposed into stacks of 3*3 kernels (a parameter-count check follows the table below).

vgg16               | vgg19
conv-conv-pool      | conv-conv-pool
conv-conv-pool      | conv-conv-pool
conv-conv-conv-pool | conv-conv-conv-conv-pool
conv-conv-conv-pool | conv-conv-conv-conv-pool
conv-conv-conv-pool | conv-conv-conv-conv-pool
fc-fc-fc            | fc-fc-fc
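Why the factorization helps, in numbers: two stacked 3*3 convolutions cover the same 5*5 receptive field as one 5*5 convolution, but with fewer parameters and one extra non-linearity. A quick check, assuming C input and C output channels and ignoring biases:

def conv_params(k, c_in, c_out):
    # parameters of one k*k convolution layer, no bias
    return k * k * c_in * c_out

C = 256
print(conv_params(5, C, C))        # one 5*5 layer:    25*C*C = 1638400
print(2 * conv_params(3, C, C))    # two 3*3 layers:   18*C*C = 1179648
print(conv_params(7, C, C))        # one 7*7 layer:    49*C*C = 3211264
print(3 * conv_params(3, C, C))    # three 3*3 layers: 27*C*C = 1769472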

4. GoogLeNet

① Inception V1

Key ideas: multi-scale convolution kernels that widen the network; 1*1 convolutions for dimensionality reduction; auxiliary classifiers; and removal of the fully connected layers from the main classifier (global average pooling is used instead).

Concretely: 9 Inception modules, 2 auxiliary classifiers, and 1 main classifier, placed after the 3rd, 6th, and 9th modules respectively; the main classifier has no fully connected layers; the stem (preprocessing) also uses LRN (local response normalization).

A side note: I had only ever seen 2*2/s=2 max pooling before, not overlapping pooling such as 3*3/s=2, and was unsure how the padding works. There are two modes, SAME and VALID: with SAME, if the window does not divide evenly, the input is zero-padded; with VALID, the remainder is discarded (SAME is used throughout here). For example: a 3*3 input through 2*2/s=2 pooling gives 2*2 with SAME but 1*1 with VALID; through 2*2/s=1 pooling it gives 3*3 with SAME but 2*2 with VALID. (The snippet below verifies these sizes.)
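A minimal TF 1.x check of the padding arithmetic above:

import numpy as np
import tensorflow as tf

x = tf.constant(np.arange(9, dtype=np.float32).reshape(1, 3, 3, 1))  # one 3*3 single-channel map

same_s2 = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
valid_s2 = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
same_s1 = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='SAME')
valid_s1 = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='VALID')

print(same_s2.shape, valid_s2.shape)  # (1, 2, 2, 1) (1, 1, 1, 1)
print(same_s1.shape, valid_s1.shape)  # (1, 3, 3, 1) (1, 2, 2, 1)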

Input image: 224*224*3 -> 64 kernels 7*7/s=2/p=3 -> 112*112*64 -> pool -> 56*56*64 -> LRN -> 64 of 1*1/s=1 (dimensionality reduction) -> 56*56*64 -> 192 kernels 3*3/s=1/p=1 -> 56*56*192 -> pool -> 28*28*192 (the stem ends here; the first Inception module begins). Input: 28*28*192

64 kernels 1*1 -> 28*28*64

96 kernels 1*1 (reduction before the 3*3) -> 28*28*96 -> 128 kernels 3*3 (p=1) -> 28*28*128

16 kernels 1*1 (reduction before the 5*5) -> 28*28*16 -> 32 kernels 5*5 (p=2) -> 28*28*32

3*3/s=1 pooling -> 28*28*192 -> 32 kernels 1*1 -> 28*28*32 (stride 1 with SAME padding, so pooling keeps the size)

Output: 28*28*256 (64+128+32+32)
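The module above writes naturally as a TF-Slim sketch (assumed code in the style of the Inception V3 source quoted later; the kernel counts are those of GoogLeNet's first module, inception 3a):

import tensorflow as tf
import tensorflow.contrib.slim as slim

def inception_v1_module(net):
    # net: 28*28*192
    with slim.arg_scope([slim.conv2d, slim.max_pool2d], stride=1, padding='SAME'):
        with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, 64, [1, 1])        # 28*28*64
        with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, 96, [1, 1])        # 28*28*96  (reduction)
            branch_1 = slim.conv2d(branch_1, 128, [3, 3])  # 28*28*128
        with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, 16, [1, 1])        # 28*28*16  (reduction)
            branch_2 = slim.conv2d(branch_2, 32, [5, 5])   # 28*28*32
        with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, [3, 3])        # 28*28*192 (s=1, SAME)
            branch_3 = slim.conv2d(branch_3, 32, [1, 1])   # 28*28*32
    # Concatenate along the channel axis: 64+128+32+32 = 256
    return tf.concat([branch_0, branch_1, branch_2, branch_3], 3)  # 28*28*256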

② Inception V2

Key ideas: on top of V1, kernel factorization (each 5*5 kernel decomposed into two 3*3 kernels) and Batch Normalization.

BN (Batch Normalization)

During training it keeps each layer's inputs on a fixed distribution, which improves convergence. It is a whitening-style operation: each layer's output is normalized toward N(0, 1), a standard normal distribution.

For example: with Batch=32 and 64 kernels, there are 64 normalization operations; each kernel's responses over the 32 images are normalized together as one group.
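Concretely, as a numpy sketch (ε and the learnable scale γ and shift β are part of BN; γ=1, β=0 are assumed here for simplicity):

import numpy as np

x = np.random.randn(32, 27, 27, 64)    # Batch=32, 64 channels (one per kernel)

# One normalization per channel: mean/variance over the batch and spatial axes,
# i.e. over all 32 images' responses of that kernel.
mean = x.mean(axis=(0, 1, 2))          # shape (64,)
var = x.var(axis=(0, 1, 2))            # shape (64,)
x_hat = (x - mean) / np.sqrt(var + 1e-5)

# Each channel is now approximately N(0, 1); BN then applies a learnable
# scale and shift: y = gamma * x_hat + beta.
print(x_hat.mean(axis=(0, 1, 2))[:3], x_hat.var(axis=(0, 1, 2))[:3])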

③ Inception V3

Idea: this is the biggest change; V1 -> V2 changed little, while V3 introduces 3 kinds of Inception modules and asymmetric kernel factorization (e.g., an n*n convolution split into 1*n followed by n*1). It also drops V1's 2 auxiliary classifiers and adds a single new one at a different position. (The parameter table in the paper does not match the actual code; there are 11 Inception modules in total, and the input/output of each is fully annotated in the code below.) V3 also optimizes pooling: it increases the channel count while downsampling. Instead of the usual serial conv-then-pool, a dedicated Inception module (the 4th and the 9th) replaces the pooling step.
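The asymmetric factorization shows up in the code below as [1, 7] followed by [7, 1] convolutions. A quick sketch of why it is cheap (an assumed example with 192 channels in and out):

import tensorflow as tf
import tensorflow.contrib.slim as slim

x = tf.placeholder(tf.float32, [1, 17, 17, 192])
# Two cheap asymmetric convs give a 7*7 receptive field; SAME padding keeps 17*17.
y = slim.conv2d(x, 192, [1, 7], padding='SAME')
y = slim.conv2d(y, 192, [7, 1], padding='SAME')
print(y.get_shape())  # (1, 17, 17, 192)
# Parameters: (1*7 + 7*1)*192*192 = 14*192*192, versus 49*192*192 for a
# single 7*7 convolution -- about 3.5x fewer.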

Type        | Input      | Kernels            | Output
Conv        | 299*299*3  | 32 of 3*3/s=2      | 149*149*32
Conv        | 149*149*32 | 32 of 3*3/s=1      | 147*147*32
Conv        | 147*147*32 | 64 of 3*3/s=1, p=1 | 147*147*64
Pool        | 147*147*64 | 3*3/s=2            | 73*73*64
Conv        | 73*73*64   | 80 of 1*1          | 73*73*80
Conv        | 73*73*80   | 192 of 3*3/s=1     | 71*71*192
Pool        | 71*71*192  | 3*3/s=2            | 35*35*192
3*Inception | 35*35*192  | see code           | 35*35*288
5*Inception | 35*35*288  | see code           | 17*17*768
3*Inception | 17*17*768  | see code           | 8*8*2048
Pool        | 8*8*2048   | 8*8                | 1*1*2048
Linear      | 1*1*2048   |                    | 1*1*1000
Softmax     | 1*1*1000   |                    | 1*1*1000

(The last Inception stage has 3 modules, Mixed_7a-7c, which makes the total 3+5+3 = 11.)
import tensorflow as tf
import tensorflow.contrib.slim as slim

def inception_v3_base(inputs, scope=None):
    end_points = {}  # dict of selected intermediate tensors, kept for later use
    with tf.variable_scope(scope, 'InceptionV3', [inputs]):
        # Set default stride and padding for these three ops
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='VALID'):
            # The Inception V3 network proper.
            # Input: 299 x 299 x 3
            net = slim.conv2d(inputs, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3')
            # Output: 149 x 149 x 32
            net = slim.conv2d(net, 32, [3, 3], scope='Conv2d_2a_3x3')
            # Output: 147 x 147 x 32
            net = slim.conv2d(net, 64, [3, 3], padding='SAME', scope='Conv2d_2b_3x3')
            # Output: 147 x 147 x 64
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_3a_3x3')
            # Output: 73 x 73 x 64
            net = slim.conv2d(net, 80, [1, 1], scope='Conv2d_3b_1x1')
            # Output: 73 x 73 x 80
            net = slim.conv2d(net, 192, [3, 3], scope='Conv2d_4a_3x3')
            # Output: 71 x 71 x 192
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_5a_3x3')
        # Output: 35 x 35 x 192
        '''The code above is 5 conv layers plus 2 pooling layers that compress
        the spatial size of the input. The flow diagram given in the paper does
        not match this source code -- follow the source code!
        Inception V3 proposes 3 module types; the modules in the source are not
        exactly identical, only structurally similar, with different kernel counts.
        '''
        # Inception modules
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'):
            # Module type 1 (first of three; 4 branches; input: 35*35*192)
            with tf.variable_scope('Mixed_5b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                # Output: 35*35*64
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    # Output: 35*35*48
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                # Output: 35*35*64
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    # Output: 35*35*64
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    # Output: 35*35*96
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                # Output: 35*35*96
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    # Output: 35*35*192
                    branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
                # Output: 35*35*32
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 35*35*256 (64+64+96+32)
            # Module type 1 (second of three; 4 branches; input: 35*35*256)
            with tf.variable_scope('Mixed_5c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                # Output: 35*35*64
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0b_1x1')
                    # Output: 35*35*48
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv_1_0c_5x5')
                # Output: 35*35*64
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    # Output: 35*35*64
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    # Output: 35*35*96
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                # Output: 35*35*96
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    # Output: 35*35*192
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                # Output: 35*35*64
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 35*35*288 (64+64+96+64)
            # Module type 1 (third of three; 4 branches; input: 35*35*288)
            with tf.variable_scope('Mixed_5d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 35*35*288 (64+64+96+64)

            # Module type 2 (first; 3 branches; input: 35*35*288) (effectively a pooling step: it downsamples while adding channels)
            with tf.variable_scope('Mixed_6a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 384, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_1x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', scope='MaxPool_1a_3x3')
                net = tf.concat([branch_0, branch_1, branch_2], 3)
            # Output: 17 x 17 x 768 (384+96+288)
            # Module type 2 (second; 4 branches; input: 17*17*768)
            with tf.variable_scope('Mixed_6b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 128, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 128, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 17*17*768 (192+192+192+192; note the average pooling here)
            # Module type 2 (third; 4 branches; input: 17*17*768)
            with tf.variable_scope('Mixed_6c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 17*17*768
            # Module type 2 (fourth; 4 branches; input: 17*17*768)
            with tf.variable_scope('Mixed_6d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 17*17*768
            # Module type 2 (fifth; 4 branches; input: 17*17*768)
            with tf.variable_scope('Mixed_6e'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 192, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
                # Output: 17*17*768
                end_points['Mixed_6e'] = net
            # The auxiliary classifier branches off here (from Mixed_6e)
            # Module type 3 (first; 3 branches; input: 17*17*768) (effectively a pooling step)
            with tf.variable_scope('Mixed_7a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_0 = slim.conv2d(branch_0, 320, [3, 3], stride=2,
                                           padding='VALID', scope='Conv2d_1a_3x3')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                    branch_1 = slim.conv2d(branch_1, 192, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_3x3')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', scope='MaxPool_1a_3x3')
                net = tf.concat([branch_0, branch_1, branch_2], 3)
            # Output: 8*8*(320+192+768) = 8*8*1280
            # Module type 3 (second; 4 branches; input: 8*8*1280); the difference: the branches contain sub-branches
            with tf.variable_scope('Mixed_7b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):  # this branch itself splits into two sub-branches
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([
                        slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                        slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0b_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(
                        branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([
                        slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                        slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 8*8*2048 (320+(384+384)+(384+384)+192)
            # Module type 3 (third; 4 branches; input: 8*8*2048)
            with tf.variable_scope('Mixed_7c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([
                        slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                        slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0c_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(
                        branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([
                        slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                        slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(
                        branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # Output: 8*8*2048 (320+768+768+192)
            return net, end_points

 
