
1 AlexNet

《ImageNet Classification with Deep Convolutional Neural Networks》
https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

1.1 Overview and Training

Dataset: ImageNet LSVRC-2010, 1.2 million images, 1000 classes
Error rates: top-1 37.5%, top-5 17.0%
Architecture: 5 convolutional layers + 3 pooling layers + 3 fully connected layers
Overfitting prevention: Dropout + data augmentation

Data preprocessing: 1. Resize the short side of the image to 256. 2. Center-crop the other side to 256 pixels, giving [256, 256]. 3. Subtract the mean of each channel.

Training data: 1. Start from the [256, 256, 3] image. 2. Randomly crop a [224, 224, 3] patch. 3. Apply horizontal flips.

Test data: 1. Extract five [224, 224] crops: one anchored at each of the four corners of the image and one at the center. 2. Mirror each of the five crops horizontally. 3. Average the predictions over the ten patches to obtain the final result.
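
A minimal sketch of this ten-crop evaluation (the helper is hypothetical; `predict` stands for any function mapping a [224, 224, 3] patch to class probabilities):

import numpy as np

def ten_crop_predict(img, predict, crop=224):
    # img: [256, 256, 3] array; average predictions over the 4 corner crops,
    # the center crop, and the horizontal mirror of each (10 patches total)
    h, w = img.shape[:2]
    offsets = [(0, 0), (0, w - crop), (h - crop, 0), (h - crop, w - crop),
               ((h - crop) // 2, (w - crop) // 2)]
    preds = []
    for y, x in offsets:
        patch = img[y:y + crop, x:x + crop]
        preds.append(predict(patch))
        preds.append(predict(patch[:, ::-1]))  # mirrored patch
    return np.mean(preds, axis=0)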

Weight initialization: the weights of every layer are drawn from a Gaussian N(0, 0.01). The biases of the 2nd, 4th, and 5th convolutional layers and of the fully connected layers are initialized to 1; all other biases are initialized to 0.
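
A minimal sketch of this initialization in Keras (the layer shown is illustrative):

from keras.initializers import RandomNormal, Constant
from keras.layers import Conv2D

# conv layers 2/4/5 and the FC layers get bias 1; all other layers get bias 0
conv = Conv2D(filters=128, kernel_size=(5,5), padding='same',
              kernel_initializer=RandomNormal(mean=0.0, stddev=0.01),
              bias_initializer=Constant(1.0))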

Training: 1. Mini-batch SGD (batch_size=128). 2. Momentum optimizer (mu=0.9). 3. L2 regularization (lambda=0.0005, the regularization coefficient). 4. The same learning rate for all layers, lr=0.01; whenever the loss stops decreasing, lr is divided by 10 (this happened three times over the course of training). 5. img_num=1,200,000, epochs=90.
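
A minimal sketch of this setup in Keras; `model` and the data arrays are assumed to exist, and the plateau patience is an assumption (the paper reduced the learning rate manually):

from keras.optimizers import SGD
from keras.callbacks import ReduceLROnPlateau

# the L2 penalty (lambda=0.0005) would be attached per layer, e.g.
# Conv2D(..., kernel_regularizer=keras.regularizers.l2(0.0005))
sgd = SGD(lr=0.01, momentum=0.9)  # same learning rate for all layers
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])

# divide lr by 10 whenever the validation loss plateaus (3 times in the paper)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5)
model.fit(x_train, y_train, batch_size=128, epochs=90,
          validation_data=(x_val, y_val), callbacks=[reduce_lr])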

1.2 AlexNet Innovations

  1. ReLU nonlinearity. Simple to compute, which speeds up training --> when the input is greater than 0 the derivative is 1, alleviating vanishing gradients --> when the input is less than 0 the gradient is 0, deactivating some neurons, which acts as a regularizer but also discards some information.
  2. Multi-GPU training. Splits the kernels across GPUs (halving the parameters per GPU) + speeds up training. Experimentally, the two-GPU network reduces the top-1 and top-5 error rates by 1.7% and 1.2% compared with a one-GPU network with half as many kernels in each layer. Of course, the half-size one-GPU network is a different architecture, so some improvement from the two-GPU network is not surprising.
  3. Local Response Normalization (LRN). An activation larger than its neighbors is relatively amplified; a smaller one is suppressed. Reduces the top-1 and top-5 error rates by 1.4% and 1.2%.
  4. Overlapping MaxPooling (strides < kernel_size). Avoids the blurring effect of average pooling. Experiments show overlapping pooling beats the traditional kind, improving top-1 and top-5 by 0.4% and 0.3%, and it slightly reduces overfitting during training, but it also increases computation and introduces redundant information.
  5. Dropout. Neurons are randomly deactivated, reducing co-adaptation between neurons so that more independent, robust features are extracted.
  6. Data augmentation. Random crops + mirror flips + PCA on the RGB channels, perturbing the principal components with N(0, 0.1) Gaussian noise; lowers the top-1 error rate by 1%. The color perturbation is given below.

I_{xy} = [I_{xy}^R, I_{xy}^G, I_{xy}^B]^T + [p_1, p_2, p_3][\alpha_1\lambda_1, \alpha_2\lambda_2, \alpha_3\lambda_3]^T

\alpha_i \sim N(0, 0.1)

where p_i and \lambda_i are the i-th eigenvector and eigenvalue of the 3x3 covariance matrix of RGB pixel values.

2 Network Architecture


input[227,227,3]
↓
Conv1(k=11,f=2*48,s=4) + ReLU       [55,55,96]
↓
MaxPool1(k=3,s=2)                   [27,27,96]
↓
Norm1(local_size=5)
↓
Conv2(k=5,f=2*128,s=1,p=2) + ReLU   [27,27,256]
↓
MaxPool2(k=3,s=2)                   [13,13,256]
↓
Norm2(local_size=5)
↓
Conv3(k=3,f=2*192,s=1,p=1) + ReLU   [13,13,384]   concat
↓
Conv4(k=3,f=2*192,s=1,p=1) + ReLU   [13,13,384]
↓
Conv5(k=3,f=2*128,s=1,p=1) + ReLU   [13,13,256]
↓
MaxPool3(k=3,s=2)                   [6,6,256]
↓
FC(4096) + ReLU                                   concat
↓
Dropout(rate=0.5)
↓
FC(4096) + ReLU                                   concat
↓
Dropout(rate=0.5)
↓
FC(1000) + Softmax                                concat

2.1 Code (LRN not included)

from keras.layers import Input,Conv2D, Concatenate,Flatten, MaxPooling2D,Dense,Dropout
from keras.models import Model

def alexNet(input):
    # stream 1 (GPU 1 in the paper): conv1 -> pool1 -> conv2 -> pool2
    x1 = Conv2D(filters=48, kernel_size=(11,11),
                activation='relu',
                padding='valid',
                strides= 4,
                name='conv1_1')(input)
    x1 = MaxPooling2D(pool_size=(3,3),strides=(2, 2),padding='valid')(x1)
    x1 = Conv2D(filters=128, kernel_size=(5,5),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv1_2')(x1)
    x1 = MaxPooling2D(pool_size=(3,3),strides=(2, 2),padding='valid')(x1)

    # stream 2 (GPU 2 in the paper): same structure as stream 1
    x2 = Conv2D(filters=48, kernel_size=(11,11),
                activation='relu',
                padding='valid',
                strides= 4,
                name='conv2_1')(input)
    x2 = MaxPooling2D(pool_size=(3,3),strides=(2, 2),padding='valid')(x2)
    x2 = Conv2D(filters=128, kernel_size=(5,5),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv2_2')(x2)
    x2 = MaxPooling2D(pool_size=(3,3),strides=(2, 2),padding='valid')(x2)

    # cross-stream connection: conv3 sees the feature maps of both streams
    x = Concatenate(axis=3)([x1,x2])
    x1 = Conv2D(filters=192, kernel_size=(3,3),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv3_1')(x)
    x1 = Conv2D(filters=192, kernel_size=(3,3),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv4_1')(x1)
    x1 = Conv2D(filters=128, kernel_size=(3,3),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv5_1')(x1)
    x1 = MaxPooling2D(pool_size=(3,3),strides=(2, 2),padding='valid')(x1)

    x2 = Conv2D(filters=192, kernel_size=(3,3),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv3_2')(x)
    x2 = Conv2D(filters=192, kernel_size=(3,3),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv4_2')(x2)
    x2 = Conv2D(filters=128, kernel_size=(3,3),
                activation='relu',
                padding='same',
                strides= 1,
                name='conv5_2')(x2)
    x2 = MaxPooling2D(pool_size=(3,3),strides=(2, 2),padding='valid')(x2)

    # merge the two streams again before the fully connected layers
    x = Concatenate(axis=3)([x1,x2])
    x = Flatten()(x)


    # first FC layer: 2 x 2048 units concatenated on the feature axis -> 4096
    x1 = Dense(2048,activation='relu')(x)
    x2 = Dense(2048,activation='relu')(x)
    x = Concatenate(axis=-1)([x1,x2])  # axis=0 would wrongly stack along the batch axis
    x = Dropout(rate=0.5)(x)


    # second FC layer: same split-and-concatenate structure
    x1 = Dense(2048,activation='relu')(x)
    x2 = Dense(2048,activation='relu')(x)
    x = Concatenate(axis=-1)([x1,x2])
    x = Dropout(rate=0.5)(x)

    x = Dense(1000,activation='softmax')(x)
    return x



if __name__ == '__main__':
    input = Input(shape=[227,227,3])
    output = alexNet(input)
    model = Model(input,output)
    model.summary()

'''
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 227, 227, 3)  0                                            
__________________________________________________________________________________________________
conv1_1 (Conv2D)                (None, 55, 55, 48)   17472       input_1[0][0]                    
__________________________________________________________________________________________________
conv2_1 (Conv2D)                (None, 55, 55, 48)   17472       input_1[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 27, 27, 48)   0           conv1_1[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)  (None, 27, 27, 48)   0           conv2_1[0][0]                    
__________________________________________________________________________________________________
conv1_2 (Conv2D)                (None, 27, 27, 128)  153728      max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2_2 (Conv2D)                (None, 27, 27, 128)  153728      max_pooling2d_3[0][0]            
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 13, 13, 128)  0           conv1_2[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D)  (None, 13, 13, 128)  0           conv2_2[0][0]                    
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 13, 13, 256)  0           max_pooling2d_2[0][0]            
                                                                 max_pooling2d_4[0][0]            
__________________________________________________________________________________________________
conv3_1 (Conv2D)                (None, 13, 13, 192)  442560      concatenate_1[0][0]              
__________________________________________________________________________________________________
conv3_2 (Conv2D)                (None, 13, 13, 192)  442560      concatenate_1[0][0]              
__________________________________________________________________________________________________
conv4_1 (Conv2D)                (None, 13, 13, 192)  331968      conv3_1[0][0]                    
__________________________________________________________________________________________________
conv4_2 (Conv2D)                (None, 13, 13, 192)  331968      conv3_2[0][0]                    
__________________________________________________________________________________________________
conv5_1 (Conv2D)                (None, 13, 13, 128)  221312      conv4_1[0][0]                    
__________________________________________________________________________________________________
conv5_2 (Conv2D)                (None, 13, 13, 128)  221312      conv4_2[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D)  (None, 6, 6, 128)    0           conv5_1[0][0]                    
__________________________________________________________________________________________________
max_pooling2d_6 (MaxPooling2D)  (None, 6, 6, 128)    0           conv5_2[0][0]                    
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 6, 6, 256)    0           max_pooling2d_5[0][0]            
                                                                 max_pooling2d_6[0][0]            
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 9216)         0           concatenate_2[0][0]              
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 2048)         18876416    flatten_1[0][0]                  
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 2048)         18876416    flatten_1[0][0]                  
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 4096)         0           dense_1[0][0]                    
                                                                 dense_2[0][0]                    
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 4096)         0           concatenate_3[0][0]              
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 2048)         8390656     dropout_1[0][0]                  
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 2048)         8390656     dropout_1[0][0]                  
__________________________________________________________________________________________________
concatenate_4 (Concatenate)     (None, 4096)         0           dense_3[0][0]                    
                                                                 dense_4[0][0]                    
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 4096)         0           concatenate_4[0][0]              
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 1000)         4097000     dropout_2[0][0]                  
==================================================================================================
Total params: 60,965,224
Trainable params: 60,965,224
Non-trainable params: 0
__________________________________________________________________________________________________

Process finished with exit code 0
'''
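
The code above omits the Norm1/Norm2 layers from the diagram. A minimal sketch of adding them, assuming a TensorFlow backend; depth_radius=2 corresponds to the paper's n=5 (radius = (n-1)/2), and bias/alpha/beta are the paper's k=2, alpha=1e-4, beta=0.75:

import tensorflow as tf
from keras.layers import Lambda

def lrn(x):
    # local response normalization with the constants from the paper
    return tf.nn.local_response_normalization(
        x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75)

# usage: insert after each of the first two pooling layers, e.g.
# x1 = Lambda(lrn, name='norm1_1')(x1)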

3 The Dropout Function

  1. Dropout: the paper notes that averaging the predictions of many models gives more accurate results, but training many models is expensive. Dropout solves this cheaply: randomly deactivating neurons on each training pass is equivalent to training a different sub-model each time. Formally:
r_j^{(l)} \sim \mathrm{Bernoulli}(p)

\tilde{y}^{(l)} = r^{(l)} \odot y^{(l)}

z_i^{(l+1)} = w_i^{(l+1)} \tilde{y}^{(l)} + b_i^{(l+1)}

y_i^{(l+1)} = f(z_i^{(l+1)})

Training: each neuron in layer l is kept with probability p.
Testing: every weight w is multiplied by p, so the expected value of each neuron's input stays the same as during training.
Dropout has a regularizing effect: it reduces the correlation between neurons.
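
Taking the expectation over the Bernoulli mask makes the test-time scaling explicit:

E[z_i^{(l+1)}] = w_i^{(l+1)} E[\tilde{y}^{(l)}] + b_i^{(l+1)} = (p\, w_i^{(l+1)})\, y^{(l)} + b_i^{(l+1)}

so evaluating with the scaled weights pw reproduces the expected training-time pre-activation.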

  2. Code (reference: https://zhuanlan.zhihu.com/p/38200980)
# implementation of the dropout function (inverted dropout: the scaling is
# done at training time, so nothing needs to be rescaled at test time)
import numpy as np

def dropout(x, level):
    if level < 0. or level >= 1:  # level is the drop probability, must be in [0, 1)
        raise ValueError('Dropout level must be in interval [0, 1[.')
    retain_prob = 1. - level

    # binomial generates a mask with the same shape as x. Think of each neuron
    # as a coin flip: it comes up heads (kept) with probability retain_prob;
    # n=1 because each neuron is flipped once, and size is the number of coins.
    random_tensor = np.random.binomial(n=1, p=retain_prob, size=x.shape)
    # a 0/1 vector: 0 means the neuron is masked out, i.e. dropped
    print(random_tensor)

    x *= random_tensor
    print(x)
    x /= retain_prob  # scale up so the expected value of x is unchanged

    return x

# a quick test: run the function on a vector x to see the effect of dropout
x = np.asarray([1,2,3,4,5,6,7,8,9,10], dtype=np.float32)
dropout(x, 0.4)

4 Data Processing

Resize the shortest side to 224 with the aspect ratio unchanged + random crop + mirror flip + PCA on the RGB channels with an N(0, 0.1) Gaussian perturbation along the principal components.

# 1 Resize the shortest side, keeping the aspect ratio unchanged
from PIL import Image

def resize_image(image, size):
    iw, ih = image.size
    w, h = size
    scale = max(w/iw, h/ih)  # max: the *shortest* side reaches the target size
    nw = int(iw*scale)
    nh = int(ih*scale)
    image = image.resize((nw,nh), Image.BICUBIC)
    return image

# 2 Random crop
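
# A minimal sketch (an assumption, not from the original post): crop a random
# 224 x 224 region from a PIL image whose sides are both at least 224 pixels.
import random

def random_crop(image, crop=224):
    iw, ih = image.size
    x = random.randint(0, iw - crop)
    y = random.randint(0, ih - crop)
    return image.crop((x, y, x + crop, y + crop))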

# 3 Mirror flip
image = image.transpose(Image.FLIP_LEFT_RIGHT)

# 4 PCA on the RGB channels, perturbing the principal components with N(0, 0.1) Gaussian noise
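
# A minimal sketch of the PCA color jitter; the helper name and the per-image
# covariance are assumptions (the paper computes PCA once over the training set).
import numpy as np

def pca_color_jitter(img):
    # img: float array [H, W, 3] with values in [0, 1]
    flat = img.reshape(-1, 3)
    cov = np.cov(flat, rowvar=False)     # 3x3 covariance of the RGB channels
    lam, p = np.linalg.eigh(cov)         # eigenvalues lam_i, eigenvectors p_i
    alpha = np.random.normal(0, 0.1, 3)  # alpha_i ~ N(0, 0.1)
    delta = p @ (alpha * lam)            # [p1,p2,p3][a1*l1, a2*l2, a3*l3]^T
    return np.clip(img + delta, 0, 1)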

5 Key Takeaways from the Paper

  1. Dropout
  2. Data augmentation
  3. Grouped convolution