III. Convolutional Neural Network Architectures and Their Evolution -- Deep Learning EECS498/CS231n

AlexNet

(figure)

Output size:

The number of output channels equals the number of filters: 64.

H_out = W_out = (H - K + 2P)/S + 1 = (227 - 11 + 2x2)/4 + 1 = 56

Memory(KB):

Number of output elements = C_out x H_out x W_out = 64 x 56 x 56 = 200704; bytes per element = 4 (for 32-bit floating point); memory = 200704 x 4 / 1024 = 784 KB

Parameters(k):

Weight shape = C_out x C_in x K x K = 64 x 3 x 11 x 11

Bias shape = 64

Number of parameters = 64 x 3 x 11 x 11 + 64 = 23296 (weights plus biases)

FLOPs (M): important!

Number of floating-point operations, counting one multiply-add as a single op (since it can be done in one cycle)

= (number of output elements) x (ops per element)

= (C_out x H_out x W_out) x (C_in x K x K) = (64 x 56 x 56) x (3 x 11 x 11) = 72855552
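The arithmetic above can be wrapped in a small helper. Below is a minimal Python sketch (the function name and layout are my own, not from the course) that reproduces the conv1 numbers for output size, activation memory, parameter count, and FLOPs.

```python
def conv2d_stats(h_in, c_in, c_out, k, stride, pad):
    """Per-layer stats for a square conv layer, float32 activations assumed."""
    # Output spatial size: (H - K + 2P) / S + 1
    h_out = (h_in - k + 2 * pad) // stride + 1
    # Memory of the output activations in KB (4 bytes per float32 element)
    mem_kb = c_out * h_out * h_out * 4 / 1024
    # Learnable parameters: weights + biases
    params = c_out * c_in * k * k + c_out
    # FLOPs: one multiply-add per (output element, filter element) pair
    flops = (c_out * h_out * h_out) * (c_in * k * k)
    return h_out, mem_kb, params, flops

# AlexNet conv1: 227x227x3 input, 64 filters of 11x11, stride 4, pad 2
print(conv2d_stats(227, 3, 64, k=11, stride=4, pad=2))
# -> (56, 784.0, 23296, 72855552)
```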

(figure)

The ReLU immediately following conv1 is omitted here.

For pooling:

  1. Pooling does not change the number of channels.
  2. FLOPs (M) = (number of output positions) x (FLOPs per output position) = (C_out x H_out x W_out) x (K x K) ≈ 0.4 MFLOP. Compared to conv, the computational cost of pooling is negligible (see the sketch below).
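A quick check of the pool1 numbers under the same conventions (AlexNet pool1 is assumed here: 64x56x56 input, 3x3 window, stride 2, no padding):

```python
# Max pooling keeps the channel count and has no learnable parameters.
c, h_in, k, stride = 64, 56, 3, 2
h_out = (h_in - k) // stride + 1              # (56 - 3) / 2 + 1 = 27
flops = (c * h_out * h_out) * (k * k)         # one op per element of each window
print(h_out, flops / 1e6)                     # -> 27 0.419904  (~0.4 MFLOP)
```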

(figure)

How was AlexNet designed? Trial and error.

(figure)

VGG

Design rules for VGG:

  1. All conv are 3x3 stride 1 pad 1
  2. All max pool are 2x2 stride 2
  3. After pool, double #channels
  4. Use convolutional stages; VGG-16 has 5 stages, where the bracketed [conv] marks the extra conv in VGG-19 (see the sketch after this list)
    1. Stage 1: conv-conv-pool
    2. Stage 2: conv-conv-pool
    3. Stage 3: conv-conv-conv-[conv]-pool
    4. Stage 4: conv-conv-conv-[conv]-pool
    5. Stage 5: conv-conv-conv-[conv]-pool
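Below is a minimal PyTorch-style sketch of the VGG-16 convolutional stages built from these rules (classifier head omitted; the helper name and layout are illustrative, not the reference implementation).

```python
import torch
import torch.nn as nn

def vgg_stage(c_in, c_out, num_convs):
    """One VGG stage: 3x3/s1/p1 convs with ReLU, followed by 2x2/s2 max pool."""
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(c_in if i == 0 else c_out, c_out,
                             kernel_size=3, stride=1, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

vgg16_features = nn.Sequential(
    vgg_stage(3,   64,  2),   # Stage 1: 224 -> 112
    vgg_stage(64,  128, 2),   # Stage 2: 112 -> 56, channels doubled
    vgg_stage(128, 256, 3),   # Stage 3: 56  -> 28, channels doubled
    vgg_stage(256, 512, 3),   # Stage 4: 28  -> 14, channels doubled
    vgg_stage(512, 512, 3),   # Stage 5: 14  -> 7, channels capped at 512
)

x = torch.randn(1, 3, 224, 224)
print(vgg16_features(x).shape)   # -> torch.Size([1, 512, 7, 7])
```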

Why conv 3x3

Option 1: conv(5x5, C->C)

Params: 25C^2; FLOPs: 25C^2HW

Option 2: conv(3x3, C->C) followed by conv(3x3, C->C)

Params: 18C^2; FLOPs: 18C^2HW. Same receptive field, fewer parameters, and less computation; moreover, with two convs we can insert a ReLU between them, which gives more depth and more nonlinear computation.
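A quick numeric check of this comparison (bias terms ignored; the choice C = 64 on a 56x56 feature map is arbitrary):

```python
# One 5x5 conv (C -> C) versus two stacked 3x3 convs (C -> C -> C)
C, H, W = 64, 56, 56

params_5x5 = 25 * C * C
flops_5x5  = 25 * C * C * H * W

params_two_3x3 = 2 * 9 * C * C
flops_two_3x3  = 2 * 9 * C * C * H * W

print(params_5x5, params_two_3x3)              # 102400 vs 73728
print(flops_5x5 / 1e6, flops_two_3x3 / 1e6)    # ~321.1 vs ~231.2 MFLOP
```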

Why double channels

(figure)

Note: the FLOPs shown in the figure are wrong.

Doubling the number of channels for the half-size input after pooling keeps the FLOPs constant, so the conv layers at each spatial resolution take the same amount of computation.
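A small check of this invariance for 3x3 convs (biases ignored; the channel and resolution values below are illustrative):

```python
def conv3x3_flops(c_in, c_out, h, w):
    # One multiply-add per (output element, filter element) pair
    return (c_out * h * w) * (c_in * 3 * 3)

print(conv3x3_flops(64, 64, 56, 56))      # stage i:   C channels at H x W
print(conv3x3_flops(128, 128, 28, 28))    # stage i+1: 2C channels at H/2 x W/2
# Both print 115605504, so each stage costs the same per conv layer.
```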

GoogLeNet: Inception Module

(figure)

A local unit with parallel branches that is repeated many times throughout the network.

Use 1x1 bottleneck layers to reduce the channel dimension before the expensive conv.
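A minimal sketch of the bottleneck idea on a single branch (the 256 -> 64 channel counts are illustrative, not GoogLeNet's exact configuration):

```python
import torch
import torch.nn as nn

bottleneck_branch = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1),             # 1x1 bottleneck: 256 -> 64 channels
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),   # the expensive 3x3 now sees only 64 channels
    nn.ReLU(inplace=True),
)

x = torch.randn(1, 256, 28, 28)
print(bottleneck_branch(x).shape)   # -> torch.Size([1, 64, 28, 28])
# The 3x3 cost drops from (256 x H x W) x (256 x 9) to (64 x H x W) x (64 x 9),
# a 16x reduction, at the price of a cheap 1x1 conv.
```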

ResNet

What happens when we go deeper?

(figure)

This is an optimization problem: deeper models are harder to optimize; in particular, they do not learn the identity functions that would let them emulate shallower models.

-> Change the network so that learning identity functions with the extra layers is easy.

(figure)

This block can now easily learn the identity function: if we set the weights of the two conv layers to zero, the block computes the identity, which makes it easier for a deep network to emulate a shallower one. It also improves the gradient flow of deep networks, because the add gate makes a copy of the upstream gradient and passes it through the shortcut.
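A minimal PyTorch sketch of a basic residual block in the stride-1, same-channel case (batch norm included as in ResNet; this is an illustrative sketch, not the reference implementation):

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """out = relu(F(x) + x): zeroing the convs gives F(x) = 0, so the block just passes x through the final ReLU."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # shortcut: add the input back in

x = torch.randn(1, 64, 56, 56)
print(BasicBlock(64)(x).shape)      # -> torch.Size([1, 64, 56, 56])
```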

Learn from VGG: stages, 3x3 conv

Learn from GoogLeNet: an aggressive stem to downsample the input before applying residual blocks, and global average pooling to avoid expensive FC layers.
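A minimal sketch of these two borrowed ideas, with ResNet-18-style sizes (7x7 stride-2 stem conv plus 3x3 stride-2 max pool to go from 224x224 to 56x56, then global average pooling and one small FC layer at the end):

```python
import torch
import torch.nn as nn

# Aggressive stem: downsample 4x before any residual block sees the input.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),   # 224 -> 112
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),                   # 112 -> 56
)

# Head: global average pooling collapses HxW, so only one small FC layer is needed.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),   # C x H x W -> C x 1 x 1
    nn.Flatten(),
    nn.Linear(512, 1000),      # 512 final channels -> 1000 class scores
)

x = torch.randn(1, 3, 224, 224)
print(stem(x).shape)   # -> torch.Size([1, 64, 56, 56])
```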

(figure)

