Lecture 9 | CNN Architectures
Review: LeNet-5
AlexNet (2012): the first CNN-based ILSVRC winner
Input: 227x227x3 images
First layer (CONV1): 96 11x11 filters applied at stride 4
Q: what is the output volume size? Hint: (227-11)/4+1 = 55
Output volume [55x55x96]
Q: What is the total number of parameters in this layer?
Parameters: (11*11*3)*96 = 35K
Second layer (POOL1): 3x3 filters applied at stride 2
Q: what is the output volume size? Hint: (55-3)/2+1 = 27
Output volume: 27x27x96
Q: what is the number of parameters in this layer?
Parameters: 0!
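A quick sanity check of the arithmetic above, as a minimal Python sketch (using the standard output-size formula (W - F + 2P)/S + 1, with no padding here):

```python
def conv_output_size(w, f, stride, pad=0):
    # Spatial output size of a conv/pool layer: (W - F + 2P) / S + 1
    return (w - f + 2 * pad) // stride + 1

# CONV1: 96 filters of 11x11 at stride 4 on a 227x227x3 input
print(conv_output_size(227, 11, 4))   # 55 -> output volume 55x55x96
print((11 * 11 * 3) * 96)             # 34848 weights (~35K), plus 96 biases

# POOL1: 3x3 filters at stride 2 on 55x55x96 -- pooling has no parameters
print(conv_output_size(55, 3, 2))     # 27 -> output volume 27x27x96
```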
Historical note: Trained on GTX 580 GPU with only 3 GB of memory.
Network spread across 2 GPUs, half the neurons (feature maps) on each GPU.
ImageNet Large Scale Visual Recognition Challenge (ILSVRC) winners
ZFNet (2013)
VGGNet (2014)
Deeper networks with smaller filters
Q: Why use smaller filters? (3x3 conv)
A stack of three 3x3 conv (stride 1) layers has the same effective receptive field as a single 7x7 conv layer,
but is deeper, has more non-linearities, and has fewer parameters: 3 * (3^2 * C^2) = 27C^2 vs. 7^2 * C^2 = 49C^2, for C channels per layer.
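Why the receptive fields match: each extra 3x3 stride-1 conv grows the effective receptive field by 2, so three stacked layers see 3x3, then 5x5, then 7x7 of the input. A minimal check of both claims:

```python
# Effective receptive field of n stacked 3x3 stride-1 convs: 1 + 2n
for n in (1, 2, 3):
    size = 1 + 2 * n
    print(n, "layer(s):", size, "x", size)   # 3x3, 5x5, 7x7

# Weight counts for C input and C output channels per layer (biases ignored)
C = 256
print(3 * (3 * 3 * C * C))   # three 3x3 layers: 27*C^2 = 1769472
print(7 * 7 * C * C)         # one 7x7 layer:    49*C^2 = 3211264
```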
Architecture
VGG16 & VGG19
GoogLeNet
“Inception module”
Naive Inception module
Q: What is the problem with this? [Hint: Computational complexity]
Module input: 28x28x256
Q1: What is the output size of the 1x1 conv with 128 filters?
A1: 28x28x128 (spatial size is preserved; depth equals the number of filters)
Q2: What are the output sizes of all the different filter operations?
A2: 28x28x128, 28x28x192, 28x28x96, and 28x28x256 (pooling preserves input depth)
Q3: What is the output size after filter concatenation?
A3: 28x28x(128+192+96+256) = 28x28x672
Conv Ops:
[1x1 conv, 128] 28x28x128x1x1x256
[3x3 conv, 192] 28x28x192x3x3x256
[5x5 conv, 96] 28x28x96x5x5x256
Total: 854M ops
Very expensive to compute. The pooling branch also preserves the full input depth, so the total depth after concatenation can only grow at every layer!
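Each count above is (output height x width) x (number of filters) x (filter height x width x input depth). A quick Python check of the total (multiply-accumulate counts; the parameter-free pooling branch is excluded):

```python
def conv_ops(out_hw, num_filters, f, in_depth):
    # Each of the out_hw x out_hw x num_filters outputs needs f*f*in_depth multiply-adds
    return out_hw * out_hw * num_filters * f * f * in_depth

total = (conv_ops(28, 128, 1, 256)    # 1x1 conv, 128 filters: ~26M
       + conv_ops(28, 192, 3, 256)    # 3x3 conv, 192 filters: ~347M
       + conv_ops(28,  96, 5, 256))   # 5x5 conv,  96 filters: ~482M
print(total)  # 854196224 -> ~854M ops
```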
Solution: “bottleneck” layers that use 1x1 convolutions to reduce feature depth
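A minimal PyTorch sketch of a bottlenecked Inception-style module (the filter counts here are illustrative, not GoogLeNet's exact configuration): 1x1 convs shrink the 256-channel input to 64 channels before the expensive 3x3 and 5x5 convs, and a 1x1 conv after the pooling branch caps its depth:

```python
import torch
import torch.nn as nn

class BottleneckedInception(nn.Module):
    """Inception-style module with 1x1 bottlenecks (illustrative filter counts)."""
    def __init__(self, in_ch=256):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 128, kernel_size=1)   # plain 1x1 branch
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, 64, kernel_size=1),              # bottleneck: 256 -> 64 channels
            nn.Conv2d(64, 192, kernel_size=3, padding=1),
        )
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, 64, kernel_size=1),              # bottleneck: 256 -> 64 channels
            nn.Conv2d(64, 96, kernel_size=5, padding=2),
        )
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, 64, kernel_size=1),              # 1x1 after pool caps the depth
        )

    def forward(self, x):
        # Concatenate all branch outputs along the channel dimension
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

x = torch.randn(1, 256, 28, 28)
print(BottleneckedInception()(x).shape)  # torch.Size([1, 480, 28, 28])
```

With these counts the 3x3 and 5x5 convs now see 64 input channels instead of 256, which is where the large op-count savings come from.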
Case Study: GoogLeNet
Revolution of Depth: ResNet (152-layer model, 2015)
Architecture
What happens when we continue stacking deeper layers on a “plain” convolutional neural network?
Q: What’s strange about these training and test curves?
[Hint: look at the order of the curves]
A: The deeper (56-layer) network has higher training error as well as higher test error than the shallower (20-layer) one, so its worse performance is not overfitting; deep plain networks are simply harder to optimize.
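ResNet's fix is to let each block learn a residual F(x) and output F(x) + x. A minimal PyTorch sketch of a basic residual block, assuming an identity shortcut (stride 1, matching channel counts):

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Basic residual block (a sketch, assuming an identity shortcut)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Learn the residual F(x); the shortcut adds x back, so the block outputs F(x) + x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)

x = torch.randn(1, 64, 56, 56)
print(BasicResidualBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```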
Experiment
Improved ResNet
Wide Residual Networks
Aggregated Residual Transformations for Deep Neural Networks (ResNeXt)
Deep Networks with Stochastic Depth
FractalNet: Ultra-Deep Neural Networks without Residuals
Densely Connected Convolutional Networks