Understanding the Caffe Framework (2): A Detailed Look at AlexNet

Introduction

In 2012, Geoffrey Hinton and his student Alex Krizhevsky answered the skeptics by using AlexNet to shatter the image classification record in the ImageNet competition, establishing the standing of deep learning in computer vision. Here we use an analysis of this model to study the structure of Caffe.

The Structure of the AlexNet Model

The model file is models/bvlc_reference_caffenet/deploy.prototxt under the Caffe root directory (the commands below use a local copy named deploy-gph.prototxt); its contents are reproduced in Appendix 1. A diagram of the model can be generated with draw_net.py:
python python/draw_net.py models/bvlc_reference_caffenet/deploy-gph.prototxt examples/AlexNet-gph/pic/alexnet.png --rankdir=TB --phase=ALL
The resulting image is shown in Appendix 2.

Layer-by-Layer Analysis of the Model

1. The data structures of each layer

Enter the following in a terminal to prepare the environment:

gph@gph-pc:~ $ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import caffe
>>> import cv2
>>> import cv2.cv as cv
>>> caffe.set_mode_gpu()
>>> caffe_root = '/home/gph/Desktop/caffe-ssd/'
>>> model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy-gph.prototxt'
>>> model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
>>> img_file = caffe_root + 'examples/images/cat.jpg'
>>> 

Load the model:

>>> net = caffe.Net(model_def, model_weights, caffe.TEST)

Display every layer together with the shapes of its data and diff blobs.

Enter:

>>> for layer, blob in net.blobs.iteritems():
...     print layer + ' ' + str(blob.data.shape) + ' ' + str(blob.diff.shape)
... 

The output is:

data (10, 3, 227, 227) (10, 3, 227, 227)
conv1 (10, 96, 55, 55) (10, 96, 55, 55)
pool1 (10, 96, 27, 27) (10, 96, 27, 27)
norm1 (10, 96, 27, 27) (10, 96, 27, 27)
conv2 (10, 256, 27, 27) (10, 256, 27, 27)
pool2 (10, 256, 13, 13) (10, 256, 13, 13)
norm2 (10, 256, 13, 13) (10, 256, 13, 13)
conv3 (10, 384, 13, 13) (10, 384, 13, 13)
conv4 (10, 384, 13, 13) (10, 384, 13, 13)
conv5 (10, 256, 13, 13) (10, 256, 13, 13)
pool5 (10, 256, 6, 6) (10, 256, 6, 6)
fc6 (10, 4096) (10, 4096)
fc7 (10, 4096) (10, 4096)
fc8 (10, 1000) (10, 1000)
prob (10, 1000) (10, 1000)
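
The batch size of 10 in every shape comes from the first dim of input_param in deploy.prototxt, and these blobs are only filled once a forward pass runs. As a minimal sketch, continuing the session above, we can push the cat image through the net with cv2; note that this skips the usual mean subtraction for brevity, which a real deployment should not (the class probabilities will be off):

>>> img = cv2.imread(img_file)                       # HxWx3, BGR order, as CaffeNet expects
>>> img = cv2.resize(img, (227, 227)).astype(np.float32)
>>> img = img.transpose(2, 0, 1)                     # reorder to CxHxW
>>> net.blobs['data'].data[...] = img                # broadcast over the batch of 10
>>> out = net.forward()
>>> out['prob'].shape
(10, 1000)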

Display the layers that carry weights.

Enter:

>>> for layer, param in net.params.iteritems():
...     print layer + ' ' + str(param[0].data.shape) + ' ' + str(param[1].data.shape)
... 

The output is:

conv1 (96, 3, 11, 11) (96,)
conv2 (256, 48, 5, 5) (256,)
conv3 (384, 256, 3, 3) (384,)
conv4 (384, 192, 3, 3) (384,)
conv5 (256, 192, 3, 3) (256,)
fc6 (4096, 9216) (4096,)
fc7 (4096, 4096) (4096,)
fc8 (1000, 4096) (1000,)

2. Analysis

Two kinds of data flow through a Caffe network:
One is the data being processed: it enters at the input layer, is transformed by each layer in turn, and emerges from the output layer. This data lives in the data field of the blobs in net.blobs; each blob's diff field holds the corresponding gradients. These two arrays are the ones we usually care about.
The other kind is the parameters each layer uses in its computation, namely the weights and the bias terms; for each layer with parameters, these are stored in param[0] and param[1] of its entry in net.params.
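
Both kinds are easy to inspect interactively; a small sketch using the net loaded above:

>>> net.blobs['conv1'].data.shape        # activations flowing through the net
(10, 96, 55, 55)
>>> net.blobs['conv1'].diff.shape        # gradients, always the same shape as data
(10, 96, 55, 55)
>>> net.params['conv1'][0].data.shape    # weights of conv1
(96, 3, 11, 11)
>>> net.params['conv1'][1].data.shape    # biases of conv1
(96,)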

In the AlexNet model, the layers that slide a kernel over their input may change the spatial size of the data. Convolution and pooling layers both take a kernel_size parameter, so both can change the size; whether they do depends on the parameters kernel_size, pad, and stride. The output size is

y = (x + 2*pad - kernel_size) / stride + 1

where x is the input size.
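
Plugging the numbers from deploy.prototxt into this formula reproduces the shapes printed earlier. A quick check (integer division, as in Caffe's convolution layer; Caffe's pooling layer actually rounds up rather than down, but the AlexNet sizes happen to divide evenly, so the same formula works here):

>>> def out_size(x, kernel_size, pad=0, stride=1):
...     return (x + 2 * pad - kernel_size) // stride + 1
... 
>>> out_size(227, 11, stride=4)    # data -> conv1
55
>>> out_size(55, 3, stride=2)      # conv1 -> pool1
27
>>> out_size(27, 5, pad=2)         # norm1 -> conv2: the padding preserves the size
27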

The shape of the weights stored by a convolution layer is determined by that layer's kernel_size, group, and num_output together with the previous layer's number of output channels: (this layer's num_output, previous layer's channels / group, kernel_h, kernel_w). The reason lies in how the convolution is actually carried out.
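
For example, conv2 declares num_output: 256, kernel_size: 5 and group: 2, and its input norm1 has 96 channels; with group: 2 each filter only sees 96 / 2 = 48 input channels, which gives the (256, 48, 5, 5) shape listed above. A small helper (hypothetical, just to verify the rule against the shapes printed earlier):

>>> def conv_weight_shape(num_output, prev_channels, kernel_size, group=1):
...     return (num_output, prev_channels // group, kernel_size, kernel_size)
... 
>>> conv_weight_shape(96, 3, 11)              # conv1
(96, 3, 11, 11)
>>> conv_weight_shape(256, 96, 5, group=2)    # conv2
(256, 48, 5, 5)
>>> conv_weight_shape(384, 384, 3, group=2)   # conv4
(384, 192, 3, 3)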

The weights of a fully connected layer are much simpler. Because every input connects to every output, the parameter shape depends only on the number of input neurons and the number of output neurons: (num_output, number of neurons in the previous layer).
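
fc6 illustrates this: pool5 outputs (10, 256, 6, 6), which the InnerProduct layer flattens to 256 * 6 * 6 = 9216 inputs per image, so its weight matrix is (4096, 9216):

>>> 256 * 6 * 6                        # pool5, flattened per image
9216
>>> net.params['fc6'][0].data.shape
(4096, 9216)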

Appendix 1: contents of deploy.prototxt

name: "CaffeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 1000
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}

Appendix 2: the AlexNet model

[Figure: the AlexNet network structure drawn by draw_net.py]
