AlexNet

EricMachineLearning

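I. Introduction to AlexNet

AlexNet is one of the earlier convolutional neural networks. Its strong showing in the ImageNet competition (top-1 and top-5 error rates of 37.5% and 17.0% on ILSVRC-2010) set off a wave of deep learning research in academia. This post gives a brief summary of AlexNet based on the AlexNet paper. If anything is missing or wrong, corrections are welcome.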

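II. AlexNet network structure

Figure: AlexNet architecture

AlexNet has 8 learned layers: the first 5 are convolutional and the last 3 are fully connected. It has about 60 million learnable parameters and about 650,000 neurons.
AlexNet was trained on two GPUs in parallel.
As the figure shows, the kernels of layers 2, 4, and 5 connect only to the feature maps of the previous layer that sit on the same GPU, the kernels of layer 3 connect to all feature maps of layer 2 on both GPUs, and the fully connected layers connect across both GPUs.
LRN (local response normalization) layers follow the 1st and 2nd convolutional layers.
Max pooling layers follow the LRN layers and the 5th convolutional layer.
ReLU is applied after every convolutional layer and every fully connected layer.
Number and size of the convolution kernels (the kernel depths show that only the third layer takes input from both GPUs: its depth of 256 covers all of conv2's output, while conv2, conv4, and conv5 see only the half held on their own GPU):

conv1: 96 kernels of 11*11*3
conv2: 256 kernels of 5*5*48
conv3: 384 kernels of 3*3*256
conv4: 384 kernels of 3*3*192
conv5: 256 kernels of 3*3*192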
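
To get a feel for where the parameters sit, the kernel shapes listed above can be multiplied out. The snippet below is only a rough sketch: the names `CONV_KERNELS` and `conv_weight_counts` are made up for illustration, and biases are left out.

```python
# (layer name, number of kernels, kernel height, kernel width, kernel depth),
# copied from the kernel list above.
CONV_KERNELS = [
    ("conv1", 96, 11, 11, 3),
    ("conv2", 256, 5, 5, 48),
    ("conv3", 384, 3, 3, 256),
    ("conv4", 384, 3, 3, 192),
    ("conv5", 256, 3, 3, 192),
]


def conv_weight_counts(kernels):
    """Return {layer name: number of weights}; bias terms are not counted."""
    return {name: n * h * w * d for name, n, h, w, d in kernels}


counts = conv_weight_counts(CONV_KERNELS)
total_conv_weights = sum(counts.values())

for name, count in counts.items():
    print(f"{name}: {count:,} weights")
print(f"all conv layers: {total_conv_weights:,} weights")
```

With these shapes the five convolutional layers come to roughly 2.3 million weights, a small share of the 60 million parameters quoted above, so most of the parameters must sit in the fully connected layers.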

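III. AlexNet data processing

AlexNet is trained mainly on the ILSVRC-2010 dataset, a subset of ImageNet with 1000 classes: about 1.2 million training images, 50,000 validation images, and 150,000 test images.
AlexNet processes the data as follows:

Preliminary processing
- Rescale images of different resolutions to 256*256: ImageNet images come in many resolutions, but the network needs a fixed input size (because of the fully connected layers). AlexNet therefore rescales each image so that its shorter side is 256 and then crops the central 256*256 patch as training data.
- Mean subtraction: the mean is computed over the training data, separately for each of the R, G, and B components.
Training data processing
- Randomly crop 224*224 patches from each 256*256 image (224 is the size given in the paper; the Caffe configuration at the end of this post uses crop_size: 227)
- Take horizontal reflections
- Together these enlarge the original data by a factor of (256-224)*(256-224)*2 = 2048
- Run PCA on the RGB pixel values, then perturb the principal components with Gaussian noise of mean 0 and standard deviation 0.1, which lowers the top-1 error rate by over 1%. The quantity added to each RGB pixel is given below, where $\mathbf{p}_i$ and $\lambda_i$ are the eigenvectors and eigenvalues of the 3*3 RGB covariance matrix and are computed only once, and each $\alpha_i$ is the Gaussian random variable.
  $[\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3][\alpha_1 \lambda_1, \alpha_2 \lambda_2, \alpha_3 \lambda_3]^T$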
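
The PCA color perturbation above can be sketched in a few lines. This is an illustration under assumptions, not the paper's code: `pca_color_offset` is a made-up helper, and the eigenvectors and eigenvalues in `EXAMPLE_EIGVECS` and `EXAMPLE_EIGVALS` are placeholder numbers, not values measured on ImageNet.

```python
import random


def pca_color_offset(eigvecs, eigvals, rng, sigma=0.1):
    """Return the [dR, dG, dB] offset [p1, p2, p3] [a1*l1, a2*l2, a3*l3]^T.

    eigvecs: the three eigenvectors p1..p3 (each a length-3 list) of the
             RGB covariance matrix; eigvals: the matching eigenvalues.
    """
    # One Gaussian draw (mean 0, standard deviation sigma) per component.
    alphas = [rng.gauss(0.0, sigma) for _ in eigvals]
    scaled = [a * lam for a, lam in zip(alphas, eigvals)]
    # Column i of the matrix [p1, p2, p3] is eigvecs[i], so channel c
    # receives sum_i eigvecs[i][c] * scaled[i].
    return [sum(eigvecs[i][c] * scaled[i] for i in range(3)) for c in range(3)]


# Placeholder values for illustration only (assumed, not from the paper).
EXAMPLE_EIGVECS = [[0.577, 0.577, 0.577], [0.707, 0.0, -0.707], [0.408, -0.816, 0.408]]
EXAMPLE_EIGVALS = [0.2, 0.02, 0.005]

rng = random.Random(0)
offset = pca_color_offset(EXAMPLE_EIGVECS, EXAMPLE_EIGVALS, rng)
pixel = [0.5, 0.4, 0.3]
perturbed_pixel = [value + delta for value, delta in zip(pixel, offset)]
print(perturbed_pixel)
```

The same offset is added to every pixel of one image, so it would be drawn once per image rather than once per pixel.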
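Test data processing
- Extract five 224*224 patches (the four corners and the center) plus their horizontal reflections, 10 patches in total, run each through the network's softmax, and average the 10 predictions to get the final classification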
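
The crop arithmetic above is easy to check in code. The sketch below uses made-up helper names (`augmentation_factor`, `ten_crop_boxes`, `average_predictions`) and assumes a 256*256 image with 224*224 crops, as in the text.

```python
def augmentation_factor(image_size=256, crop_size=224):
    """Crop offsets per axis, squared, times 2 for mirroring (the count used above)."""
    offsets = image_size - crop_size
    return offsets * offsets * 2


def ten_crop_boxes(image_size=256, crop_size=224):
    """Return 10 (left, top, mirrored) tuples: 4 corners + center, plain and mirrored."""
    far = image_size - crop_size
    mid = far // 2
    origins = [(0, 0), (far, 0), (0, far), (far, far), (mid, mid)]
    return [(left, top, mirrored) for mirrored in (False, True) for left, top in origins]


def average_predictions(predictions):
    """Average equal-length softmax vectors element-wise."""
    count = len(predictions)
    return [sum(column) / count for column in zip(*predictions)]


print(augmentation_factor())
for box in ten_crop_boxes():
    print(box)
```

At test time each of the 10 boxes would be cropped (and mirrored where flagged), passed through the network, and the resulting softmax vectors handed to `average_predictions`.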

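IV. AlexNet design choices

ReLU activation function
- Trains faster and gives better results
- Does not need input normalization to keep the units from saturating
- Well suited to training large network models
LRN layers (rarely used in later networks)
- Placed after the ReLU layer.
- Improve generalization through lateral inhibition.
- Reduce the top-1 and top-5 error rates by 1.4% and 1.2% respectively.
  The formula is as follows, where $a^i_{x,y}$ is the ReLU output of kernel $i$ at position $(x, y)$, $N$ is the number of kernels in the layer, $n$ is the number of adjacent kernel maps summed over, and $k$, $\alpha$, $\beta$ are hyperparameters:
  $b^i_{x,y} = a^i_{x,y} / \left(k + \alpha \sum_{j=\max(0, i-n/2)}^{\min(N-1, i+n/2)} (a^j_{x,y})^2\right)^\beta$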
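
As a sanity check on the LRN formula above, here is a minimal sketch that normalizes the channel activations at a single spatial position. The function name `lrn` is made up. The defaults n=5, alpha=1e-4, beta=0.75 match the Caffe configuration at the end of this post, and k=2 is the value I recall from the paper. Caffe's own LRN layer may scale the terms differently, so do not treat this as a drop-in equivalent.

```python
def lrn(activations, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Apply local response normalization across channels at one (x, y).

    activations[i] plays the role of a^i_{x,y}; the return value holds b^i_{x,y}.
    """
    num_channels = len(activations)
    half = n // 2
    normalized = []
    for i, a in enumerate(activations):
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        # Sum of squares over the neighbouring channels, clipped at the edges.
        square_sum = sum(activations[j] ** 2 for j in range(lo, hi + 1))
        normalized.append(a / (k + alpha * square_sum) ** beta)
    return normalized


print(lrn([0.0, 1.0, 10.0, 100.0, 10.0, 1.0, 0.0]))
```

A channel surrounded by strong responses has a larger denominator, which is the lateral inhibition mentioned above.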
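Overlapping pooling

Reduces the top-1 and top-5 error rates by 0.4% and 0.3%
Dropout
- Randomly switches off some of the neurons to reduce overfitting
- Increases the number of training iterations needed to converge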
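
The dropout idea above can be sketched as follows. This follows the scheme as I understand it from the paper (zero each unit with probability 0.5 during training, multiply outputs by 0.5 at test time). The function name `dropout` and its signature are made up for illustration, and many frameworks rescale during training instead.

```python
import random


def dropout(values, training, rng, drop_prob=0.5):
    """Zero units at random while training; scale all units at test time."""
    if training:
        # Each unit is switched off independently with probability drop_prob.
        return [0.0 if rng.random() < drop_prob else v for v in values]
    # At test time every unit is kept, scaled by the keep probability.
    return [v * (1.0 - drop_prob) for v in values]


rng = random.Random(0)
hidden = [0.2, 1.5, 0.7, 3.0]
print(dropout(hidden, training=True, rng=rng))
print(dropout(hidden, training=False, rng=rng))
```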
Parallel computation on 2 GPUs

V. Network hyperparameters

Stochastic gradient descent
batch size = 128
base_lr = 0.01
momentum = 0.9
weight decay = 0.0005
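
To show how these numbers fit together, here is a small sketch of one momentum SGD step for a single scalar weight, written after the update rule as I recall it from the paper: v <- 0.9*v - 0.0005*lr*w - lr*grad, then w <- w + v. The helper name `sgd_momentum_step` is made up, and `grad` stands for the gradient averaged over one batch.

```python
def sgd_momentum_step(w, v, grad, lr=0.01, momentum=0.9, weight_decay=0.0005):
    """Return the updated (weight, velocity) pair for one scalar weight."""
    v_next = momentum * v - weight_decay * lr * w - lr * grad
    return w + v_next, v_next


weight, velocity = 1.0, 0.0
for step in range(3):
    # A constant gradient of 0.5 is used here purely for illustration.
    weight, velocity = sgd_momentum_step(weight, velocity, grad=0.5)
    print(step, weight, velocity)
```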

Learning rate schedule:
lr_policy: "step"
gamma: 0.1
stepsize: 100000
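
With the "step" policy, the learning rate is multiplied by gamma once every stepsize iterations. A minimal sketch using the values above (the function name `step_lr` is made up):

```python
def step_lr(iteration, base_lr=0.01, gamma=0.1, stepsize=100000):
    """Learning rate under the step policy: base_lr * gamma^floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


for iteration in (0, 99999, 100000, 250000, 449999):
    print(iteration, step_lr(iteration))
```

So the rate stays at 0.01 for the first 100,000 iterations and then drops by a factor of 10 at each further multiple of 100,000.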

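Parameter initialization: weights are drawn from a Gaussian with mean 0 and standard deviation 0.01; biases are set to a constant 1 (in the paper this applies to the 2nd, 4th, and 5th convolutional layers and the fully connected hidden layers, and the remaining biases are set to 0)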

VI. Results

Results on the ILSVRC-2010 dataset (error rates):

Model	Top-1	Top-5
Sparse coding	47.1%	28.2%
SIFT+FVs	45.7%	25.7%
CNN	37.5%	17.0%

Results on the ILSVRC-2012 dataset (error rates):

5 CNNs is the result of averaging the predictions of 5 CNNs
CNN* is a model first trained on the ImageNet 2011 dataset and then fine-tuned on the 2012 dataset

Model	Top-1 (val)	Top-5 (val)	Top-5 (test)
1 CNN	40.7%	18.2%	-
5 CNNs	38.1%	16.4%	16.4%
1 CNN*	39.0%	16.6%	-
7 CNNs*	36.7%	15.4%	15.3%

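Qualitative analysis

The qualitative analysis shows that the two GPUs learn kernels with different characteristics: those on one GPU are largely insensitive to color, while those on the other are color-specific
Network depth matters: removing any single layer lowers accuracy
Objects near the edges or corners of an image can still be recognized
If the feature vectors of two images are close in Euclidean distance, the network considers the two images similar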
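
The last point can be turned into a tiny retrieval sketch: given feature vectors taken from the network, the most similar image is the one at the smallest Euclidean distance. The helper names and the three-dimensional toy vectors below are made up for illustration. Real feature vectors would come from a late hidden layer and be much longer.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest(query, gallery):
    """Return the name in `gallery` whose feature vector is closest to `query`."""
    return min(gallery, key=lambda name: euclidean_distance(query, gallery[name]))


# Toy feature vectors (assumed values, not real network outputs).
gallery = {
    "dog_1": [0.9, 0.1, 0.0],
    "dog_2": [0.8, 0.2, 0.1],
    "ship_1": [0.0, 0.1, 0.9],
}
print(nearest([0.85, 0.15, 0.05], gallery))
```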

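Finally, here is the Caffe configuration for AlexNet:

Network visualization link: 1. copy the network definition, 2. paste it into the linked page

1. Hyperparameters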
net: "models/bvlc_alexnet/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/bvlc_alexnet/caffe_alexnet_train"
solver_mode: GPU
   

2. AlexNet network structure
name: "AlexNet"
layer {
  name: "data" type: "Data" top: "data" top: "label" include { phase: TRAIN }
  transform_param {
    mirror: true crop_size: 227 mean_file: "data/ilsvrc12/imagenet_mean.binaryproto" }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb" batch_size: 256 backend: LMDB }
}
layer {
  name: "data" type: "Data" top: "data" top: "label" include { phase: TEST }
  transform_param {
    mirror: false crop_size: 227 mean_file: "data/ilsvrc12/imagenet_mean.binaryproto" }
  data_param {
    source: "examples/imagenet/ilsvrc12_val_lmdb" batch_size: 50 backend: LMDB }
}
layer {
  name: "conv1" type: "Convolution" bottom: "data" top: "conv1" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96 kernel_size: 11 stride: 4 weight_filler { type: "gaussian" std: 0.01 }
    bias_filler {
      type: "constant" value: 0 }
  }
}
layer {
  name: "relu1" type: "ReLU" bottom: "conv1" top: "conv1" }
layer {
  name: "norm1" type: "LRN" bottom: "conv1" top: "norm1" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "pool1" type: "Pooling" bottom: "norm1" top: "pool1" pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "conv2" type: "Convolution" bottom: "pool1" top: "conv2" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256 pad: 2 kernel_size: 5 group: 2 weight_filler { type: "gaussian" std: 0.01 }
    bias_filler {
      type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu2" type: "ReLU" bottom: "conv2" top: "conv2" }
layer {
  name: "norm2" type: "LRN" bottom: "conv2" top: "norm2" lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "pool2" type: "Pooling" bottom: "norm2" top: "pool2" pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "conv3" type: "Convolution" bottom: "pool2" top: "conv3" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384 pad: 1 kernel_size: 3 weight_filler { type: "gaussian" std: 0.01 }
    bias_filler {
      type: "constant" value: 0 }
  }
}
layer {
  name: "relu3" type: "ReLU" bottom: "conv3" top: "conv3" }
layer {
  name: "conv4" type: "Convolution" bottom: "conv3" top: "conv4" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 384 pad: 1 kernel_size: 3 group: 2 weight_filler { type: "gaussian" std: 0.01 }
    bias_filler {
      type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu4" type: "ReLU" bottom: "conv4" top: "conv4" }
layer {
  name: "conv5" type: "Convolution" bottom: "conv4" top: "conv5" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 256 pad: 1 kernel_size: 3 group: 2 weight_filler { type: "gaussian" std: 0.01 }
    bias_filler {
      type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu5" type: "ReLU" bottom: "conv5" top: "conv5" }
layer {
  name: "pool5" type: "Pooling" bottom: "conv5" top: "pool5" pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "fc6" type: "InnerProduct" bottom: "pool5" top: "fc6" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096 weight_filler { type: "gaussian" std: 0.005 }
    bias_filler {
      type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu6" type: "ReLU" bottom: "fc6" top: "fc6" }
layer {
  name: "drop6" type: "Dropout" bottom: "fc6" top: "fc6" dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7" type: "InnerProduct" bottom: "fc6" top: "fc7" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096 weight_filler { type: "gaussian" std: 0.005 }
    bias_filler {
      type: "constant" value: 0.1 }
  }
}
layer {
  name: "relu7" type: "ReLU" bottom: "fc7" top: "fc7" }
layer {
  name: "drop7" type: "Dropout" bottom: "fc7" top: "fc7" dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8" type: "InnerProduct" bottom: "fc7" top: "fc8" param { lr_mult: 1 decay_mult: 1 }
  param {
    lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 1000 weight_filler { type: "gaussian" std: 0.01 }
    bias_filler {
      type: "constant" value: 0 }
  }
}
layer {
  name: "accuracy" type: "Accuracy" bottom: "fc8" bottom: "label" top: "accuracy" include { phase: TEST }
}
layer {
  name: "loss" type: "SoftmaxWithLoss" bottom: "fc8" bottom: "label" top: "loss" }
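
To double-check the network definition above, the spatial size of each feature map can be traced from the kernel_size, stride, and pad values in the configuration, starting from the 227*227 crop. This is only a sketch: `out_size`, `LAYERS`, and `trace_sizes` are made-up names, and it rounds down everywhere. As far as I know Caffe's pooling rounds up instead, but every division here is exact, so the result should be the same.

```python
def out_size(in_size, kernel, stride=1, pad=0):
    """Output width/height of a convolution or pooling layer (rounding down)."""
    return (in_size + 2 * pad - kernel) // stride + 1


# (layer name, kernel_size, stride, pad) read off the configuration above.
# ReLU, LRN, and Dropout layers are omitted because they keep the size unchanged.
LAYERS = [
    ("conv1", 11, 4, 0),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool5", 3, 2, 0),
]


def trace_sizes(input_size=227, layers=LAYERS):
    """Return {layer name: output width/height} for a square input."""
    sizes = {}
    size = input_size
    for name, kernel, stride, pad in layers:
        size = out_size(size, kernel, stride, pad)
        sizes[name] = size
    return sizes


sizes = trace_sizes()
for name, size in sizes.items():
    print(f"{name}: {size}*{size}")
# pool5 has 256 channels, so fc6 sees 256 * size * size inputs.
fc6_inputs = 256 * sizes["pool5"] ** 2
print("fc6 inputs:", fc6_inputs)
```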

   