AlexNet
Krizhevsky et al., "ImageNet Classification with Deep Convolutional Neural Networks", NIPS 2012
Motivation (Why)
Existing models were not accurate enough on the ImageNet image-classification challenge; the goal was simply to push the benchmark score higher.
Method (How)
- 8-layer convolutional neural network (5 convolutional + 3 fully connected layers)
- ReLU activations, which train much faster than saturating nonlinearities such as sigmoid
- Data augmentation: perturb RGB values along the PCA principal components of the pixel data, scaled by random coefficients, to enlarge the training set
- Local Response Normalization (LRN), applied across adjacent channels within the same layer; the later VGG paper found it gives no real benefit and only adds computation.
- Against overfitting: overlapping pooling. Later networks dropped this, further evidence that hyperparameter-tuning heuristics for shallow networks do not transfer to deep ones.
- Against overfitting: dropout with p = 0.5. During training half of the neurons are randomly deactivated; at test time all neurons are used, but their outputs are multiplied by 0.5.
- Minibatch size 128; parameters are updated with the gradient averaged over each batch
- SGD with momentum 0.9
- Biases of some convolutional layers and the fully connected layers initialized to 1 to encourage positive ReLU activations early on; remaining biases 0; all weights drawn from a zero-mean Gaussian with standard deviation 0.01
- Learning rate 1e-2, divided by 10 when validation error stops improving
- L2 weight decay 5e-4
- At test time: an ensemble of 7 CNNs with predictions averaged
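The PCA color augmentation above can be sketched in numpy as follows (the function name and the per-image covariance are my simplifications; the paper computes the RGB covariance once over the whole training set and draws each coefficient from N(0, 0.1)):

```python
import numpy as np

def pca_color_augment(image, alpha_std=0.1, rng=None):
    """Add alpha_i * lambda_i * p_i to every pixel, where p_i / lambda_i
    are eigenvectors / eigenvalues of the 3x3 RGB covariance and each
    alpha_i ~ N(0, alpha_std^2) is drawn once per image (a sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    pixels = image.reshape(-1, 3)
    cov = np.cov(pixels, rowvar=False)        # 3x3 RGB covariance
    eigvals, eigvecs = np.linalg.eigh(cov)    # columns of eigvecs are p_i
    alphas = rng.normal(0.0, alpha_std, size=3)
    shift = eigvecs @ (alphas * eigvals)      # sum_i alpha_i * lambda_i * p_i
    return image + shift                      # same shift for every pixel
```

Because the shift lies along the directions of greatest color variation, this changes the overall illumination/color cast of the image without altering object identity.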
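Local Response Normalization divides each activation by a factor computed from squared activations in neighboring channels; a minimal sketch with the paper's hyperparameters (k = 2, n = 5, alpha = 1e-4, beta = 0.75):

```python
import numpy as np

def local_response_norm(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """LRN across channels for an array of shape (C, H, W) (a sketch).

    b[i] = a[i] / (k + alpha * sum_{j near i} a[j]^2) ** beta,
    where the sum runs over up to n adjacent channels centered at i."""
    C = a.shape[0]
    sq = a ** 2
    half = n // 2
    out = np.empty_like(a)
    for i in range(C):
        lo, hi = max(0, i - half), min(C, i + half + 1)
        denom = (k + alpha * sq[lo:hi].sum(axis=0)) ** beta
        out[i] = a[i] / denom
    return out
```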
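The dropout scheme above (mask at training time, scale by the keep probability at test time, rather than the inverted-dropout variant used today) can be sketched as:

```python
import numpy as np

def dropout(x, p=0.5, train=True, rng=None):
    """AlexNet-style dropout (a sketch): drop each unit with probability
    p during training; at test time keep all units and scale by 1 - p."""
    if train:
        rng = np.random.default_rng() if rng is None else rng
        mask = rng.random(x.shape) >= p   # keep each unit with prob. 1 - p
        return x * mask
    return x * (1.0 - p)                  # test time: scale by keep prob.
```

With p = 0.5 the test-time scaling is exactly the factor 0.5 mentioned above, so the expected activation matches between training and testing.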
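The optimizer hyperparameters above combine into the paper's single update rule, v ← 0.9·v − 0.0005·lr·w − lr·∇L, then w ← w + v; a sketch (function name is mine):

```python
import numpy as np

def sgd_step(w, v, grad, lr=1e-2, momentum=0.9, weight_decay=5e-4):
    """One AlexNet-style SGD update with momentum and L2 weight decay.

    grad is the gradient averaged over the minibatch (e.g. size 128)."""
    v = momentum * v - weight_decay * lr * w - lr * grad
    return w + v, v
```

Note that the weight-decay term is folded directly into the velocity, so decay is also accumulated by momentum rather than applied separately.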
Extended applications
- Image classification
Useful English expressions
- Current approaches to object recognition make essential use of machine learning methods.
- However, the immense complexity of the object recognition task means that this problem cannot be specified even by a dataset as large as ImageNet, so our model should also have lots of prior knowledge to compensate for all the data we don't have.
- Despite the attractive qualities of CNNs, and despite the relative efficiency of their local architecture, they have still been prohibitively expensive to apply in large scale to high-resolution images.
- The specific contributions of this paper are as follows:
- In all, there are roughly 1.2 million training images, 50,000 validation images, and 150,000 testing images.
Experimental design
- Comparison of Top-1 and Top-5 error rates against other classification models
- Ablation experiments, finding that more layers give better results
- Parallel training across two GPUs
- For the 4096-dimensional vectors from the penultimate layer, the images whose vectors are nearest in Euclidean distance turn out to be visually similar
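The feature-retrieval experiment above amounts to nearest-neighbor search in the 4096-dimensional penultimate-layer space; a minimal sketch (function name is mine):

```python
import numpy as np

def nearest_images(query, features, k=5):
    """Return indices of the k rows of `features` (one feature vector per
    image, e.g. 4096-d) closest to `query` in Euclidean distance,
    nearest first (a sketch of the retrieval experiment)."""
    dists = np.linalg.norm(features - query, axis=1)
    return np.argsort(dists)[:k]
```

That semantically similar images cluster in this space is evidence the network learns useful high-level features, not just a classifier.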
Strengths and weaknesses
Strengths:
- Deepened the network relative to earlier work, combined with several techniques against overfitting
- Convolution operations extract features more effectively
- Delivered a genuinely striking improvement in accuracy
Weaknesses:
1. Does not really explain why the techniques it adopts work; the paper lacks analysis and interpretation