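In 2012, AlexNet won the ImageNet 2012 image recognition challenge (ILSVRC 2012) by a wide margin, prompting researchers to revisit earlier neural networks and convolutional neural networks (CNNs). From that point on, CNNs led the current wave of artificial intelligence. This article briefly reviews the development of CNNs and the classic papers along the way.
The papers covered here are shared in the following repository: Awesome-CNN-Papers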
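Predecessors and early development of convolutional neural networks
The work of this period laid the theoretical groundwork on which modern CNNs later flourished.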
- 1980: The Japanese researcher Kunihiko Fukushima proposed the Neocognitron model. For this work he received the 2021 Bower Award and Prize for Achievement in Science, cited for pioneering research that applied principles of neuroscience to engineering through the invention of the Neocognitron, the first deep convolutional neural network, a key contribution to the development of artificial intelligence.
Fukushima K, Miyake S. Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition[M]//Competition and cooperation in neural nets. Springer, Berlin, Heidelberg, 1982: 267-285.
- 1989: Yann LeCun proposed the first CNN in the true sense: LeNet 1989.
LeCun Y, Boser B, Denker J S, et al. Backpropagation applied to handwritten zip code recognition[J]. Neural computation, 1989, 1(4): 541-551.
- 1998: Yann LeCun presented LeNet in more detail (also known as LeNet-5), a hugely influential work.
LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
The basic architectures and classic modules of modern convolutional neural networks
- 2012 ILSVRC (classification) winner: AlexNet, which set off the deep learning boom in computer vision
Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]//Advances in neural information processing systems. 2012: 1097-1105.
- 2013 ILSVRC (classification) winner: ZFNet
Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//European conference on computer vision. Springer, Cham, 2014: 818-833.
- 2014 ILSVRC (classification) winner: GoogLeNet, which introduced the Inception structure
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.
- 2014 ILSVRC (classification) runner-up: VGGNet, notable for its study of network depth
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
- 2015 ILSVRC (classification) winner: ResNet, which introduced the Residual structure
He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.
- 2016: The Google team combined the Inception structure with the Residual structure and proposed Inception-ResNet (Inception-Residual Net)
Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2017, 31(1).
- 2016: Kaiming He proposed a new idea for ResNet: Identity Mapping
He K, Zhang X, Ren S, et al. Identity mappings in deep residual networks[C]//European Conference on Computer Vision. Springer, Cham, 2016: 630-645.
- 2017: DenseNet
Huang G, Liu Z, Weinberger K Q, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 4700-4708.
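To make the Residual structure named above concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes a block of the form relu(x + F(x)). The names `residual_block`, `toy_branch`, and `zero_branch` are hypothetical, and the toy branches merely stand in for the stacked convolution layers of a real network.

```python
from typing import Callable, List

Vector = List[float]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]


def residual_block(x: Vector, residual_fn: Callable[[Vector], Vector]) -> Vector:
    """Return relu(x + F(x)): the shortcut adds the input back to the branch output."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("an identity shortcut needs matching shapes")
    return relu([a + b for a, b in zip(x, fx)])


def toy_branch(v: Vector) -> Vector:
    """Hypothetical residual branch F: a fixed scaling stands in for conv layers."""
    return [0.5 * x for x in v]


def zero_branch(v: Vector) -> Vector:
    """Hypothetical branch that has learned nothing: F(x) = 0."""
    return [0.0 for _ in v]


if __name__ == "__main__":
    print(residual_block([1.0, 2.0], toy_branch))
    print(residual_block([1.0, 2.0], zero_branch))
```

When the branch outputs zeros, the block passes non-negative inputs through unchanged, which is one common intuition for why deep stacks of such blocks are easier to optimize.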
Exploring and refining convolutional attention mechanisms
- 2017 ILSVRC (classification) winner: SENet (Squeeze-and-Excitation Networks), which introduced the Squeeze-and-Excitation (SE) Block; the network combines SE Blocks with Res Blocks
Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 7132-7141.
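The three steps of a Squeeze-and-Excitation Block (squeeze, excitation, scale) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the paper's code: the feature map is a nested list indexed as [channel][row][col], biases are omitted, and the weight matrices `w1` (bottleneck) and `w2` (expansion), as well as the helper names, are hypothetical placeholders for learned parameters.

```python
import math
from typing import List

Matrix = List[List[float]]
FeatureMap = List[Matrix]  # indexed as [channel][row][col]


def matvec(w: Matrix, v: List[float]) -> List[float]:
    """Multiply matrix w (rows x cols) by vector v (cols)."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def se_block(x: FeatureMap, w1: Matrix, w2: Matrix) -> FeatureMap:
    """Reweight each channel of x by a gate computed from global channel statistics."""
    # Squeeze: global average pooling yields one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck layer with ReLU, then expansion layer with sigmoid.
    hidden = [max(0.0, h) for h in matvec(w1, squeezed)]
    gates = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w2, hidden)]
    # Scale: multiply every value in a channel by that channel's gate in (0, 1).
    return [[[g * v for v in row] for row in ch] for ch, g in zip(x, gates)]


if __name__ == "__main__":
    x = [[[1.0, 3.0]], [[2.0, 2.0]]]   # 2 channels, each 1x2
    w1 = [[0.0, 0.0]]                  # hypothetical weights: 2 channels -> 1 hidden unit
    w2 = [[0.0], [0.0]]                # hypothetical weights: 1 hidden unit -> 2 channels
    print(se_block(x, w1, w2))         # all-zero weights give a gate of 0.5 per channel
```

The output has the same shape as the input, which is why such a block can be attached to an existing residual branch without changing the surrounding architecture.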
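The development of lightweight convolutional neural networks
Since 2016, research on lightweight CNNs has gradually emerged, making it feasible to deploy deep learning vision models on mobile devices.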
- 2016: Xception (Note: Xception does not aim to make CNNs lightweight; its goal is to improve performance without increasing network complexity. However, the depthwise convolution idea it uses is key to MobileNet and other lightweight CNNs, so it is listed here.)
Chollet F. Xception: Deep learning with depthwise separable convolutions[J]. arXiv preprint arXiv:1610.02357, 2017.
- 2016: ResNeXt (Note: ResNeXt also aims to improve performance without increasing network complexity; it is listed here for the same reason as Xception.)
Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]//Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on. IEEE, 2017: 5987-5995.
- 2017: MobileNet
Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.
- 2017: ShuffleNet
Zhang X, Zhou X, Lin M, et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices[J]. arXiv preprint arXiv:1707.01083, 2017.
- 2018: MobileNet V2
Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.
- 2018: ESPNet (Note: the ESPNet paper is not purely about a CNN architecture; it was designed for the semantic segmentation task, but its CNN is also lightweight.)
Mehta S, Rastegari M, Caspi A, et al. Espnet: Efficient spatial pyramid of dilated convolutions for semantic segmentation[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 552-568.
- 2018: ShuffleNet V2
Ma N, Zhang X, Zheng H T, et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 116-131.
- 2018: ESPNetV2
Mehta S, Rastegari M, Shapiro L, et al. ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network[J]. arXiv preprint arXiv:1811.11431, 2018.
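To illustrate why the depthwise convolution idea matters for lightweight networks, the short sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). It is simple arithmetic under stated assumptions: biases are ignored, and the function names and the example layer sizes are hypothetical, not taken from any of the papers listed.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 128 input channels, 256 output channels.
    k, c_in, c_out = 3, 128, 256
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(std, sep, sep / std)
```

For this hypothetical layer the separable form needs 33,920 weights against 294,912 for the standard form, a ratio of 1/c_out + 1/k^2, or roughly 11.5%.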