[CS231n] 5. Introduction to Convolutional Neural Networks (CNN): Study Notes, by 一只神秘的大金毛

Article link: https://blog.csdn.net/Mys_GoldenRetriever/article/details/109655215

1. A brief history

The Mark I Perceptron machine was the first implementation of the perceptron algorithm. It had only a feed-forward rule for adjusting its parameters; there was no backpropagation yet.

Backpropagation was introduced by Rumelhart et al. in 1986, and the chain-rule derivation can already be seen in that paper.

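The first strong results came in 2012, when AlexNet, built by Alex Krizhevsky, a student in Hinton's lab, won the ImageNet competition. Its architecture is not substantially different from the one LeCun used on MNIST; only some of the layers were changed slightly. The paper is "ImageNet Classification with Deep Convolutional Neural Networks"; another blog post gives a Chinese translation: AlexNet中文翻译.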

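As a result, both academia and industry came to recognize the strong learning capacity of convolutional neural networks and began to study this kind of network extensively.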

2. Convolutional layer (conv_layer)

In a fully connected layer, a 32x32x3 image is stretched into a 3072x1 vector. With a weight matrix W of size, say, 10x3072, the output activation Wx is 10x1.
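The shape bookkeeping above can be checked with a small sketch. It is only an illustration in plain Python: the random image, the random weights, and the helper names `flatten` and `fully_connected` are assumptions made for this example, not part of the course material, and no bias term is included.

```python
import random


def flatten(image):
    """Flatten a nested H x W x C list into a flat list of H*W*C numbers."""
    return [value for row in image for pixel in row for value in pixel]


def fully_connected(weights, x):
    """Compute Wx: one dot product per row of the weight matrix."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


random.seed(0)

# Hypothetical 32x32x3 image filled with random pixel values.
image = [[[random.random() for _ in range(3)] for _ in range(32)]
         for _ in range(32)]
x = flatten(image)  # 32 * 32 * 3 = 3072 values

# Hypothetical 10 x 3072 weight matrix (for example, 10 class scores).
W = [[random.gauss(0.0, 0.01) for _ in range(len(x))] for _ in range(10)]
scores = fully_connected(W, x)  # 10 values, one per row of W

print(len(x), len(scores))  # 3072 10
```

Flattening discards which pixels were neighbours in the image, which is the limitation the convolutional layer is meant to address.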

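In a convolutional layer, we want to preserve the spatial (two-dimensional) structure of the image. A 5x5x3 filter (also called a convolution kernel) slides across the whole image with a given stride and takes the dot product with each region it covers (its receptive field). Each distinct filter produces its own activation map, which picks out a different kind of information.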
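The sliding dot product can be sketched for a single filter as follows. This is a deliberately naive plain-Python illustration, assuming a square image, a square filter, no padding, and no bias; the function name `conv2d_single_filter` and the all-ones inputs are hypothetical choices for the example. As is usual in deep learning, the filter is not flipped, so strictly speaking this is cross-correlation.

```python
def conv2d_single_filter(image, kernel, stride=1):
    """Slide one F x F x C filter over an N x N x C image (no padding).

    Returns a 2-D activation map whose entries are the dot products of the
    filter with each receptive field.
    """
    n = len(image)
    f = len(kernel)
    out = (n - f) // stride + 1
    activation_map = []
    for i in range(out):
        row = []
        for j in range(out):
            total = 0.0
            # Dot product over the F x F x C receptive field at (i, j).
            for di in range(f):
                for dj in range(f):
                    pixel = image[i * stride + di][j * stride + dj]
                    weights = kernel[di][dj]
                    total += sum(p * w for p, w in zip(pixel, weights))
            row.append(total)
        activation_map.append(row)
    return activation_map


# Hypothetical inputs: a 32x32x3 image and a 5x5x3 filter, both all ones.
image = [[[1.0] * 3 for _ in range(32)] for _ in range(32)]
kernel = [[[1.0] * 3 for _ in range(5)] for _ in range(5)]

activation_map = conv2d_single_filter(image, kernel, stride=1)
print(len(activation_map), len(activation_map[0]), activation_map[0][0])
# 28 28 75.0
```

Each entry is 75.0 because the receptive field holds 5 * 5 * 3 = 75 ones. Running several different filters over the same image would give one such map per filter.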

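Output size: (N - F) / stride + 1, where N is the image size and F is the filter size. A border of zeros is often padded around the image so that information at the edges is retained; padding can also be used to keep the output the same size as the input. With padding, the output size is (N - F + 2 * pad) / stride + 1.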
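A small helper makes the formula easy to try with different settings. The function name `conv_output_size` is hypothetical, and raising an error when the stride does not divide evenly is a choice made for this sketch; some libraries round down instead.

```python
def conv_output_size(n, f, stride=1, pad=0):
    """Output size along one dimension: (N - F + 2 * pad) / stride + 1."""
    span = n - f + 2 * pad
    if span % stride != 0:
        # The filter does not tile the padded input evenly at this stride.
        raise ValueError("stride does not fit the padded input")
    return span // stride + 1


print(conv_output_size(7, 3, stride=1))          # 5
print(conv_output_size(7, 3, stride=2))          # 3
print(conv_output_size(32, 5, stride=1))         # 28
print(conv_output_size(32, 5, stride=1, pad=2))  # 32, same size as the input
```

For stride 1, choosing pad = (F - 1) / 2 keeps the output the same size as the input, which is why a 5x5 filter is paired with pad = 2 in the last line.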

3. Pooling layer

Pooling makes the representations smaller and more manageable. It is essentially a downsampling step that simplifies and compresses the information.

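There are two main variants: max pooling and average pooling, which take the maximum and the mean, respectively, of each block (for example 2x2) of the activation map. Max pooling is used more often, because each value in an activation map indicates how strongly the filter was activated at that location, so larger values highlight the feature more clearly. This is also why an activation map is called a feature map.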
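Both variants can be sketched on a single 2-D activation map. The function name `pool2d`, its parameters, and the 4x4 example values are assumptions made for illustration; in a real network the same operation is applied to every activation map independently.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample a 2-D map by taking the max or mean of each size x size block."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    rows = (len(feature_map) - size) // stride + 1
    cols = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            # Collect the values inside the current pooling window.
            window = [feature_map[i * stride + di][j * stride + dj]
                      for di in range(size) for dj in range(size)]
            if mode == "max":
                out_row.append(max(window))
            else:
                out_row.append(sum(window) / len(window))
        pooled.append(out_row)
    return pooled


# Hypothetical 4x4 activation map.
feature_map = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]

print(pool2d(feature_map, mode="max"))      # [[6, 8], [3, 4]]
print(pool2d(feature_map, mode="average"))  # [[3.25, 5.25], [2.0, 2.0]]
```

With a 2x2 window and stride 2, each spatial dimension is halved, so the 4x4 map becomes 2x2.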