CNN 卷积神经网络结构

CNN

cnn每一层会输出多个feature map, 每个Feature Map通过一种卷积滤波器提取输入的一种特征,每个feature map由多个神经元组成,假如某个feature map的shape是m*n, 则该feature map有m*n个神经元。对于卷积层会有kernel, 记录上一层的feature map与当前层的卷积核的权重,因此kernel的shape为(上一层feature map的个数,当前层的卷积核数)。本文默认子采样过程是没有重叠的,卷积过程是每次移动一个像素,即是有重叠的。默认子采样层没有权重和偏置。关于CNN的其它描述不在这里论述,可以参考一下参考文献。只关注如何训练CNN。

CNN网络结构

一种典型卷积网络结构是LeNet-5,用来识别数字的卷积网络。结构图如下(来自Yann LeCun的论文):
LeNet-5
卷积神经网络算法的一个实现文章中,有一个更好看的图:
LeNet-5
该图的输入是一张28*28大小的图像,在C1层有6个5*5的卷积核,因为C1层输出6个(28-5+1)(28-5+1)大小的feature map。然后经过子采样层,这里假设子采样层是对卷积层的均值处理(mean pooling), 其实一般还会有加偏置和激活的操作,为了简化,省略了这两步,只是对卷积层进行一个采样的操作。因此S2层输出的6个feature map大小为(24/2)(24/2).在卷积层C3中,它的输入是6个feature map,与C1不一样(C1只有一个feature map,如果是RGB的话,C1会有三个channel)。C3层有12个5*5卷积核,每个卷积核会与上一层的6个feature map分别做卷积(事实上,一般是选择几种输入feature map来做卷积,而不是全部的feature map),然后对这6个卷积结果求和组成一个新的feature map,即该层会有12个大小为(12-5+1)*(12-5+1)的feature map,这个feature map是经过sigmod 函数处理然后结果下一层S4。
这里写图片描述

图片来源

同理,S4层有12个(与卷积层的feature map数一致)大小为(8/2)*(8/2)的feature map。输出层把S4层的feature mapflatten一个向量,向量长度为12*4*4=192,以该向量作为输入,与下面的其它层全连接,进行分类等操作,也就是说把一张图片变成一个向量,接入到别的网络,如传统的BP神经网络,不过从整体来看,CNN可以看做是一个BP神经网络。在 这里有两张很生动的图来描述这个过程:
这里写图片描述
这里写图片描述

权值共享理解

从代码的实现来看,每个卷积核会与部分或全部的输入(上一层输出)feature map进行卷积求和,但是每个卷积核的权重与一个feature map是一一对应,如上一章节中的C3-S4,说是有12个卷积核,然后就有12个输出feature map,但是每个卷积核与输入的6个feature map的权重都是不一样,即kernel不一样,也就是说每个卷积核的权重与一个feature map是一一对应。至于权值共享的话,对于同一个输入的feature map的神经元patch,用的是同一个卷积核权重,这个是共享的,只在同feature map共享,不在跨feature map共享,只是个人理解,有可能有错,if wrong please correct me.

  • 15
    点赞
  • 76
    收藏
    觉得还不错? 一键收藏
  • 5
    评论
The first CNN appeared in the work of Fukushima in 1980 and was called Neocognitron. The basic architectural ideas behind the CNN (local receptive fields,shared weights, and spatial or temporal subsampling) allow such networks to achieve some degree of shift and deformation invariance and at the same time reduce the number of training parameters. Since 1989, Yann LeCun and co-workers have introduced a series of CNNs with the general name LeNet, which contrary to the Neocognitron use supervised training. In this case, the major advantage is that the whole network is optimized for the given task, making this approach useable for real-world applications. LeNet has been successfully applied to character recognition, generic object recognition, face detection and pose estimation, obstacle avoidance in an autonomous robot etc. myCNN class allows to create, train and test generic convolutional networks (e.g., LeNet) as well as more general networks with features: - any directed acyclic graph can be used for connecting the layers of the network; - the network can have any number of arbitrarily sized input and output layers; - the neuron’s receptive field (RF) can have an arbitrary stride (step of local RF tiling), which means that in the S-layer, RFs can overlap and in the C-layer the stride can differ from 1; - any layer or feature map of the network can be switched from trainable to nontrainable (and vice versa) mode even during the training; - a new layer type: softmax-like M-layer. The archive contains the myCNN class source (with comments) and a simple example of LeNet5 creation and training. All updates and new releases can be found here: http://sites.google.com/site/chumerin/projects/mycnn

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 5
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值