CNN Principles

This article introduces the basic structure and working principles of convolutional neural networks (CNNs), covering convolutional layers, pooling layers, normalization layers, and fully-connected layers. CNNs reduce the number of parameters through parameter sharing: convolutional layers extract features, pooling layers downsample, and fully-connected layers compute class scores. Common CNN architectures such as LeNet, AlexNet, VGGNet, and ResNet are mentioned.

CNN (Convolutional Neural Network)


References used in writing this article

A Brief Overview

[Figures: topology diagram (left) and model diagram (right)]

Left: A regular 3-layer Neural Network. Right: A ConvNet arranges its neurons in three dimensions (width, height, depth), as visualized in one of the layers. Every layer of a ConvNet transforms the 3D input volume to a 3D output volume of neuron activations. In this example, the red input layer holds the image, so its width and height would be the dimensions of the image, and the depth would be 3 (Red, Green, Blue channels).

A ConvNet is made up of Layers. Every Layer has a simple API: it transforms an input 3D volume to an output 3D volume with some differentiable function that may or may not have parameters.

A simple convolutional network (ConvNet) is a sequence of layers. Three types of layer are commonly used to build one: the convolutional layer, the pooling layer, and the fully-connected layer.

  • A CIFAR-10 classification example. Example Architecture: Overview. We will go into more details below, but a simple ConvNet for CIFAR-10 classification could have the architecture [INPUT - CONV - RELU - POOL - FC]. In more detail:
    • INPUT [32x32x3] will hold the raw pixel values of the image, in this case an image of width 32, height 32, and with three color channels R,G,B.

    • CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and a small region they are connected to in the input volume. This may result in volume such as [32x32x12] if we decided to use 12 filters.

    • RELU layer will apply an elementwise activation function, such as the $ max(0,x) $ thresholding at zero. This leaves the size of the volume unchanged ([32x32x12]).

    • POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in volume such as [16x16x12].

    • FC (i.e. fully-connected) layer will compute the class scores, resulting in volume of size [1x1x10], where each of the 10 numbers correspond to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer will be connected to all the numbers in the previous volume.
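The five stages above can be sketched end-to-end in plain NumPy. This is a minimal, unoptimized illustration with random weights: the shapes ([32x32x3] input, 12 filters of 5x5x3, 2x2 pooling, 10 class scores) come from the example, while the weight values and variable names are made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# INPUT [32x32x3]: a random stand-in for a CIFAR-10 RGB image
x = rng.standard_normal((32, 32, 3))

# CONV: 12 filters of size 5x5x3, stride 1, zero-padding P=(F-1)/2=2,
# so the spatial size stays 32x32 and the output is [32x32x12]
F, P, n_filters = 5, 2, 12
w = rng.standard_normal((n_filters, F, F, 3)) * 0.01
b = np.zeros(n_filters)

xp = np.pad(x, ((P, P), (P, P), (0, 0)))      # zero-pad width and height only
conv = np.empty((32, 32, n_filters))
for i in range(32):
    for j in range(32):
        patch = xp[i:i + F, j:j + F, :]       # one local 5x5x3 region
        # dot product of the region with every filter, plus bias
        conv[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [1, 2, 3])) + b

# RELU: elementwise max(0, x), size unchanged ([32x32x12])
relu = np.maximum(conv, 0)

# POOL: 2x2 max pooling with stride 2 halves width and height -> [16x16x12]
pool = relu.reshape(16, 2, 16, 2, n_filters).max(axis=(1, 3))

# FC: each of the 10 class-score neurons sees all 16*16*12 numbers -> [1x1x10]
w_fc = rng.standard_normal((16 * 16 * n_filters, 10)) * 0.01
scores = pool.reshape(-1) @ w_fc

print(conv.shape, relu.shape, pool.shape, scores.shape)
# (32, 32, 12) (32, 32, 12) (16, 16, 12) (10,)
```

Note that only CONV and FC introduce weights here; RELU and POOL are fixed functions of their input, which matches the parameter summary below.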

In summary

  • A convolutional network is a sequence of layers, each of which transforms one volume into another.
  • The common layer types are CONV, RELU, POOL, and FC.
  • Each layer performs its volume-to-volume transformation through a different function.
  • Some layers have parameters (e.g. CONV, FC); others do not (e.g. RELU, POOL).
  • Some layers have hyperparameters (e.g. CONV, FC, POOL); others do not (e.g. RELU).

Parameters are learned automatically from the training set during training. Hyperparameters are set by hand before training begins, with the goal of making the model perform better while its parameters are being trained.

Convolutional Layer

The convolutional layer is the core building block of a ConvNet and performs most of the computational heavy lifting.

Filters

  • filter: the convolution kernel, typically of a size such as $5 \times 5 \times 3$.

  • receptive field: the spatial extent of the filter (its width times its height), e.g. $5 \times 5$ in the example above.

  • Local Connectivity: each neuron connects only to a local region of the input, local in space but full along the input depth.

  • Two examples of counting the number of connections or weights:

    Example 1. For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.

    Example 2. Suppose an input volume had size [16x16x20]. Then using an example receptive field size of 3x3, every neuron in the Conv Layer would now have a total of 3*3*20 = 180 connections to the input volume. Notice that, again, the connectivity is local in space (e.g. 3x3), but full along the input depth (20).
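Both counts are easy to verify directly; this is nothing beyond the arithmetic of the two examples, written out as a quick sanity check:

```python
# Example 1: input volume [32x32x3], receptive field 5x5
# -> each Conv neuron is local in space (5x5) but full in depth (3)
weights_example_1 = 5 * 5 * 3          # 75 weights (plus 1 bias parameter)

# Example 2: input volume [16x16x20], receptive field 3x3
connections_example_2 = 3 * 3 * 20     # 180 connections, full along depth (20)

print(weights_example_1, connections_example_2)
# 75 180
```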

Spatial arrangement

Three hyperparameters control the size of the output volume: the depth, stride and zero-padding

  • depth: the depth of the output volume (the number of filters)
  • stride: the step size of the filter, usually 1 or 2
  • zero-padding: padding the border of the input with zeros

Computing the size of the output volume

  • $W$: size of the input

  • $F$: size of the filter (receptive field)

  • $P$: amount of zero-padding

  • $S$: stride

The spatial size of the output volume is $(W - F + 2P)/S + 1$.

In general, setting zero padding to be $P = (F - 1)/2$ when the stride is $S = 1$ ensures that the input volume and output volume will have the same size spatially.

Constraints on strides. The value computed from these four hyperparameters must be an integer; otherwise the chosen hyperparameters are invalid.
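The formula and the integer constraint can be wrapped in a small helper. The function name here is hypothetical, written just for this note, but the arithmetic is exactly the formula above:

```python
def conv_output_size(W, F, P, S):
    """Spatial output size (W - F + 2P)/S + 1; raises if it is not an integer."""
    numerator = W - F + 2 * P
    if numerator % S != 0:
        raise ValueError(
            f"invalid hyperparameters: ({W} - {F} + 2*{P}) = {numerator} "
            f"is not divisible by stride {S}"
        )
    return numerator // S + 1

# With S=1 and P=(F-1)/2, the spatial size is preserved:
print(conv_output_size(32, 5, 2, 1))    # 32

# A stride-4 layer on a 227-wide input with 11x11 filters and no padding:
print(conv_output_size(227, 11, 0, 4))  # 55
```

The second call reproduces the 55-wide output volume used in the parameter-sharing example below.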

Parameter Sharing

If every value in the output volume were computed by its own filter, the number of parameters would be enormous. Therefore, all neurons within each depth slice of the output volume share the same filter.

Using the real-world example above, we see that there are 55*55*96 = 290,400 neurons in the first Conv Layer, and each has 11*11*3 = 363 weights and 1 bias. Together, this adds up to 290,400 * 364 = 105,705,600 parameters on the first layer of the ConvNet alone. Clearly, this number is very high.

With this parameter sharing scheme, the first Conv Layer in our example would now have only 96 unique sets of weights (one for each depth slice).
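The saving is dramatic when spelled out; these are the same numbers as the paragraph above, computed explicitly:

```python
# First Conv Layer of the real-world example: output volume 55x55x96,
# each neuron connected to an 11x11x3 region of the input
neurons = 55 * 55 * 96                      # 290,400 neurons
weights_per_neuron = 11 * 11 * 3            # 363 weights (+1 bias each)

# Without sharing: every neuron owns its own weights and bias
unshared = neurons * (weights_per_neuron + 1)

# With sharing: one set of weights (+ bias) per depth slice, 96 slices
shared = 96 * (weights_per_neuron + 1)

print(unshared, shared)
# 105705600 34944
```

Sharing cuts the layer from over a hundred million parameters to under thirty-five thousand.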
