Architecture and Training of Convolutional Neural Networks (7 Points)

This post describes the architecture of a Convolutional Neural Network (CNN) and the function and training of each layer, ending with a summary of how a CNN is trained.

  1. The basic CNN architecture consists of: Input->(Conv+ReLU)->Pool->(Conv+ReLU)->Pool->Flatten->Fully Connected->Softmax->Output (a minimal code sketch of this stack follows the list below).

  2. Feature extraction is carried out in the Convolutional + ReLU and Pooling layers, and classification is carried out in the Fully Connected and Softmax layers.
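For concreteness, here is a minimal sketch of this stack, assuming TensorFlow/Keras; the input shape, filter counts, and 10-class output are illustrative choices, not taken from the original post.

```python
# A minimal sketch of Input->(Conv+ReLU)->Pool->(Conv+ReLU)->Pool->Flatten->
# Fully Connected->Softmax, assuming TensorFlow/Keras.
# The input shape, filter counts, and class count are illustrative only.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 3)),               # input image (height, width, channels)
    layers.Conv2D(16, (3, 3), activation="relu"),  # Conv + ReLU: feature extraction
    layers.MaxPooling2D((2, 2)),                   # Pool: downsample the feature map
    layers.Conv2D(32, (3, 3), activation="relu"),  # second Conv + ReLU
    layers.MaxPooling2D((2, 2)),                   # second Pool
    layers.Flatten(),                              # Flatten to a 1-D vector
    layers.Dense(64, activation="relu"),           # Fully Connected layer
    layers.Dense(10, activation="softmax"),        # Softmax over N = 10 classes
])
model.summary()
```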

3. First Convolutional Layer:

  • The primary purpose of this layer is to extract features from the input image.

  • Convolution is used for feature extraction because it preserves the spatial relationship between pixels by learning image features from small squares of input data.

  • The convolutional layer has the following attributes:

i) Convolutional neurons/kernels/filters defined by a width and height (hyper-parameters).

ii) The number of input channels and output channels (hyper-parameters).

iii) The depth/number of channels of the Convolutional filter/kernel must be equal to the depth/number of channels of the input.

  • Now, the filter/kernel/neuron (a matrix of weights/parameters) slides over the input image (a matrix of pixel values, usually with a depth of 3 for the red, green, and blue channels), starting from the upper-left corner of the image and at each position covering as many pixels as there are weights in the filter/kernel/neuron.

  • The outcome of each convolution is stored in a matrix, known as the feature map (also called the convolved feature or activation map), whose depth is equal to the number of filters/kernels/neurons used.

  • The dimensions of the feature map/convolved feature/activation map can be determined as:

  • Input image * Filter = Feature Map/Activation Map

[n x n x nc] * [f x f x nc] = [(n-f+1) x (n-f+1) x m], where

n is the dimension of the matrix of image pixels, f is the dimension of the matrix of weights, nc is the depth (number of channels) of the image, and m is the number of filters used (see the sketch after this list).

  • The greater the number of filters, the better the feature extraction.
  • The feature map is then made non-linear by using ReLU.

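As a rough illustration of the sliding-window computation and the dimension formula above, here is a minimal NumPy sketch, assuming a "valid" convolution (no padding, stride 1); all sizes are illustrative.

```python
# A minimal NumPy sketch of a "valid" convolution (no padding, stride 1),
# showing that an [n x n x nc] input and m filters of size [f x f x nc]
# give an [(n-f+1) x (n-f+1) x m] feature map. Sizes are illustrative only.
import numpy as np

n, f, nc, m = 6, 3, 3, 2                   # image size, filter size, channels, filters
image = np.random.rand(n, n, nc)           # input image
filters = np.random.rand(m, f, f, nc)      # m filters, each matching the input depth

feature_map = np.zeros((n - f + 1, n - f + 1, m))
for k in range(m):                          # one output channel per filter
    for i in range(n - f + 1):              # slide vertically
        for j in range(n - f + 1):          # slide horizontally
            patch = image[i:i + f, j:j + f, :]                   # f x f x nc window
            feature_map[i, j, k] = np.sum(patch * filters[k])    # element-wise product, then sum

print(feature_map.shape)  # (4, 4, 2) == (n-f+1, n-f+1, m)
```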

4. ReLU:

  • ReLU (Rectified Linear Unit) is an element-wise operation (applied per pixel) that replaces all negative pixel values in the feature map with zero.

  • The purpose of ReLU is to introduce non-linearity into the ConvNet: most of the real-world data a ConvNet learns is non-linear, while the convolution carried out in the first convolutional layer is a linear operation, so non-linearity is added with a non-linear function such as ReLU.

  • The output of the ReLU is ƒ(x) = max(0, x) (see the sketch after this list).

  • Other non-linear functions such as tanh or sigmoid (used in the Single Layer Perceptron) can also be used instead of ReLU, but most data scientists use ReLU because, performance-wise, it is better than the other two.
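A minimal NumPy sketch of the element-wise ReLU, using a small hand-written feature map for illustration:

```python
# A minimal sketch of the element-wise ReLU, f(x) = max(0, x), applied to a
# feature map so that every negative value becomes zero. Values are illustrative.
import numpy as np

feature_map = np.array([[-2.0, 1.5],
                        [ 0.3, -0.7]])
relu_output = np.maximum(0, feature_map)    # replaces negatives with 0
print(relu_output)                          # [[0.  1.5]
                                            #  [0.3 0. ]]
```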

5. Pooling Layer:

  • The pooling layer reduces the dimensions of the data by combining the outputs of neuron/filter/kernel clusters at one layer into a single neuron/filter/kernel in the next layer.

  • Convolutional networks/ConvNet may include local and global pooling layers.

  • The hyperparameters for this layer are: 1) the filter size (f) and 2) the stride (s).

  • Spatial pooling (also called subsampling or downsampling) reduces the dimensionality of each map but retains important information. It can be of different types:

i) Max pooling: This pooling technique works better than the other techniques and is therefore used most often. Depending on the hyperparameters, clusters are formed in the feature map and the maximum of each cluster is taken, giving a resultant matrix of these maximum values. The number of channels/depth of the resultant matrix is the same as that of the feature map. There is no padding here (a small sketch follows this list).

ii) Average pooling: Depending on the hyperparameters, clusters are formed in the feature map and the average of each cluster is taken, giving a resultant matrix of these average values. The number of channels/depth of the resultant matrix is the same as that of the feature map.

iii) Sum pooling: Depending on the hyperparameters, clusters are formed in the feature map and the sum of each cluster is taken, giving a resultant matrix of these sums. The number of channels/depth of the resultant matrix is the same as that of the feature map.

  • Functions of pooling:

a) Makes the input representations (feature dimension) smaller and more manageable.

b) Reduces the number of parameters and computations in the network, thereby controlling overfitting.

c) Makes the network invariant to small transformations, distortions and translations in the input image (a small distortion in the input will not change the output of pooling, since we take the maximum/average value in a local neighborhood).

d) Helps us arrive at an almost scale-invariant representation of our image (the exact term is "equivariant").
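A minimal NumPy sketch of max pooling with filter size f = 2 and stride s = 2, assuming a single-channel 4x4 feature map for illustration; swapping np.max for np.mean or np.sum would give average or sum pooling.

```python
# A minimal NumPy sketch of max pooling with filter size f = 2 and stride s = 2
# (no padding): each 2x2 cluster of the feature map is replaced by its maximum.
# The feature map values are illustrative only.
import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 1],
                        [3, 4, 6, 8]], dtype=float)

f, s = 2, 2
out = np.zeros((feature_map.shape[0] // s, feature_map.shape[1] // s))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        out[i, j] = np.max(feature_map[i * s:i * s + f, j * s:j * s + f])

print(out)  # [[6. 4.]
            #  [7. 9.]]
```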

6. Fully-Connected Layer:

  • This layer takes the output volume of its preceding layer as input and outputs an N-dimensional vector, where N is the number of classes the program has to choose from. Each number in the N-dimensional vector represents the probability of a certain class.
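A minimal NumPy sketch of this step, assuming the preceding volume is simply flattened and multiplied by a weight matrix; the volume size and N = 3 classes are illustrative.

```python
# A minimal sketch of the fully-connected step: flatten the preceding volume
# and apply a weight matrix to get one score per class. Sizes are illustrative.
import numpy as np

volume = np.random.rand(4, 4, 8)            # output volume of the preceding layer
x = volume.reshape(-1)                      # flatten to a 1-D vector (128 values)
W = np.random.rand(3, x.size)               # one row of weights per class (N = 3)
b = np.random.rand(3)                       # one bias per class
scores = W @ x + b                          # N-dimensional vector, one score per class
print(scores.shape)                         # (3,)
```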

7. Softmax:

  • Softmax (also known as softargmax, the normalized exponential function, or multi-class logistic regression) is a function that turns a vector of K real values into a vector of K real values that sum to 1.
  • The input values can be positive, negative, zero, or greater than one, but the softmax transforms them into values between 0 and 1, so that they can be interpreted as probabilities.

  • If one of the inputs is small or negative, the softmax turns it into a small probability, and if an input is large, then it turns it into a large probability, but it will always remain between 0 and 1.

  • The softmax is very useful because it converts the outputs of the previous layer to a normalized probability distribution, which can be displayed to a user or used as input to other systems. For this reason it is usual to append a softmax function as the final layer of the convolutional neural network (a minimal sketch follows).
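A minimal NumPy sketch of the softmax, written in the numerically stable form (subtracting the maximum score before exponentiating); the input scores are illustrative.

```python
# A minimal sketch of the softmax: exponentiate the scores and normalize so the
# outputs lie in (0, 1) and sum to 1. Input scores are illustrative only.
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # shift by the maximum for numerical stability
    return e / e.sum()          # probabilities that sum to 1

scores = np.array([2.0, 1.0, -1.0])
print(softmax(scores))           # approx. [0.705 0.259 0.035], sums to 1
```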

The overall training process of the Convolutional Neural Network may be summarized as follows:

  • Step 1: We initialize all filters and parameters/weights with random values.

  • Step 2: The network takes a training image as input, goes through the forward propagation step (convolution, ReLU and pooling operations along with forward propagation in the Fully Connected layer) and finds the output probabilities for each class.

  • Step 3: Calculate the total error at the output layer:

  • Total Error = ∑ ½ (target probability − output probability)²

  • Step 4: Use backpropagation to calculate the gradients of the error with respect to all weights in the network, and use gradient descent to update all filter values/weights and parameter values to minimize the output error (a minimal training sketch follows).
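A minimal end-to-end training sketch, assuming TensorFlow/Keras and random toy data; the "mse" loss is chosen here only to mirror the squared-error formula in Step 3, although categorical cross-entropy is the more common pairing with a softmax output.

```python
# A minimal training sketch, assuming TensorFlow/Keras and random toy data.
# Step 1 (random weight initialization) happens when the layers are built;
# model.fit runs Steps 2-4: forward propagation, the loss at the output layer,
# backpropagation of its gradients, and gradient-descent weight updates.
# The "mse" loss mirrors the post's total-error formula; sizes are illustrative.
import numpy as np
from tensorflow.keras import layers, models, optimizers

x_train = np.random.rand(32, 28, 28, 3)               # toy images
y_train = np.eye(10)[np.random.randint(0, 10, 32)]    # one-hot target probabilities

model = models.Sequential([
    layers.Input(shape=(28, 28, 3)),
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer=optimizers.SGD(learning_rate=0.01), loss="mse")
model.fit(x_train, y_train, epochs=2, batch_size=8)
```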

For queries, feel free to write in the comments 💬 section below. You can connect with me on LinkedIn!

Thank you for reading! Have a great day ahead 😊

Translated from: https://medium.com/analytics-vidhya/architecture-and-training-of-convolutional-neural-networks-7-points-98eef5ef546f
