A Personal Summary of Convolutional Neural Networks (CNNs)

Summary of CNNs:

English source

A CNN (convolutional neural network) is mainly used for image recognition, image classification, object detection, face recognition, and similar tasks.

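  • 1. The input to the network is usually an image, which we represent by its pixels. The usual notation is (h = Height, w = Width, d = Dimension). For example, a color image can be represented as 6*6*3 (the 3 stands for the three RGB channels), and a grayscale image as 4*4*1 (the 1 stands for the single gray channel).
  • [Figure]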
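To make the h*w*d notation concrete, here is a minimal sketch that builds the two example images as nested Python lists. The helper names `make_image` and `shape` are illustrative assumptions, not part of any library.

```python
def make_image(h, w, d, fill=0):
    """Create an h*w*d image as nested lists: image[row][col][channel]."""
    return [[[fill for _ in range(d)] for _ in range(w)] for _ in range(h)]

def shape(image):
    """Return (h, w, d) for a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))

color = make_image(6, 6, 3)   # 6*6*3: three channels (R, G, B) per pixel
gray = make_image(4, 4, 1)    # 4*4*1: one gray channel per pixel
color[0][0] = [255, 0, 0]     # set the top-left pixel to pure red

print(shape(color))  # (6, 6, 3)
print(shape(gray))   # (4, 4, 1)
```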
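  • 2. During training and testing, every image passes through convolutional layers (kernels), pooling layers, and fully connected (FC) layers, and finally through a softmax layer that classifies the object by outputting the probability of each class.
  • [Figure]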
  • 3. Convolution layer: the computation is shown in the figure
  • [Figure]
  • Convolution is the first layer to extract features from an input image. It preserves the spatial relationship between pixels by learning image features from small squares of input data. It is a mathematical operation that takes two inputs: an image matrix and a filter (kernel).
  • Convolving an image with different filters can perform operations such as edge detection, blurring, and sharpening. The example below shows the results of convolving an image with different types of filters (kernels).
  • [Figure]
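As a minimal sketch of the operation described above, the function below slides a 3*3 filter over a 5*5 image matrix with no padding and a stride of 1. The input values and the name `conv2d` are assumptions chosen for illustration; like most CNN libraries, it computes a cross-correlation (the filter is not flipped).

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    return the feature map of element-wise products summed per window."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out

image = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[4, 3, 4], [2, 4, 3], [2, 3, 4]]
```

Swapping in a different kernel (for example one with positive and negative columns) turns the same loop into an edge detector.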

  • 4. Strides

  • The stride is the number of pixels by which the filter shifts over the input matrix. With a stride of 1 we move the filter 1 pixel at a time, with a stride of 2 we move it 2 pixels at a time, and so on. The figure below shows how convolution works with a stride of 2.
  • [Figure]
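The effect of the stride on where the filter lands can be sketched along a single axis. The sizes used here (a 7-pixel input and a 3-pixel filter) are assumed for illustration, and `window_starts` is a hypothetical helper.

```python
def window_starts(input_size, filter_size, stride):
    """Offsets along one axis at which the filter is placed (no padding)."""
    return list(range(0, input_size - filter_size + 1, stride))

print(window_starts(7, 3, 1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, 2))  # [0, 2, 4]

# The number of placements is the output size along that axis.
out_size = len(window_starts(7, 3, 2))
print(out_size)  # 3
```

A larger stride visits fewer positions, so the output feature map shrinks.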
  • 5. Padding
  • Sometimes the filter does not fit the input image perfectly. We have two options:
    • Pad the picture with zeros (zero-padding) so that it fits (in TensorFlow this is the SAME padding mode, which adds zeros around the border)
    • Drop the part of the image where the filter did not fit. This is called valid padding, which keeps only the valid part of the image (in TensorFlow this is the VALID padding mode, which extracts features only from positions where the filter fits)
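The two options can be compared by their output sizes. The sketch below uses the output-size rules TensorFlow documents for its SAME and VALID modes; `output_size` and `zero_pad` are hypothetical helpers, and `zero_pad` pads symmetrically, which is a simplification (SAME padding may add one more row or column on one side when the total padding is odd).

```python
import math

def output_size(input_size, filter_size, stride, padding):
    """Output length along one axis for the two padding modes."""
    if padding == "SAME":
        return math.ceil(input_size / stride)
    if padding == "VALID":
        return (input_size - filter_size) // stride + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")

def zero_pad(image, pad):
    """Surround a 2D list with `pad` rows and columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded

print(output_size(6, 3, 1, "SAME"))   # 6: the output keeps the input size
print(output_size(6, 3, 1, "VALID"))  # 4: border positions are dropped
print(zero_pad([[1, 2], [3, 4]], 1))
# [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
```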
  • 5. Non-Linearity (ReLU)
  • ReLU stands for Rectified Linear Unit, a non-linear operation. Its output is ƒ(x) = max(0, x).
  • Why ReLU is important: its purpose is to introduce non-linearity into the ConvNet. Convolution by itself is a linear operation, while the real-world data we want the ConvNet to learn is generally non-linear.
  • [Figure]
  • Other non-linear functions such as tanh or sigmoid can also be used instead of ReLU. Most practitioners use ReLU because in practice it tends to perform better than the other two.
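A minimal sketch of the three activation functions mentioned above, applied element-wise to a small feature map. The input values and the helper `apply_activation` are assumptions for illustration.

```python
import math

def relu(x):
    """f(x) = max(0, x): negative values are clipped to zero."""
    return max(0.0, x)

def sigmoid(x):
    """Squashes x into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes x into the range (-1, 1)."""
    return math.tanh(x)

def apply_activation(fn, feature_map):
    """Apply an activation function to every element of a 2D feature map."""
    return [[fn(v) for v in row] for row in feature_map]

feature_map = [[-3.0, 2.0], [0.5, -1.0]]
rectified = apply_activation(relu, feature_map)
print(rectified)  # [[0.0, 2.0], [0.5, 0.0]]
```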
  • 6. Pooling Layer
  • Pooling layers reduce the number of parameters when the images are too large. Spatial pooling, also called subsampling or downsampling, reduces the dimensionality of each feature map while retaining the important information. Spatial pooling can be of different types:
    • Max Pooling
    • Average Pooling
    • Sum Pooling
  • Max pooling takes the largest element from each window of the rectified feature map. Average pooling takes the average of the elements in each window instead, and sum pooling takes their sum.
  • [Figure]
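The three pooling types differ only in how each window is reduced to a single number, which the sketch below makes explicit. The 4*4 input, the 2*2 window with stride 2, and the names `pool2d` and `mean` are assumptions for illustration.

```python
def pool2d(feature_map, size, stride, reduce_fn):
    """Reduce each size*size window of the feature map with reduce_fn."""
    out = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(reduce_fn(window))
        out.append(row)
    return out

def mean(values):
    return sum(values) / len(values)

feature_map = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
max_pooled = pool2d(feature_map, 2, 2, max)
avg_pooled = pool2d(feature_map, 2, 2, mean)
sum_pooled = pool2d(feature_map, 2, 2, sum)
print(max_pooled)  # [[6, 8], [3, 4]]
print(avg_pooled)  # [[3.25, 5.25], [2.0, 2.0]]
print(sum_pooled)  # [[13, 21], [8, 8]]
```

In each case the 4*4 map shrinks to 2*2, so later layers see a quarter as many values.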
  • 7. Fully Connected Layer
  • In the layer we call the FC layer, we flatten the matrix into a vector and feed it into a fully connected layer, as in an ordinary neural network.
  • [Figure]
  • In the diagram above, the feature map matrix is converted into a vector (x1, x2, x3, …). The fully connected layers combine these features to create a model. Finally, an activation function such as softmax or sigmoid classifies the output as cat, dog, car, truck, etc.
  • [Figure]
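The flatten, fully connected, and softmax steps can be sketched as follows. The weights, biases, and the three class labels are made-up values for illustration (a trained network would learn the weights), and the function names are assumptions.

```python
import math

def flatten(feature_maps):
    """Concatenate 2D feature maps into one vector (x1, x2, x3, ...)."""
    return [v for fm in feature_maps for row in fm for v in row]

def dense(x, weights, biases):
    """Fully connected layer: one weighted sum of all inputs per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["cat", "dog", "car"]               # hypothetical classes
feature_maps = [[[1.0, 0.0], [0.0, 1.0]]]    # one 2*2 feature map
weights = [                                   # hypothetical, one row per class
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
biases = [0.0, 0.0, 0.0]

x = flatten(feature_maps)
probs = softmax(dense(x, weights, biases))
predicted = labels[probs.index(max(probs))]
print(predicted)  # cat
```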
  • 8. Summary
  • Feed the input image into the convolution layer.
  • Choose the parameters, apply filters with strides, and pad if required. Perform the convolution on the image and apply the ReLU activation to the resulting matrix.
  • Perform pooling to reduce the dimensionality.
  • Add as many convolutional layers as needed until satisfied.
  • Flatten the output and feed it into a fully connected layer (FC layer).
  • Output the class using an activation function (logistic regression with a cost function) to classify the image.
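The steps above can be chained into a single forward pass. This is a rough sketch under stated assumptions: square inputs, one convolution layer, a hand-picked vertical-edge kernel, and made-up FC weights for two hypothetical classes. There is no training here.

```python
import math

def conv2d(image, kernel):
    """Convolution with no padding and stride 1 (square inputs assumed)."""
    k = len(kernel)
    n = len(image) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(n)] for i in range(n)]

def relu(fm):
    return [[max(0, v) for v in row] for row in fm]

def max_pool(fm, size=2, stride=2):
    n = (len(fm) - size) // stride + 1
    return [[max(fm[i * stride + a][j * stride + b]
                 for a in range(size) for b in range(size))
             for j in range(n)] for i in range(n)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel, weights, biases):
    fm = max_pool(relu(conv2d(image, kernel)))   # convolution, ReLU, pooling
    x = [v for row in fm for v in row]           # flatten to a vector
    logits = [sum(w * xi for w, xi in zip(row, x)) + b
              for row, b in zip(weights, biases)]  # fully connected layer
    return softmax(logits)                       # class probabilities

# 6*6 image whose left half is bright and right half is dark.
image = [[1, 1, 1, 0, 0, 0] for _ in range(6)]
kernel = [[1, 0, -1] for _ in range(3)]          # vertical-edge filter
weights = [[0.1] * 4, [-0.1] * 4]                # hypothetical: "edge", "no edge"
biases = [0.0, 0.0]

probs = forward(image, kernel, weights, biases)
print(probs[0] > probs[1])  # True
```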