Build Your Own Model with Convolutional Neural Networks

This article explains how to build a convolutional neural network model, walking through the process from theory to practice to help readers understand and apply CNNs to deep learning tasks.

Building a Convolutional Neural Network Model

A neural network is a series of algorithms that endeavors to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates.


What is a convolutional neural network

A convolutional neural network (CNN) is a type of artificial neural network used in image recognition and processing that is specifically designed to process pixel data. CNNs are powerful image-processing, artificial-intelligence systems that use deep learning to perform both generative and descriptive tasks, often using machine vision that includes image and video recognition, along with recommender systems and natural language processing.

Why we need convolution

  1. Parameter sharing — a feature detector can be used all over the image
  2. Sparsity of connections — each output value depends only on a small number of input values

How to do convolution

fig 1 — Convolution operation

Convolution overlays the filter on the input, multiplies element-wise, and takes the summation.
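As an illustrative sketch (not from the original article), the overlay-and-sum operation can be written in plain NumPy; the 4×4 image and 2×2 filter here are arbitrary example values:

```python
import numpy as np

def conv2d(image, kernel):
    # Overlay the filter at each position, multiply element-wise, and sum.
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
kernel = np.ones((2, 2))                          # toy 2x2 filter
print(conv2d(image, kernel).shape)  # (3, 3)
```

Strictly speaking, what deep learning frameworks call "convolution" is cross-correlation (the filter is not flipped), and that is what this sketch computes too.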

Stride

fig 2 — Stride

The stride value is the number of cells by which we shift the filter to the right to get the next output value.
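A minimal sketch of how the stride affects the output size (assuming no padding; the numbers are illustrative):

```python
def output_size(n, f, stride):
    # An n x n input with an f x f filter shifted `stride` cells at a time
    # produces (n - f) // stride + 1 output values per row.
    return (n - f) // stride + 1

print(output_size(7, 3, 1))  # 5
print(output_size(7, 3, 2))  # 3
```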

Padding

fig 3 — Padding

Padding has two main benefits:

  1. It preserves information at the border of the image.
  2. It lets us use convolution without necessarily shrinking the height and width of the volumes.
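A small NumPy sketch of the second benefit (the sizes are illustrative; `conv_output_size` is a helper name chosen here, not from the article):

```python
import numpy as np

def conv_output_size(n, f, s=1, p=0):
    # General formula: an n x n input, f x f filter, stride s, padding p
    # gives an output of size (n + 2p - f) // s + 1.
    return (n + 2 * p - f) // s + 1

img = np.ones((5, 5))
padded = np.pad(img, 1, mode="constant")  # a border of zeros around the image
print(padded.shape)                 # (7, 7)
print(conv_output_size(5, 3))       # 3: without padding a 3x3 filter shrinks the output
print(conv_output_size(5, 3, p=1))  # 5: "same" padding keeps height and width
```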

Convolution over volume

fig 4 — Convolution over volume

The number of filters determines how many channels the output will have.
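A sketch of this in NumPy (shapes are illustrative): each filter spans all input channels, and stacking one output map per filter gives the output channels.

```python
import numpy as np

x = np.random.rand(6, 6, 3)           # 6x6 input with 3 channels (e.g. RGB)
filters = np.random.rand(4, 3, 3, 3)  # 4 filters, each 3x3 across all 3 channels

oh = ow = 6 - 3 + 1                   # no padding, stride 1
out = np.zeros((oh, ow, len(filters)))
for k in range(len(filters)):
    for i in range(oh):
        for j in range(ow):
            # one filter collapses all input channels into a single number
            out[i, j, k] = np.sum(x[i:i + 3, j:j + 3, :] * filters[k])

print(out.shape)  # (4, 4, 4): output channels == number of filters
```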

Other than convolutional layers, a CNN has pooling layers and activation layers.

Pooling layer

The pooling layer is used to reduce the size of the representations and speed up computation, as well as to make some of the detected features a bit more robust.

There are two main types of pooling layers.

  1. Max pooling — take the maximum value contained in the window
  2. Average pooling — take the average value of the window

fig 5 — Pooling layers

Pooling does not change the number of channels; it only reduces the width and the height.
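A sketch of max pooling in NumPy (a 2×2 window with stride 2, the common default; the input values are illustrative):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    # Pool each channel independently: the channel count is unchanged,
    # only height and width shrink.
    h, w, c = x.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow, c))
    for i in range(oh):
        for j in range(ow):
            window = x[i * stride:i * stride + size,
                       j * stride:j * stride + size, :]
            out[i, j, :] = window.max(axis=(0, 1))
    return out

x = np.arange(32, dtype=float).reshape(4, 4, 2)  # 4x4 input with 2 channels
print(max_pool(x).shape)  # (2, 2, 2)
```

Replacing `window.max(axis=(0, 1))` with `window.mean(axis=(0, 1))` gives average pooling.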

Activation function layer

The purpose of the activation function is to introduce non-linearity into the output of a neuron.


fig 6 — Activation functions
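Two common activation functions, sketched in NumPy (ReLU and sigmoid; the input values are arbitrary):

```python
import numpy as np

def relu(z):
    # Pass positive values through, clamp negatives to zero.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squash any real number into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))       # [0. 0. 3.]
print(sigmoid(0.0))  # 0.5
```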

A simple convolutional neural network consists of 3 main components.

  1. Forward pass
  2. Final-layer loss calculation
  3. Backward pass

The forward pass is the process of computing the output values from the first layer to the last. In the final layer, the loss function is calculated from the output values. The backward pass is the process of computing the derivatives of the loss function and updating the bias and weight values.

fig 7 — Forward and backward pass

fig 8 — Sample network
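The three steps can be illustrated on a single linear neuron with a squared-error loss (all numbers here are made up for the example; a real CNN repeats the same pattern layer by layer):

```python
# One training example and one neuron: y_hat = w * x + b
x, y = 2.0, 2.0   # input and target
w, b = 0.5, 0.0   # initial weight and bias
lr = 0.1          # learning rate

# 1. forward pass: compute the output
y_hat = w * x + b              # 1.0

# 2. final layer: compute the loss from the output
loss = 0.5 * (y_hat - y) ** 2  # 0.5

# 3. backward pass: derivatives of the loss, then update w and b
dw = (y_hat - y) * x           # -2.0
db = y_hat - y                 # -1.0
w -= lr * dw                   # 0.7
b -= lr * db                   # 0.1
```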

Well-known CNN architectures

Classic network: LeNet-5

LeNet-5 uses a 32×32 gray-scale image as its input. It consists of two convolutional layers, each followed by average pooling, and then 2 fully connected layers. Finally, a softmax layer determines the output.

fig 9 — LeNet-5
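Using the output-size formula from the padding section, the height/width flow through LeNet-5 can be checked (the 5×5 filters and 2×2 pooling with stride 2 are the classic LeNet-5 choices, stated here as assumptions since the article gives only the overall structure):

```python
def conv_out(n, f, s=1, p=0):
    # output size of an n x n input with an f x f filter, stride s, padding p
    return (n + 2 * p - f) // s + 1

n = 32                   # 32x32 gray-scale input
n = conv_out(n, 5)       # conv 5x5      -> 28
n = conv_out(n, 2, s=2)  # average pool  -> 14
n = conv_out(n, 5)       # conv 5x5      -> 10
n = conv_out(n, 2, s=2)  # average pool  -> 5
print(n)  # 5: this volume is flattened into the fully connected layers
```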

Classic network: AlexNet

AlexNet uses 227×227 RGB images as its input; a single RGB image consists of 3 channels. It has 5 convolutional layers and 3 max pooling layers, followed by 3 fully connected layers. Finally, a softmax layer determines the output.

fig 10 — AlexNet

Classic network: YOLO

YOLO stands for "you only look once". YOLO takes 448×448 RGB images as its input. The YOLO architecture has many versions; YOLO, YOLOv2, and tiny-YOLO are a few of them. The size of the neural network depends on the version you use.

fig 11 — YOLO

Translated from: https://medium.com/analytics-vidhya/build-your-own-model-with-convolutional-neural-networks-5ca0dd222c8f
