Build Your Own Model with Convolutional Neural Networks

This article explains how to build a convolutional neural network model, walking through the process from theory to practice to help readers understand and apply CNNs to deep learning tasks.

Building a Convolutional Neural Network Model

A neural network is a series of algorithms that endeavors to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates.


What is a convolutional neural network

A convolutional neural network (CNN) is a type of artificial neural network used in image recognition and processing that is specifically designed to process pixel data. CNNs are powerful image-processing, artificial-intelligence systems that use deep learning to perform both generative and descriptive tasks, often using machine vision that includes image and video recognition, along with recommender systems and natural language processing.

Why we need convolution

  1. Parameter sharing — a feature detector can be used all over the image
  2. Sparsity of connections — each output value depends only on a small number of input values

How to do convolution

fig 1 — Convolution operation

Convolution overlays the filter on the input, multiplies element-wise, and takes the summation.
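As an illustrative sketch (not from the original article), the overlay-and-sum operation can be written in plain NumPy; the 4×4 image and 2×2 filter here are arbitrary example values:

```python
import numpy as np

def conv2d(image, kernel):
    # Overlay the filter at each position, multiply element-wise, and sum.
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
kernel = np.ones((2, 2))                          # toy 2x2 filter
print(conv2d(image, kernel).shape)  # (3, 3)
```

Strictly speaking, what deep learning frameworks call "convolution" is cross-correlation (the filter is not flipped), and that is what this sketch computes too.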

Stride

fig 2 — Stride

The stride value is the number of cells by which we shift the filter to the right to get the next output value.
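A minimal sketch of how the stride affects the output size (assuming no padding; the numbers are illustrative):

```python
def output_size(n, f, stride):
    # An n x n input with an f x f filter shifted `stride` cells at a time
    # produces (n - f) // stride + 1 output values per row.
    return (n - f) // stride + 1

print(output_size(7, 3, 1))  # 5
print(output_size(7, 3, 2))  # 3
```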

Padding

fig 3 — Padding

Padding has two main benefits:

  1. It preserves information at the border of the image.
  2. It lets us use convolution without necessarily shrinking the height and width of the volumes.
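A small NumPy sketch of the second benefit (the sizes are illustrative; `conv_output_size` is a helper name chosen here, not from the article):

```python
import numpy as np

def conv_output_size(n, f, s=1, p=0):
    # General formula: an n x n input, f x f filter, stride s, padding p
    # gives an output of size (n + 2p - f) // s + 1.
    return (n + 2 * p - f) // s + 1

img = np.ones((5, 5))
padded = np.pad(img, 1, mode="constant")  # a border of zeros around the image
print(padded.shape)                 # (7, 7)
print(conv_output_size(5, 3))       # 3: without padding a 3x3 filter shrinks the output
print(conv_output_size(5, 3, p=1))  # 5: "same" padding keeps height and width
```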

Convolution over volume

fig 4 — Convolution over volume

The number of filters determines how many channels the output will have.
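A sketch of this in NumPy (shapes are illustrative): each filter spans all input channels, and stacking one output map per filter gives the output channels.

```python
import numpy as np

x = np.random.rand(6, 6, 3)           # 6x6 input with 3 channels (e.g. RGB)
filters = np.random.rand(4, 3, 3, 3)  # 4 filters, each 3x3 across all 3 channels

oh = ow = 6 - 3 + 1                   # no padding, stride 1
out = np.zeros((oh, ow, len(filters)))
for k in range(len(filters)):
    for i in range(oh):
        for j in range(ow):
            # one filter collapses all input channels into a single number
            out[i, j, k] = np.sum(x[i:i + 3, j:j + 3, :] * filters[k])

print(out.shape)  # (4, 4, 4): output channels == number of filters
```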

Other than convolutional layers, a CNN has pooling layers and activation layers.

Pooling layer

The pooling layer is used to reduce the size of the representations and speed up computation, as well as to make some of the detected features a bit more robust.

There are two main types of pooling layers.

  1. Max pooling — take the maximum value contained in the window
  2. Average pooling — take the average value of the window

fig 5 — Pooling layers

Pooling does not change the number of channels; it only reduces the width and the height.
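A sketch of max pooling in NumPy (a 2×2 window with stride 2, the common default; the input values are illustrative):

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    # Pool each channel independently: the channel count is unchanged,
    # only height and width shrink.
    h, w, c = x.shape
    oh, ow = (h - size) // stride + 1, (w - size) // stride + 1
    out = np.zeros((oh, ow, c))
    for i in range(oh):
        for j in range(ow):
            window = x[i * stride:i * stride + size,
                       j * stride:j * stride + size, :]
            out[i, j, :] = window.max(axis=(0, 1))
    return out

x = np.arange(32, dtype=float).reshape(4, 4, 2)  # 4x4 input with 2 channels
print(max_pool(x).shape)  # (2, 2, 2)
```

Replacing `window.max(axis=(0, 1))` with `window.mean(axis=(0, 1))` gives average pooling.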

Activation function layer

The purpose of the activation function is to introduce non-linearity into the output of a neuron.


fig 6 — Activation functions
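Two common activation functions, sketched in NumPy (ReLU and sigmoid; the input values are arbitrary):

```python
import numpy as np

def relu(z):
    # Pass positive values through, clamp negatives to zero.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squash any real number into the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))       # [0. 0. 3.]
print(sigmoid(0.0))  # 0.5
```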

A simple convolutional neural network consists of 3 main components.

  1. Forward pass
  2. Final-layer loss calculation
  3. Backward pass

The forward pass is the process of computing the output values from the first layer to the last. In the final layer, the loss function is calculated from the output values. The backward pass is the process of computing the derivatives of the loss function and updating the bias and weight values.

fig 7 — Forward and backward pass

fig 8 — Sample network
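The three steps can be illustrated on a single linear neuron with a squared-error loss (all numbers here are made up for the example; a real CNN repeats the same pattern layer by layer):

```python
# One training example and one neuron: y_hat = w * x + b
x, y = 2.0, 2.0   # input and target
w, b = 0.5, 0.0   # initial weight and bias
lr = 0.1          # learning rate

# 1. forward pass: compute the output
y_hat = w * x + b              # 1.0

# 2. final layer: compute the loss from the output
loss = 0.5 * (y_hat - y) ** 2  # 0.5

# 3. backward pass: derivatives of the loss, then update w and b
dw = (y_hat - y) * x           # -2.0
db = y_hat - y                 # -1.0
w -= lr * dw                   # 0.7
b -= lr * db                   # 0.1
```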

Well-known CNN architectures

Classic network: LeNet-5

LeNet-5 uses a 32×32 gray-scale image as its input. It consists of two convolutional layers, each followed by average pooling, and then 2 fully connected layers. Finally, a softmax layer determines the output.

fig 9 — LeNet-5
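Using the output-size formula from the padding section, the height/width flow through LeNet-5 can be checked (the 5×5 filters and 2×2 pooling with stride 2 are the classic LeNet-5 choices, stated here as assumptions since the article gives only the overall structure):

```python
def conv_out(n, f, s=1, p=0):
    # output size of an n x n input with an f x f filter, stride s, padding p
    return (n + 2 * p - f) // s + 1

n = 32                   # 32x32 gray-scale input
n = conv_out(n, 5)       # conv 5x5      -> 28
n = conv_out(n, 2, s=2)  # average pool  -> 14
n = conv_out(n, 5)       # conv 5x5      -> 10
n = conv_out(n, 2, s=2)  # average pool  -> 5
print(n)  # 5: this volume is flattened into the fully connected layers
```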

Classic network: AlexNet

AlexNet uses 227×227 RGB images as its input; a single RGB image consists of 3 channels. It has 5 convolutional layers and 3 max pooling layers, followed by 3 fully connected layers. Finally, a softmax layer determines the output.

fig 10 — AlexNet

Classic network: YOLO

YOLO stands for "you only look once". YOLO takes 448×448 RGB images as its input. The YOLO architecture has many versions; YOLO, YOLOv2, and tiny-YOLO are a few of them. The size of the neural network depends on the version you use.

fig 11 — YOLO

Translated from: https://medium.com/analytics-vidhya/build-your-own-model-with-convolutional-neural-networks-5ca0dd222c8f
