Deep Learning 2: Convolutional Neural Networks (with Python and the MXNet Gluon Framework)

Overview of Convolutional Neural Networks

  A convolutional neural network is a deep feedforward neural network characterized by local connectivity and weight sharing.

  Modern convolutional neural networks are typically feedforward networks built by alternately stacking convolutional layers, pooling layers, and fully connected layers, and they are trained with the backpropagation algorithm.

  Convolutional neural networks have three structural properties: local connectivity, weight sharing, and pooling. These properties give the network a degree of invariance to translation, scaling, and rotation. Compared with a fully connected feedforward network, a convolutional neural network has far fewer parameters.
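  As an illustration of this layout, here is a minimal Gluon sketch of such a stack. The channel counts, kernel sizes, and layer widths are arbitrary choices for illustration, not a specific published architecture.

```python
from mxnet import nd
from mxnet.gluon import nn

# A toy CNN in the typical pattern: convolution + pooling blocks followed by
# fully connected layers. All sizes here are illustrative.
net = nn.Sequential()
net.add(
    nn.Conv2D(channels=6, kernel_size=5, activation='relu'),
    nn.MaxPool2D(pool_size=2, strides=2),
    nn.Conv2D(channels=16, kernel_size=5, activation='relu'),
    nn.MaxPool2D(pool_size=2, strides=2),
    nn.Flatten(),
    nn.Dense(120, activation='relu'),
    nn.Dense(10),              # e.g. 10 output classes
)
net.initialize()

# Shapes are inferred on the first forward pass; e.g. for a 28x28 grayscale image:
X = nd.random.uniform(shape=(1, 1, 28, 28))
print(net(X).shape)            # -> (1, 10)
```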

Convolution

One-dimensional convolution: the concept of one-dimensional convolution comes from signal processing, where it is used to compute the delayed accumulation of a signal. Suppose a signal generator produces a signal $x_t$ at each time step $t$, and the information decays at rate $w_k$, i.e. after $k-1$ time steps the information is $w_k$ times its original value.

  For example, suppose $w_1 = 1$, $w_2 = \frac{1}{2}$, $w_3 = \frac{1}{4}$. Then the signal $y_t$ received at time $t$ is the superposition of the information produced at the current time step and the delayed information from earlier time steps:

$$
y_t = w_1 \cdot x_t + w_2 \cdot x_{t-1} + w_3 \cdot x_{t-2} = \sum_{k=1}^{3} w_k x_{t-k+1}.
$$

Filter / convolution kernel: the weight sequence $w_1, w_2, \cdots$ is called a filter (or convolution kernel).

  Suppose the filter has length $K$. Its convolution with a signal sequence $x_1, x_2, \cdots$ is

$$
y_t = \sum_{k=1}^{K} w_k x_{t-k+1}.
$$

The convolution of the signal sequence $\mathbf{x}$ and the filter $\mathbf{w}$ is written as

$$
\mathbf{y} = \mathbf{w} * \mathbf{x}.
$$

In general, the filter length $K$ is much smaller than the signal length $N$. Here is an example of a one-dimensional convolution:

[Figure: a one-dimensional convolution example with the filter $[-1, 0, 1]$]

  The filter is $[-1, 0, 1]$. In the figure, the red lines indicate multiplication by 1, the green lines by 0, and the blue lines by $-1$, which is exactly the filter in reversed order. Multiplying the weights by the inputs and summing gives one output value; for example, the convolution of the first three inputs $[1, 1, 2]$ with the filter is $1 \cdot 1 + 1 \cdot 0 + 2 \cdot (-1) = -1$, and the remaining outputs are obtained in the same way.
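  As a quick check, here is a minimal NumPy sketch of this one-dimensional convolution. The full input sequence from the figure is not reproduced in the text, so the signal below is hypothetical apart from its first three values 1, 1, 2; `np.convolve` flips the kernel internally, which matches the convolution definition above.

```python
import numpy as np

# Filter from the text. np.convolve flips it internally, so this computes
# y_t = sum_k w_k * x_{t-k+1}, i.e. the convolution defined above.
w = np.array([-1, 0, 1])

# Hypothetical signal: only the first three values (1, 1, 2) come from the text.
x = np.array([1, 1, 2, -1, 1, -3, 0])

y = np.convolve(x, w, mode='valid')
print(y)  # first output: 1*1 + 1*0 + 2*(-1) = -1, as in the worked example
```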

Two-dimensional convolution: convolution is also widely used in image processing. Because an image is a two-dimensional structure, the one-dimensional convolution needs to be extended. Given an image $\mathbf{X} \in \mathbb{R}^{M \times N}$ and a filter $\mathbf{W} \in \mathbb{R}^{U \times V}$, where in general $U$ is much smaller than $M$ and $V$ is much smaller than $N$ (analogous to the one-dimensional case, where the filter length $K$ is much smaller than the signal length $N$), their convolution is

$$
y_{ij} = \sum_{u=1}^{U} \sum_{v=1}^{V} w_{uv}\, x_{i-u+1,\, j-v+1}.
$$

The two-dimensional convolution of an input $\mathbf{X}$ and a filter $\mathbf{W}$ is defined as

$$
\mathbf{Y} = \mathbf{X} * \mathbf{W}.
$$

where $*$ denotes the two-dimensional convolution operation. Here is an example of a two-dimensional convolution:

[Figure: a two-dimensional convolution example]
  The input is
$$
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ -1 & 0 & -3 & 0 & 1 \\ 2 & 1 & 1 & -1 & 0 \\ 0 & -1 & 1 & 2 & 1 \\ 1 & 2 & 1 & 1 & 1 \end{bmatrix}
$$
and the filter is
$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}.
$$
Rotating the filter by 180 degrees gives
$$
\begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
$$
The flipped filter is then slid over the input from left to right and top to bottom, and at each position the elementwise weighted sum with the corresponding $3 \times 3$ sub-matrix gives one entry of the two-dimensional convolution result.

  Take the entry in the first row and first column of the result matrix as an example: $0 = 1 \cdot (-1) + 1 \cdot 0 + 1 \cdot 0 + (-1) \cdot 0 + 0 \cdot 0 + (-3) \cdot 0 + 2 \cdot 0 + 1 \cdot 0 + 1 \cdot 1$. Similarly, for the entry in the first row and third column (whose weighting is illustrated in the figure): $-1 = 1 \cdot (-1) + 1 \cdot 0 + 1 \cdot 0 + (-3) \cdot 0 + 0 \cdot 0 + 1 \cdot 0 + 1 \cdot 0 + (-1) \cdot 0 + 0 \cdot 1$. The other entries are computed in the same way.
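  To verify the worked example, here is a minimal NumPy sketch of plain 2D convolution: rotate the kernel by 180 degrees and take elementwise weighted sums over every $3 \times 3$ window. The `conv2d` helper below is written for this post, not a library function.

```python
import numpy as np

def conv2d(X, W):
    """Plain 2D convolution: flip the kernel 180 degrees, then slide it over X."""
    Wf = W[::-1, ::-1]                      # rotate the filter by 180 degrees
    U, V = Wf.shape
    Y = np.zeros((X.shape[0] - U + 1, X.shape[1] - V + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + U, j:j + V] * Wf).sum()
    return Y

# Input and filter from the worked example above.
X = np.array([[ 1,  1,  1,  1,  1],
              [-1,  0, -3,  0,  1],
              [ 2,  1,  1, -1,  0],
              [ 0, -1,  1,  2,  1],
              [ 1,  2,  1,  1,  1]])
W = np.array([[1, 0, 0],
              [0, 0, 0],
              [0, 0, -1]])

Y = conv2d(X, W)
print(Y)   # Y[0, 0] == 0 and Y[0, 2] == -1, as computed in the text
```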

  In the figure below, the blue matrix is the input and the green matrix is the result of applying a $3 \times 3$ kernel; the extra ring of dashed cells around the input indicates zero padding with $p = 1$.
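  A quick way to see the effect of padding is with Gluon's `nn.Conv2D` layer (a small sketch assuming MXNet is installed; the input size is arbitrary). With a $3 \times 3$ kernel, stride 1, and padding $p = 1$, the output keeps the same height and width as the input.

```python
from mxnet import nd
from mxnet.gluon import nn

# A 3x3 convolution with one ring of zero padding (p = 1) and stride 1.
conv = nn.Conv2D(channels=1, kernel_size=3, padding=1)
conv.initialize()

# Gluon expects input of shape (batch, channels, height, width).
X = nd.random.uniform(shape=(1, 1, 5, 5))
Y = conv(X)
print(X.shape, Y.shape)   # with p = 1 the 5x5 spatial size is preserved: (1, 1, 5, 5)
```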

  In image processing, convolution is often an effective way to extract features: the result of applying a convolution to an image is called a feature map. The figure below shows several filters commonly used in image processing together with their corresponding feature maps. The topmost filter is the familiar Gaussian filter, which can be used to smooth and denoise an image; the middle and bottom filters can be used to extract edge features.
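  The specific filters from that figure are not reproduced in the text, so as a stand-in illustration, the sketch below slides a simple horizontal-difference kernel $[1, -1]$ over a synthetic black-and-white image; the response is nonzero only where the pixel value changes, i.e. along the vertical edge. Both the image and the kernel are made up for illustration.

```python
import numpy as np

# Synthetic image: white (1) on the left half, black (0) on the right half.
img = np.ones((6, 8))
img[:, 4:] = 0

# Simple difference kernel: responds where horizontally adjacent pixels differ.
k = np.array([[1, -1]])

out = np.zeros((img.shape[0], img.shape[1] - 1))
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        out[i, j] = (img[i:i + 1, j:j + 2] * k).sum()

print(out)   # nonzero (1) only in the column where white meets black
```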

Cross-Correlation

  Computing a convolution requires flipping the kernel. In practice, implementations usually replace convolution with the cross-correlation operation, which simply slides the kernel over the input from top to bottom and from left to right without flipping it, thereby avoiding some unnecessary operations and overhead.

Cross-correlation: given an image $\mathbf{X} \in \mathbb{R}^{M \times N}$ and a filter $\mathbf{W} \in \mathbb{R}^{U \times V}$, their cross-correlation is

$$
y_{ij} = \sum_{u=1}^{U} \sum_{v=1}^{V} w_{uv}\, x_{i+u-1,\, j+v-1}.
$$
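  Here is a minimal NumPy sketch of the relationship between the two operations: cross-correlation slides the un-flipped kernel over the input, so cross-correlating with the kernel rotated by 180 degrees reproduces the convolution result from the earlier worked example. The `corr2d` helper is written for this post, not a library function.

```python
import numpy as np

def corr2d(X, W):
    """2D cross-correlation: slide W over X without flipping it."""
    U, V = W.shape
    Y = np.zeros((X.shape[0] - U + 1, X.shape[1] - V + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + U, j:j + V] * W).sum()
    return Y

# Input and filter from the 2D convolution example above.
X = np.array([[ 1,  1,  1,  1,  1],
              [-1,  0, -3,  0,  1],
              [ 2,  1,  1, -1,  0],
              [ 0, -1,  1,  2,  1],
              [ 1,  2,  1,  1,  1]])
W = np.array([[1, 0, 0],
              [0, 0, 0],
              [0, 0, -1]])

# Cross-correlation with the 180-degree-rotated kernel equals the convolution,
# which is why libraries can use cross-correlation in place of convolution.
print(corr2d(X, W[::-1, ::-1]))
```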
