Convolutional Neural Network Basics
The role of the ConvNet is to reduce the images into a form which is easier to process, without losing features which are critical for getting a good prediction.
Today we'll take a broad look at three kinds of layers: the convolution layer, the pooling layer, and the fully connected layer.
1. Convolution Layer
1.1 Parameters:
Kernel Size, Stride, Padding, Input & Output Channels
- Kernel/Filter
As in the animation above: the Kernel/Filter here is a 3x3x1 matrix, and a different kernel is chosen for each channel.
Kernel/Filter, K =
    1  0  0
    1 -1 -1
    1  0 -1
Kernel size:
Kernels are usually odd-sized (3x3, 5x5, 7x7, 9x9) for symmetry, so the kernel has a center cell.
They are typically small but stacked deep.
- Stride
The kernel above has stride length = 1 (non-strided); with stride = 2 it looks like the figure below:
(blue is the input, the shaded area is the kernel, cyan is the output)
- Padding
For animated diagrams of the various padding options, see here.
a) Same Padding
(图:SAME padding: 5x5x1 image is padded with 0s to create a 6x6x1 image)
When we augment the 5x5x1 image into a 6x6x1 image and then apply the 3x3x1 kernel over it, we find that the convolved matrix turns out to be of dimensions 5x5x1. Hence the name — Same Padding.
b) Valid Padding
If we perform the same operation without padding, we are presented with a matrix which has the dimensions of the kernel (3x3x1) itself — Valid Padding.
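Both padding modes can be sanity-checked with the standard output-size formula (a minimal sketch; the helper name `conv_output_size` is my own, and it assumes symmetric padding of P cells on each side, whereas the figure above pads the 5x5 image asymmetrically to 6x6):

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution: floor((W - K + 2P) / S) + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

# Valid padding: a 5x5 input with a 3x3 kernel gives a 3x3 output
print(conv_output_size(5, 3, padding=0))   # 3
# Same padding (symmetric, padding=1): the 5x5 spatial size is preserved
print(conv_output_size(5, 3, padding=1))   # 5
```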
1.2 Do the math:
· multiply, then add up:
Simply put (see the figure below): each cell in the yellow grid is multiplied by the corresponding cell in the green grid, and the products are summed up; the result is the convolved feature.
(图:Convoluting a 5x5x1 image with a 3x3x1 kernel to get a 3x3x1 convolved feature)
For details, see 2.2 Convolution Matrix on this page.
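The multiply-then-add step can be sketched in plain NumPy (a toy illustration; the image and kernel values are made up, not the ones in the figure):

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    """Valid (no-padding) 2D convolution: slide the kernel, multiply
    element-wise with the window underneath it, then sum."""
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(window * kernel)   # multiply, then add up
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.ones((3, 3))                           # toy 3x3 kernel
print(convolve2d(image, kernel).shape)             # (3, 3)
```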
2. Transposed Convolutions
Also called deconvolutions or fractionally strided convolutions.
Plenty of people resent the name "deconvolution", as the figure shows:
For why it is called deconvolution, see Zeiler.
2.1 Why use it?
The need for up-sampling: for example, going from a low-resolution image to a high-resolution one. Transposed convolutions are a good way to do this.
Reference
2.2 Intuition:
- Convolution Operation
A quick review of the convolution operation:
input: 4x4
stride: 1
kernel: 3x3
padding: none
output: 2x2
Here, 9 numbers are reduced to 1 number: the convolution is a many-to-one mapping.
- Going Backward
Now suppose we want:
input: 2x2
output: 4x4
That is, 1 number becomes 9 numbers, a one-to-many mapping.
First we need to understand the Convolution Matrix and the Transposed Convolution Matrix:
- Convolution Matrix
As in the figure below, reshape the 3x3 kernel into a 4x16 matrix, and flatten the 4x4 input matrix into a 16x1 column vector.
Then matrix-multiply the 4x16 convolution matrix with the 16x1 input (a 16-dimensional column vector) to get a 4x1 matrix;
reshaping that 4x1 matrix gives the 2x2 output matrix:
Why build a Convolution Matrix?
With the convolution matrix, you can go from 16 (4x4) to 4 (2x2) because the convolution matrix is 4x16. Then, if you have a 16x4 matrix, you can go from 4 (2x2) to 16 (4x4).
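This unrolling can be checked numerically. Below is a sketch that builds a 4x16 convolution matrix from a toy 3x3 kernel (the helper `conv_as_matrix` and the kernel values are my own, assuming a 4x4 input with no padding and stride 1):

```python
import numpy as np

def conv_as_matrix(kernel, input_size=4):
    """Unroll a square kernel into a convolution matrix C such that
    C @ x.flatten() equals the valid (no-padding, stride-1)
    convolution of an input_size x input_size image x."""
    k = kernel.shape[0]
    out_size = input_size - k + 1          # 2 for a 4x4 input, 3x3 kernel
    C = np.zeros((out_size * out_size, input_size * input_size))
    for i in range(out_size):              # output row
        for j in range(out_size):          # output column
            for a in range(k):
                for b in range(k):
                    # output position (i, j) reads input pixel (i+a, j+b)
                    C[i * out_size + j, (i + a) * input_size + (j + b)] = kernel[a, b]
    return C

kernel = np.arange(1, 10, dtype=float).reshape(3, 3)   # toy 3x3 kernel
x = np.arange(16, dtype=float).reshape(4, 4)           # toy 4x4 input
C = conv_as_matrix(kernel)                             # shape (4, 16)
y = (C @ x.flatten()).reshape(2, 2)                    # 4x1 result -> 2x2
print(C.shape, y.shape)                                # (4, 16) (2, 2)
```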
- Transposed Convolution Matrix
If we want:
input: 2x2
output: 4x4
We need a 16x4 matrix, and it must preserve the relationship of 1 number mapping to 9 numbers:
- Transpose the convolution matrix C (4x16) to CT (16x4).
- Matrix-multiply CT (16x4) with a column vector (4x1) to generate an output matrix (16x1)
- The transposed matrix connects 1 value to 9 values in the output.
Reshaping the result gives the 4x4 matrix:
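The three steps above can be verified with a small NumPy sketch (the kernel values are arbitrary, not the ones in the figure):

```python
import numpy as np

# Build a toy 4x16 convolution matrix C for a 3x3 kernel over a 4x4 input
# (rows = the 2x2 output positions, columns = the 16 input pixels).
kernel = np.arange(1, 10, dtype=float).reshape(3, 3)
C = np.zeros((4, 16))
for i in range(2):
    for j in range(2):
        for a in range(3):
            for b in range(3):
                C[i * 2 + j, (i + a) * 4 + (j + b)] = kernel[a, b]

v = np.array([1.0, 2.0, 3.0, 4.0])    # a 2x2 input, flattened to 4x1
up = (C.T @ v).reshape(4, 4)          # 16x4 times 4x1 -> 16x1 -> 4x4
print(up)

# Each row of C (column of C.T) has 9 nonzero weights, so each of the
# 4 input values is spread over 9 output cells: the 1-to-9 relationship.
print(np.count_nonzero(C[0]))         # 9
```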
Note: the actual weight values in the matrix do not have to come from the original convolution matrix. What's important is that the weight layout is the transpose of that of the convolution matrix. here
In TensorFlow (1.x argument names), this operation is:
tf.nn.conv2d_transpose(
    value,              # input tensor, shape [batch, height, width, in_channels] for 'NHWC'
    filter,             # kernel tensor, shape [height, width, output_channels, in_channels]
    output_shape,       # 1-D tensor giving the desired output shape
    strides,            # stride of the sliding window for each dimension of the input
    padding='SAME',     # 'SAME' or 'VALID'
    data_format='NHWC',
    name=None
)
3. Pooling Layer
In all cases, pooling helps to make the representation become approximately invariant to small translations of the input. Invariance to translation means that if we translate the input by a small amount, the values of most of the pooled outputs do not change. — Page 342, Deep Learning, 2016.
Put plainly, pooling simplifies things; it amounts to downsampling:
Benefit 1: it decreases the computational power required to process the data, through dimensionality reduction.
Benefit 2: it is useful for extracting dominant features that are rotationally and positionally invariant, which keeps training of the model effective.
Two kinds of pooling:
3.1 Max pooling
Literally what it sounds like: take max() over each window, as in the figure.
3.2 Average pooling
Literally what it sounds like: take average() over each window, as in the figure.
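Both pooling operations can be sketched with a single NumPy helper (a toy example; the function `pool2d` and the input values are my own):

```python
import numpy as np

def pool2d(x, size=2, stride=2, op=np.max):
    """Apply op (np.max or np.mean) over each (size x size) window."""
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = op(x[i*stride:i*stride+size, j*stride:j*stride+size])
    return out

x = np.array([[1., 3., 2., 4.],
              [5., 7., 6., 8.],
              [9., 2., 1., 0.],
              [3., 4., 5., 6.]])
print(pool2d(x, op=np.max))    # max pooling:     [[7. 8.] [9. 6.]]
print(pool2d(x, op=np.mean))   # average pooling: [[4. 5.] [4.5 3.]]
```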
4. Fully Connected Layer
Basically, a FC layer looks at what high level features most strongly correlate to a particular class and has particular weights so that when you compute the products between the weights and the previous layer, you get the correct probabilities for the different classes.
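A minimal sketch of what such a layer computes: flatten the features, multiply by a weight matrix, add a bias, and squash the scores through softmax to get class probabilities (all names and values below are made up for illustration):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))    # shift by max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
features = rng.standard_normal(16)     # flattened high-level features
W = rng.standard_normal((3, 16))       # one weight row per class
b = np.zeros(3)

probs = softmax(W @ features + b)      # scores -> class probabilities
print(probs.sum())                     # sums to 1 (up to floating point)
```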
5. CNN Architectures
5.1 Classic network architectures
- LeNet-5
See: Gradient-based learning applied to document recognition
- AlexNet
See: ImageNet Classification with Deep Convolutional Neural Networks
- VGG 16
See: Very Deep Convolutional Networks for Large-Scale Image Recognition