Convolution Neural Network

klaas

Category: Deep Learning. Tags: CNN, arithmetic

Article link: https://blog.csdn.net/klaas/article/details/51817845

Introduction
- Discrete convolutions
  - Features
  - denotations
- Pooling
  - denotations
Convolution arithmetic
Pooling arithmetic
Transposed convolution arithmetic

Introduction

Convolutional neural networks (CNNs) have recently gained popularity because of their effectiveness in computer vision. Although CNNs were used as early as the nineties to solve character recognition tasks (Le Cun et al., 1997), their current widespread application is due to much more recent work, when a deep CNN was used to beat the state of the art in the ImageNet image classification challenge (Krizhevsky et al., 2012).

Differences from a multi-layer perceptron:
1. A convolutional layer's output shape is affected by the shape of its input as well as the choice of kernel shape, zero padding and strides.
2. CNNs also usually feature a pooling stage.
3. Transposed convolutional layers (also known as fractionally strided convolutional layers) have been employed in more and more recent work.

Discrete convolutions

Features

The data that CNNs operate on (images, sound clips and similar) share these properties:
- They are stored as multi-dimensional arrays.
- They feature one or more axes for which ordering matters (e.g., width and height axes for an image, time axis for a sound clip).
- One axis, called the channel axis, is used to access different views of the data.

A discrete convolution is a linear transformation that preserves this ordering. It is sparse (only a few input units contribute to a given output unit) and reuses parameters (the same weights are applied to multiple locations in the input).

denotations

$i_j$ : input size along axis $j$ ,
$k_j$ : kernel size along axis $j$ ,
$s_j$ : stride (distance between two consecutive positions of the kernel) along axis $j$ ,
$p_j$ : zero padding (number of zeros concatenated at the beginning and at the end of an axis) along axis $j$ .
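
To make the notation concrete, here is a minimal pure-Python sketch of a single-channel 2-D convolution on a square input with a square kernel. The function name `conv2d` and its arguments are hypothetical and only illustrate the roles of input size, kernel size, stride and zero padding. Like most deep learning libraries, the sketch does not flip the kernel (strictly speaking it computes a cross-correlation).

```python
def conv2d(x, w, stride=1, padding=0):
    """Naive single-channel 2-D convolution on nested lists.

    Assumes a square input x (i x i) and a square kernel w (k x k),
    with the same stride and zero padding along both axes.
    """
    i, k = len(x), len(w)

    # Zero padding: `padding` zeros at the beginning and end of each axis.
    size = i + 2 * padding
    padded = [[0.0] * size for _ in range(size)]
    for r in range(i):
        for c in range(i):
            padded[r + padding][c + padding] = x[r][c]

    # Number of kernel positions that fit along one axis.
    o = (size - k) // stride + 1

    out = [[0.0] * o for _ in range(o)]
    for r in range(o):
        for c in range(o):
            # Each output unit depends only on a k x k window (sparsity),
            # and every window is weighted by the same kernel (parameter reuse).
            out[r][c] = sum(
                padded[r * stride + a][c * stride + b] * w[a][b]
                for a in range(k)
                for b in range(k)
            )
    return out


x = [[float(5 * r + c) for c in range(5)] for r in range(5)]  # 5 x 5 input
w = [[1.0] * 3 for _ in range(3)]                             # 3 x 3 kernel of ones
y = conv2d(x, w)                                              # 3 x 3 output
y_strided = conv2d(x, w, stride=2, padding=1)                 # also 3 x 3
```

The nested loops are deliberately slow and explicit; a real implementation would vectorize the window sums.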

Pooling

Pooling operations reduce the size of feature maps by using some function to summarize subregions, such as taking the average or the maximum value.

Difference from discrete convolution: pooling replaces the linear combination described by the kernel with some other function.

denotations

$i_j$ : input size along axis j,
$k_j$ : pooling window size along axis j,
$s_j$ : stride (distance between two consecutive positions of the pooling window) along axis j.
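
As an illustration, pooling can be sketched in pure Python as a sliding window whose contents are summarized by an arbitrary function. The names `pool2d`, `summarize` and `mean` are hypothetical, and the sketch assumes a square input and a square window.

```python
def mean(values):
    """Average of a list of numbers."""
    return sum(values) / len(values)


def pool2d(x, k, stride, summarize=max):
    """Naive 2-D pooling on a square nested-list input.

    `summarize` maps the list of values in each k x k window to one number,
    e.g. max for max pooling or mean for average pooling.
    """
    i = len(x)
    o = (i - k) // stride + 1  # number of window positions along one axis
    return [
        [
            summarize(
                [x[r * stride + a][c * stride + b] for a in range(k) for b in range(k)]
            )
            for c in range(o)
        ]
        for r in range(o)
    ]


x = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(pool2d(x, k=2, stride=2))                  # [[6, 8], [14, 16]]
print(pool2d(x, k=2, stride=2, summarize=mean))  # [[3.5, 5.5], [11.5, 13.5]]
```

Only the `summarize` argument changes between max and average pooling; the window geometry is the same.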

Convolution arithmetic

The discussion is based on, but not limited to, the following constraints:

2-D discrete convolutions ( $N = 2$ ),
square inputs ( $i_1 = i_2 = i$ ),
square kernel size ( $k_1 = k_2 = k$ ),
same strides along both axes ( $s_1 = s_2 = s$ ),
same zero padding along both axes ( $p_1 = p_2 = p$ ).

No zero padding, unit strides

Relationship: $o = (i - k) + 1$

Zero padding, unit strides

Relationship: $o = (i-k) + 2p + 1$
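
As a quick check of the two unit-stride relationships, with arbitrarily chosen values:

Without padding, $i = 4$ and $k = 3$ give $o = (4 - 3) + 1 = 2$.

With padding, $i = 5$, $k = 4$ and $p = 2$ give $o = (5 - 4) + 2 \cdot 2 + 1 = 6$.

Setting $p = 0$ in the second relationship recovers the first.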

Half(same) padding

Sometimes, we just want the output size to be the same as the input size.

Relationship: For any $i$ and for odd $k$ ( $k = 2n + 1, n \in \mathbb{N}$ ), $s = 1$ and $p = \lfloor k/2 \rfloor = n$ ,

$o = i + 2\lfloor k/2 \rfloor - (k - 1) = i + 2n - 2n = i$

Full padding

Sometimes, we just want to take every possible partial or complete superimposition of the kernel on the input feature map into consideration.

Relationship: For any $i$ and $k$ , and for $p = k-1$ and $s= 1$ ,

$o = i + 2(k-1) - (k-1) = i + k - 1$
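
The half and full padding choices can be checked numerically. The sketch below is illustrative only; the function names are hypothetical and not from any library.

```python
def output_size_unit_stride(i, k, p):
    """o = (i - k) + 2p + 1, valid for stride s = 1."""
    return (i - k) + 2 * p + 1


def half_padding(k):
    """p = floor(k / 2). Gives o = i when k is odd (and o = i + 1 when k is even)."""
    return k // 2


def full_padding(k):
    """p = k - 1. Gives o = i + k - 1."""
    return k - 1


i, k = 5, 3
same = output_size_unit_stride(i, k, half_padding(k))
full = output_size_unit_stride(i, k, full_padding(k))
print(same, full)  # 5 7
```

The even-kernel case is why half padding is usually paired with odd kernel sizes: with $k$ even the output would be one unit larger than the input.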

No zero padding, non-unit strides

The kernel now moves across the input in steps of $s$ instead of one unit at a time.

Relationship: For any $i, k$ and $s$ , and for $p = 0$ ,

$o = \left\lfloor \frac{i - k}{s} \right\rfloor + 1$

Zero padding, non-unit strides

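With non-unit strides, the last possible step of the kernel may not coincide with the end of the input, so some input units at the border may not be covered by the kernel. The floor function accounts for this.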

Relationship: For any $i$, $k$, $p$ and $s$ ,

$o = \left\lfloor \frac{i + 2p - k}{s} \right\rfloor + 1$
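
The general relationship translates directly into a small helper. This is a sketch under the square-input, square-kernel assumptions used in this post; `conv_output_size` is a hypothetical name.

```python
def conv_output_size(i, k, s=1, p=0):
    """o = floor((i + 2p - k) / s) + 1 for input size i, kernel k, stride s, padding p."""
    if k > i + 2 * p:
        raise ValueError("kernel does not fit inside the padded input")
    return (i + 2 * p - k) // s + 1  # integer division implements the floor


# Two different input sizes can share the same output size because of the floor.
print(conv_output_size(5, 3, s=2, p=1), conv_output_size(6, 3, s=2, p=1))  # 3 3
```

With $s = 1$ the floor has no effect and the helper reduces to the unit-stride relationships.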

Pooling arithmetic

Pooling as described here involves no zero padding, so its output size follows the same rule as a convolution without padding, and the relationship holds for any type of pooling.

Relationship: For any $i$, $k$ and $s$ ,

$o = \left\lfloor \frac{i - k}{s} \right\rfloor + 1$
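
For example, with arbitrarily chosen values $i = 5$, $k = 3$ and $s = 2$, the pooling window fits at two positions along each axis:

$o = \left\lfloor \frac{5 - 3}{2} \right\rfloor + 1 = 2$

With $i = 6$ and the same window and stride, the floor still gives $o = 2$, and the last row and column of the input are left out.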

Transposed convolution arithmetic

Intuition: Transposed convolutions arise from the desire to apply a transformation in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input.
Usage: As the decoding layer of a convolutional autoencoder, or to project feature maps to a higher-dimensional space.

Its arithmetic is similar to that of a direct convolution, but with some changes.
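
This post does not spell out the transposed relationships. As a hedged sketch, the general formula given in the referenced guide, $o' = s(i' - 1) + a + k - 2p$ with $a = (i + 2p - k) \bmod s$, can be checked against the direct convolution as follows. The function names are hypothetical, and the formula is taken from the reference rather than from this post.

```python
def conv_output_size(i, k, s=1, p=0):
    """Direct convolution: o = floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def transposed_conv_output_size(i_t, k, s=1, p=0, a=0):
    """Output size of the transposed convolution associated with a
    convolution described by kernel k, stride s and padding p.

    i_t is the transposed convolution's input size (the direct output size).
    a = (i + 2p - k) mod s disambiguates direct inputs sharing one output size.
    """
    return s * (i_t - 1) + a + k - 2 * p


# Round trip: the transposed convolution recovers the direct input size.
for i in (5, 6):
    k, s, p = 3, 2, 1
    o = conv_output_size(i, k, s, p)
    a = (i + 2 * p - k) % s
    assert transposed_conv_output_size(o, k, s, p, a) == i
```

The round trip only recovers the shape of the input, not its values.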

Reference: http://arxiv.org/pdf/1603.07285v1.pdf
