Reading Note: MobileNets V2

TITLE: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation

AUTHOR: Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen

ASSOCIATION: Google

FROM: arXiv:1801.04381

CONTRIBUTION

  1. The main contribution is a novel layer module: the inverted residual with linear bottleneck.

METHOD

BUILDING BLOCKS

Depthwise Separable Convolutions. The basic idea is to replace a full convolutional operator with a factorized version that splits convolution into two separate layers. The first layer, called a depthwise convolution, performs lightweight filtering by applying a single convolutional filter per input channel. The second layer is a 1×1 convolution, called a pointwise convolution, which builds new features by computing linear combinations of the input channels.
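
To make the factorization concrete, here is a minimal sketch in PyTorch (my own illustration, not code from the paper; the class name, the use of ReLU6, and the BatchNorm placement are assumptions following common practice). For a 3×3 kernel, the factorization reduces the computational cost by a factor of roughly 8–9 compared to a full convolution, at only a small cost in accuracy.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """3x3 depthwise convolution followed by a 1x1 pointwise convolution."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: groups=in_ch gives one 3x3 filter per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        # Pointwise: 1x1 convolution builds new features as linear
        # combinations of the input channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU6(inplace=True)

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))

# Example: 32 -> 64 channels on a 56x56 feature map.
y = DepthwiseSeparableConv(32, 64)(torch.randn(1, 32, 56, 56))
```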

Linear Bottlenecks. It has long been assumed that the manifolds of interest in neural networks can be embedded in low-dimensional subspaces. Two properties are indicative of the requirement that the manifold of interest should lie in a low-dimensional subspace of the higher-dimensional activation space:

  1. If the manifold of interest remains non-zero volume after the ReLU transformation, it corresponds to a linear transformation.
  2. ReLU is capable of preserving complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space.

Assuming the manifold of interest is low-dimensional, we can capture this by inserting linear bottleneck layers into the convolutional blocks.
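
A quick numerical illustration of the second property, loosely following the paper's spiral experiment (the setup, sizes, and error metric below are my own choices, and NumPy is assumed): embed 2-D points into n dimensions with a random matrix, apply ReLU there, project back with the pseudo-inverse, and observe that the information loss generally shrinks as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1000))   # points in a 2-D subspace ("manifold")

for n in (3, 15, 30):
    T = rng.standard_normal((n, 2))   # random embedding into R^n
    y = np.maximum(T @ x, 0.0)        # ReLU in the n-dimensional space
    x_rec = np.linalg.pinv(T) @ y     # project back to 2-D
    # ReLU roughly halves each response, so compare up to a best-fit scale.
    alpha = np.sum(x * x_rec) / np.sum(x_rec * x_rec)
    err = np.linalg.norm(x - alpha * x_rec) / np.linalg.norm(x)
    print(f"n={n:2d}  relative error (up to scale): {err:.3f}")
```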

Inverted Residuals. In classical residual networks, the bottleneck layers are treated as low-dimensional supplements to high-dimensional “information” tensors. Here, inspired by the intuition that the bottlenecks actually contain all the necessary information, while the expansion layer acts merely as an implementation detail accompanying a non-linear transformation of the tensor, shortcuts are placed directly between the bottlenecks.

The following figure shows the inverted residual block. The diagonally hatched texture indicates layers that do not contain non-linearities. This design provides a natural separation between the input/output domains of the building blocks (the bottleneck layers) and the layer transformation, a non-linear function that converts the input to the output. The former can be seen as the capacity of the network at each layer, and the latter as its expressiveness.


Inverted Residuals

The following table gives the basic implementation structure.

Bottleneck residual block, transforming from k to k′ channels, with stride s and expansion factor t:

  Input               | Operator                    | Output
  h × w × k           | 1×1 conv2d, ReLU6           | h × w × (tk)
  h × w × tk          | 3×3 dwise, stride s, ReLU6  | (h/s) × (w/s) × (tk)
  (h/s) × (w/s) × tk  | linear 1×1 conv2d           | (h/s) × (w/s) × k′
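
Put together, the block can be sketched as follows (assuming PyTorch; this is a common reading of the table above, not the authors' reference code, and the BatchNorm placement is an assumption). Note that the shortcut connects the thin bottleneck tensors, and the final 1×1 projection has no activation, which is exactly the linear bottleneck.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Expand (1x1) -> depthwise (3x3) -> linear project (1x1)."""
    def __init__(self, in_ch, out_ch, stride, expand_ratio):
        super().__init__()
        hidden = in_ch * expand_ratio
        # Shortcut only when input and output shapes match.
        self.use_res = stride == 1 and in_ch == out_ch
        layers = []
        if expand_ratio != 1:
            # 1x1 expansion: lift into the high-dimensional space, with ReLU6.
            layers += [nn.Conv2d(in_ch, hidden, 1, bias=False),
                       nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True)]
        layers += [
            # 3x3 depthwise convolution (stride s), still high-dimensional.
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            # 1x1 linear projection back to the bottleneck: no non-linearity.
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        ]
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.block(x) if self.use_res else self.block(x)
```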

ARCHITECTURE

Architecture

PERFORMANCE

Classification

Object Detection

Semantic Segmentation
