[paper] 00036-Xception: Deep Learning with Depthwise Separable Convolutions

六半

Category: Papers. Tags: Xception, Depthwise separable convolutio

Article link: https://blog.csdn.net/u013659598/article/details/102714851

Author: Francois Chollet, Google (author of Keras, Google Brain)

Key Words:

Depthwise Separable Convolutions

1. Introduction

1.1 The Inception Hypothesis

A single convolution kernel is tasked with simultaneously mapping cross-channel correlations and spatial correlations.
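To make the cost of that joint mapping concrete, here is a rough sketch that counts the weights of a regular convolution against those of a depthwise separable one, which handles the spatial and cross-channel parts in two separate steps. The function names and the layer sizes (3x3 kernel, 128 input channels, 256 output channels) are illustrative assumptions, not figures from the paper, and biases are ignored.

```python
def conv_params(k, c_in, c_out):
    """Weights in a regular k x k convolution (bias omitted).

    Every output channel has its own k x k filter over every input channel,
    so spatial and cross-channel correlations are mapped jointly.
    """
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k spatial filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
regular = conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print("regular:", regular)
print("separable:", separable)
print("ratio:", round(regular / separable, 2))
```

With these sizes the separable version needs several times fewer weights, because the k x k factor multiplies only the input channels instead of every input/output channel pair.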

1.2 The continuum between convolutions and separable convolutions

Two minor differences between an “extreme” version of an Inception module and a depthwise separable convolution would be:

The order of the operations: depthwise separable convolutions as usually implemented (e.g. in TensorFlow) perform first channel-wise spatial convolution and then perform 1x1 convolution, whereas Inception performs the 1x1 convolution first.
The presence or absence of a non-linearity after the first operation. In Inception, both operations are followed by a ReLU non-linearity; depthwise separable convolutions, however, are usually implemented without non-linearities.
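The two differences listed above can be sketched in plain Python on a tiny feature map. This is a toy illustration under stated assumptions, not the TensorFlow or Inception implementation: feature maps are nested lists indexed as x[channel][row][col], convolutions use stride 1 with no padding, and all function names and example values are hypothetical.

```python
def relu(x):
    """Element-wise ReLU over a feature map x[channel][row][col]."""
    return [[[max(0.0, v) for v in row] for row in ch] for ch in x]


def depthwise(x, kernels):
    """Convolve each channel with its own k x k kernel (stride 1, no padding)."""
    out = []
    for ch, ker in zip(x, kernels):
        k = len(ker)
        h, w = len(ch) - k + 1, len(ch[0]) - k + 1
        out.append([[sum(ch[i + a][j + b] * ker[a][b]
                         for a in range(k) for b in range(k))
                     for j in range(w)] for i in range(h)])
    return out


def pointwise(x, weights):
    """1x1 convolution: weights[o][c] mixes input channel c into output channel o."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wt[c] * x[c][i][j] for c in range(len(x)))
              for j in range(w)] for i in range(h)] for wt in weights]


def separable_conv(x, dw_kernels, pw_weights):
    """Usual ordering: depthwise first, then 1x1, with no non-linearity in between."""
    return pointwise(depthwise(x, dw_kernels), pw_weights)


def extreme_inception(x, pw_weights, dw_kernels):
    """'Extreme' Inception ordering: 1x1 first, then per-channel spatial conv, ReLU after each."""
    return relu(depthwise(relu(pointwise(x, pw_weights)), dw_kernels))


# Hypothetical 2-channel 3x3 input, 2x2 depthwise kernels, 2x2 channel-mixing matrix.
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[1, 0, -1], [0, 1, 0], [-1, 0, 1]]]
dw = [[[1, 0], [0, -1]], [[1, 1], [1, 1]]]
pw = [[1, -1], [0.5, 0.5]]

print(separable_conv(x, dw, pw))
print(extreme_inception(x, pw, dw))
```

The same weights give different outputs under the two orderings, which is all the sketch is meant to show; it says nothing about which variant trains better.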

2. Prior work

Convolutional neural networks, in particular VGG-16
Inception architecture
Depthwise separable convolutions
Residual connections

3. The Xception architecture

We make the following hypothesis: that the mapping of cross-channels correlations and spatial correlations in the feature maps of convolutional neural networks can be entirely decoupled.

In short, the Xception architecture is a linear stack of depthwise separable convolution layers with residual connections.
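The "linear stack with residual connections" structure can be sketched as follows. This is only a structural toy: feature maps are flat lists of floats, the `scale` layers are hypothetical stand-ins for depthwise separable convolutions, and every shortcut is assumed to be a plain identity connection, which is a simplification made here for illustration.

```python
def residual_block(x, layers):
    """Apply `layers` in sequence, then add the block input back (identity shortcut)."""
    y = x
    for layer in layers:
        y = layer(y)
    return [a + b for a, b in zip(x, y)]


def linear_stack(x, blocks):
    """Chain residual blocks one after another, with no branching between them."""
    for layers in blocks:
        x = residual_block(x, layers)
    return x


def relu(v):
    return [max(0.0, a) for a in v]


def scale(factor):
    """Hypothetical stand-in for a separable convolution layer: multiplies by a constant."""
    return lambda v: [factor * a for a in v]


# Two toy blocks, each a ReLU followed by a stand-in layer.
blocks = [[relu, scale(0.5)], [relu, scale(-1.0)]]
out = linear_stack([2.0, -4.0], blocks)
print(out)
```

Because the blocks are simply chained, adding or removing one changes a single entry of the `blocks` list and nothing else.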

4. Experimental evaluation

4.1 The JFT dataset

JFT is an internal Google dataset for large-scale image classification, first introduced by Hinton et al. in [5], which comprises over 350 million high-resolution images annotated with labels from a set of 17,000 classes. To evaluate the performance of a model trained on JFT, we use an auxiliary dataset, FastEval14k.

4.2 Optimization configuration

(Skipped in these notes.)

4.3 Regularization configuration

(Skipped in these notes.)

4.4 Training infrastructure

4.5 Comparison with Inception V3

4.5.1 Classification performance