https://arxiv.org/pdf/1704.04861.pdf
- Abstract
	- MobileNet is a streamlined architecture built on depthwise separable convolutions
	- Two global hyperparameters trade off latency against accuracy: the width multiplier and the resolution multiplier
- prior work
- Developers can pick a small model that fits their resource constraints
- MobileNets primarily focus on optimizing for latency but also yield small networks.
- Small models:
- MobileNets are built primarily from depthwise separable convolutions initially introduced in [26] and subsequently used in Inception models [13] to reduce the computation in the first few layers.
- The model is built on depthwise separable convolutions, which were later used in Inception models to reduce computation in the first few layers
- Flattened networks [16] build a network out of fully factorized convolutions and showed the potential of extremely factorized networks. Independent of this current paper, Factorized Networks[34] introduces a similar factorized convolution as well as the use of topological connections.
- the Xception network [3] demonstrated how to scale up depthwise separable filters to outperform Inception V3 networks.
- Squeezenet [12] which uses a bottleneck approach to design a very small network.
- Other reduced computation networks include structured transform networks [28] and deep fried convnets [37].
- MobileNet architecture
- depthwise separable filters
- depthwise separable convolution is a form of factorized convolutions which factorize a standard convolution into a depthwise convolution and a 1×1 convolution called a pointwise convolution.
- a depthwise convolution
- Each input channel gets exactly one filter; a standard convolution has as many filters as output channels, each spanning all input channels
- a pointwise convolution
- 1×1 filters combine the outputs of the depthwise convolution across channels
- Standard convolution
	- Filtering and combining happen in a single step; a depthwise separable filter splits them into two steps
- This factorization drastically reduces computation and model size: relative to a standard convolution the cost ratio is 1/N + 1/D_K² (N output channels, D_K × D_K kernel), which for 3×3 kernels means roughly 8–9× fewer mult-adds
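- As a concrete sketch of the two steps above (a naive NumPy version I wrote to check my understanding, not the paper's optimized kernels; function and variable names are mine):

	```python
	import numpy as np

	def depthwise_separable_conv(x, depthwise_filters, pointwise_filters):
	    """x: (H, W, M); depthwise_filters: (K, K, M); pointwise_filters: (M, N).

	    Step 1 (depthwise): one K x K filter per input channel, no channel mixing.
	    Step 2 (pointwise): a 1x1 convolution combines the M channels into N outputs.
	    """
	    H, W, M = x.shape
	    K = depthwise_filters.shape[0]
	    pad = K // 2
	    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))  # "same" padding
	    dw = np.zeros((H, W, M))
	    for i in range(H):
	        for j in range(W):
	            patch = xp[i:i + K, j:j + K, :]            # (K, K, M)
	            # Each channel is filtered independently by its own K x K filter.
	            dw[i, j, :] = np.sum(patch * depthwise_filters, axis=(0, 1))
	    # Pointwise 1x1 convolution: a per-pixel matrix multiply over channels.
	    return dw @ pointwise_filters                       # (H, W, N)
	```

	This is numerically equivalent to a standard convolution whose 4-D kernel is the rank-1 product of the depthwise and pointwise filters, which is exactly the factorization the paper exploits.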
- Model architecture and training
- Every layer is followed by batch norm and ReLU
- Model architecture:
- 1×1 convolutions can be optimized with GEMM
- Optimizations in MobileNet:
- Our model structure puts nearly all of the computation into dense 1 × 1 convolutions. This can be implemented with highly optimized general matrix multiply (GEMM) functions. Often convolutions are implemented by a GEMM but require an initial reordering in memory called im2col in order to map it to a GEMM. For instance, this approach is used in the popular Caffe package [15].
- 1×1 convolutions do not require this reordering in memory and can be implemented directly with GEMM which is one of the most optimized numerical linear algebra algorithms.
- MobileNet spends 95% of its computation time in 1 × 1 convolutions, which also hold 75% of the parameters, as can be seen in Table 2. Nearly all of the remaining parameters are in the fully connected layer.
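- A minimal illustration of why a 1×1 convolution maps directly onto a GEMM with no im2col reordering (my own sketch, assuming a channels-last layout; names are mine):

	```python
	import numpy as np

	def pointwise_conv_as_gemm(x, w):
	    """1x1 convolution: x (H, W, M), weights w (M, N) -> (H, W, N).

	    The spatial dimensions are just flattened into the rows of one matrix
	    multiply; no patch extraction (im2col) is needed because the kernel
	    touches a single pixel.
	    """
	    H, W, M = x.shape
	    return (x.reshape(H * W, M) @ w).reshape(H, W, -1)
	```

	Contrast with a K×K convolution, where each output pixel reads a K×K×M patch, so im2col must first copy overlapping patches into matrix rows before a GEMM can run.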
- Training uses RMSprop with asynchronous gradient descent (I didn't fully understand this part)
- Width Multiplier: Thinner Models
- The width multiplier α scales the number of input and output channels of every layer, including the depthwise separable filters
- Resolution Multiplier: Reduced Representation
- The resolution multiplier ρ scales the height and width of every layer's feature maps, applied in practice by shrinking the input image
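- The per-layer mult-add cost with both multipliers follows the paper's formula D_K·D_K·αM·ρD_F·ρD_F + αM·αN·ρD_F·ρD_F; a small helper (the function name is mine) makes the quadratic effect of α and ρ easy to see:

	```python
	def mobilenet_layer_cost(D_K, M, N, D_F, alpha=1.0, rho=1.0):
	    """Mult-adds of one depthwise separable layer.

	    D_K: kernel size, M/N: input/output channels, D_F: feature map size.
	    alpha (width multiplier) scales channel counts; rho (resolution
	    multiplier) scales the spatial size. Both cut cost roughly quadratically.
	    """
	    aM, aN = alpha * M, alpha * N
	    rDF2 = (rho * D_F) ** 2
	    depthwise = D_K * D_K * aM * rDF2   # one filter per input channel
	    pointwise = aM * aN * rDF2          # 1x1 channel-mixing convolution
	    return depthwise + pointwise
	```

	For a 14 × 14 × 512 layer with 3 × 3 kernels, the pointwise term dominates, which matches the note above that nearly all computation lives in the 1×1 convolutions.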
- experiment
- model choice
- depthwise separable convolutions vs full convolution
- thinner model vs shallower model
- We next show results comparing thinner models with width multiplier to shallower models using fewer layers. To make MobileNet shallower, the 5 layers of separable filters with feature size 14 × 14 × 512 in Table 1 are removed. Table 5 shows that at similar computation and number of parameters, making MobileNets thinner is 3% better than making them shallower.
- Model Shrinking Hyperparameters
...the rest of the paper compares the model's results against existing models