MobileNet v1
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Contents
Section 1 & 2
There are two common approaches to building small and efficient neural networks:
- compressing pretrained networks
- training small networks directly
MobileNet is a class of networks built for resource-restricted settings where latency and model size are constrained.
Section 3
Depthwise Separable Convolution
Assume we have an input feature map $F$ of shape $D_F \times D_F \times M$, an expected output feature map $G$ of shape $D_G \times D_G \times N$, and a convolution kernel $K$ of shape $D_K \times D_K \times M$.
For a standard convolution, to achieve this result we need $N$ kernels $K_1, K_2, \dots, K_N$, and the computation cost is:

$$\boxed{D_K \times D_K \times M} \times \boxed{D_G \times D_G} \times N$$

Since $D_G \le D_F$, the maximum computation cost is:

$$\boxed{D_K \times D_K \times M} \times \boxed{D_F \times D_F} \times N$$
| name | shape |
| --- | --- |
| input $F$ | $D_F \times D_F \times M$ |
| output $G$ | $D_G \times D_G \times N$ |
| kernels $K_1, \dots, K_N$ | $\boxed{D_K \times D_K \times M} \times N$ |
| maximum computation cost | $\boxed{D_K \times D_K \times M} \times \boxed{D_F \times D_F} \times N$ |
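The standard-convolution cost above can be computed directly. A minimal sketch (the function name and example sizes are illustrative, not from the paper):

```python
def standard_conv_cost(d_k, d_f, m, n):
    """Multiply count of a standard convolution:
    N kernels of shape D_K x D_K x M applied at D_F x D_F positions."""
    return d_k * d_k * m * d_f * d_f * n

# Example: 3x3 kernels on a 14x14 feature map, 512 -> 512 channels
print(standard_conv_cost(3, 14, 512, 512))  # 462422016
```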
For depthwise separable convolution, the kernel is separated into two parts: $M$ depthwise convolution kernels of shape $D_K \times D_K \times 1$ and $N$ pointwise convolution kernels of shape $1 \times 1 \times M$. Thus, the computation cost is:
$$\boxed{D_K \times D_K \times 1} \times \boxed{D_G \times D_G} \times M + \boxed{1 \times 1 \times M} \times \boxed{D_G \times D_G} \times N$$

Since $D_G \le D_F$, the maximum computation cost is:

$$\boxed{D_K \times D_K \times 1} \times \boxed{D_F \times D_F} \times M + \boxed{1 \times 1 \times M} \times \boxed{D_F \times D_F} \times N$$
which is:

$$D_K \times D_K \times D_F \times D_F \times M + D_F \times D_F \times N \times M$$
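Dividing this by the standard-convolution cost gives the paper's reduction factor $\frac{1}{N} + \frac{1}{D_K^2}$. A sketch of both costs (function names and example sizes are illustrative assumptions):

```python
def standard_conv_cost(d_k, d_f, m, n):
    return d_k * d_k * m * d_f * d_f * n

def depthwise_separable_cost(d_k, d_f, m, n):
    depthwise = d_k * d_k * d_f * d_f * m  # M filters of D_K x D_K x 1
    pointwise = m * d_f * d_f * n          # N filters of 1 x 1 x M
    return depthwise + pointwise

d_k, d_f, m, n = 3, 14, 512, 512
ratio = depthwise_separable_cost(d_k, d_f, m, n) / standard_conv_cost(d_k, d_f, m, n)
print(ratio)  # ~0.113, matching 1/N + 1/D_K**2
```

For 3×3 kernels this is roughly an 8–9× reduction in multiplications, which is the core saving MobileNet builds on.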
| name | shape |
| --- | --- |
| input $F$ | $D_F \times D_F \times M$ |
| output $G$ | $D_G \times D_G \times N$ |
| depthwise kernels | $\boxed{D_K \times D_K \times 1} \times M$ |
| pointwise kernels | $\boxed{1 \times 1 \times M} \times N$ |
| maximum computation cost — depthwise part | $\boxed{D_K \times D_K \times 1} \times \boxed{D_F \times D_F} \times M$ |
| maximum computation cost — pointwise part | $\boxed{1 \times 1 \times M} \times \boxed{D_F \times D_F} \times N$ |
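As a sketch of the kernel split above, a minimal NumPy forward pass (stride 1, no padding, so $D_G = D_F - D_K + 1$; all names and sizes here are illustrative, not from the paper):

```python
import numpy as np

def depthwise_separable_conv(f, k_dw, k_pw):
    """f: (D_F, D_F, M) input; k_dw: (D_K, D_K, M) holds M depthwise
    filters (one D_K x D_K x 1 filter per channel); k_pw: (M, N) holds
    N pointwise 1 x 1 x M filters. Stride 1, 'valid' padding."""
    d_f, _, m = f.shape
    d_k = k_dw.shape[0]
    d_g = d_f - d_k + 1
    # Depthwise step: filter each input channel independently
    g_dw = np.empty((d_g, d_g, m))
    for i in range(d_g):
        for j in range(d_g):
            patch = f[i:i + d_k, j:j + d_k, :]            # (D_K, D_K, M)
            g_dw[i, j, :] = (patch * k_dw).sum(axis=(0, 1))
    # Pointwise step: 1x1 convolution mixes channels
    return g_dw @ k_pw                                    # (D_G, D_G, N)

rng = np.random.default_rng(0)
f = rng.standard_normal((8, 8, 4))
out = depthwise_separable_conv(f, rng.standard_normal((3, 3, 4)),
                               rng.standard_normal((4, 6)))
print(out.shape)  # (6, 6, 6)
```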
Network Structure
Shrinking hyper-parameters
- Width multiplier $\alpha$
  This hyper-parameter thins the number of input channels to $M' = \alpha M$ and the number of output channels to $N' = \alpha N$. Thus, the maximum computation cost becomes:
  $$D_K \times D_K \times D_F \times D_F \times \alpha M + D_F \times D_F \times \alpha N \times \alpha M$$
  which is:
  $$\alpha \times D_K \times D_K \times D_F \times D_F \times M + \alpha^{2} \times D_F \times D_F \times N \times M$$
- Resolution multiplier $\rho$
  This hyper-parameter reduces the input resolution to $\rho D_F \times \rho D_F$ and the output resolution to $\rho D_G \times \rho D_G$ (the kernel size $D_K$ is unchanged), so the maximum computation cost becomes:
  $$D_K \times D_K \times \rho D_F \times \rho D_F \times M + \rho D_F \times \rho D_F \times N \times M$$
  which is:
  $$\rho^{2} \times D_K \times D_K \times D_F \times D_F \times M + \rho^{2} \times D_F \times D_F \times N \times M$$
- When combining $\alpha$ and $\rho$, we get the final maximum computation cost:
  $$D_K \times D_K \times \rho D_F \times \rho D_F \times \alpha M + \rho D_F \times \rho D_F \times \alpha N \times \alpha M$$
  where $0 < \alpha \leq 1$ and $0 < \rho \leq 1$.
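The effect of the two multipliers on the depthwise-separable cost can be sketched as follows (function name and example sizes are illustrative assumptions):

```python
def mobilenet_cost(d_k, d_f, m, n, alpha=1.0, rho=1.0):
    """Depthwise-separable cost with width multiplier alpha
    (thins channels) and resolution multiplier rho (shrinks D_F)."""
    m_s, n_s = alpha * m, alpha * n  # thinned channel counts
    d_f_s = rho * d_f                # reduced resolution
    return d_k * d_k * d_f_s * d_f_s * m_s + d_f_s * d_f_s * m_s * n_s

base = mobilenet_cost(3, 14, 512, 512)
# rho scales both terms by rho**2, so the ratio is exactly 0.25
print(mobilenet_cost(3, 14, 512, 512, rho=0.5) / base)
# alpha scales the terms by alpha and alpha**2, so the ratio
# lands between 0.25 and 0.5
print(mobilenet_cost(3, 14, 512, 512, alpha=0.5) / base)
```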
Section 4 & 5
From the results shown in Table 4, MobileNet does reduce the number of parameters greatly with only a small sacrifice in accuracy. The hyper-parameters $\alpha$ and $\rho$ perform well when set larger than 0.5. According to the paper, MobileNet also performs well on other tasks.