NAS Paper Reading Notes (MobileNetV3)

bfluss

Article link: https://blog.csdn.net/qq_38707467/article/details/105992385

Searching for MobileNetV3

Paper link: https://arxiv.org/abs/1905.02244

This paper combines search techniques with a novel architecture design to produce MobileNetV3-Large and MobileNetV3-Small.

Its main contributions are:

(1) complementary search techniques,
(2) new efficient versions of nonlinearities practical for the mobile setting,
(3) new efficient network design,
(4) a new efficient segmentation decoder.

Efficient Mobile Building Blocks

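MobileNetV3 builds its model from building blocks that combine the MobileNetV2 block with Squeeze-and-Excite layers, and it uses a modified swish nonlinearity. Both squeeze-and-excitation and swish rely on the sigmoid function, which is expensive to compute on mobile devices and hard to keep accurate in fixed-point arithmetic, so sigmoid is replaced with hard-sigmoid.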
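To make the sigmoid replacement concrete, here is a minimal pure-Python sketch comparing the logistic sigmoid with the piecewise-linear hard-sigmoid, taken here to be ReLU6(x + 3) / 6. The function names are chosen for illustration only and are not from any particular library.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid (needs an exponential)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu6(x: float) -> float:
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear stand-in for sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


# Compare the two functions at a few sample points.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  hard_sigmoid={hard_sigmoid(x):.4f}")
```

The hard version needs only an add, a clamp, and a scale, which is why it is friendlier to quantized mobile inference than an exponential.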

Network Search

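Platform-aware NAS searches for the global network structure by optimizing each network block, and the NetAdapt algorithm then searches for the number of filters in each layer.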

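1. Platform-Aware NAS for Block-wise Search
Using the same RNN-based controller and factorized hierarchical search space as MnasNet, the authors found results similar to MnasNet for large mobile models under an 80 ms latency constraint. They therefore reuse MnasNet-A1 directly as the initial Large mobile model and then apply NetAdapt and other optimizations on top of it.
The authors observed that for small models accuracy changes much more sharply with latency, so they changed the weight factor from w = -0.07 in MnasNet to w = -0.15 to compensate for the larger accuracy change, and then ran a new architecture search from scratch to obtain the initial seed model. NetAdapt and other optimizations are then applied to this seed to obtain the final MobileNetV3-Small model.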
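The effect of changing w can be illustrated with the MnasNet-style reward ACC(m) × [LAT(m) / TAR]^w. The sketch below is only an illustration: the two candidate models, their accuracies and latencies, and the 20 ms target are hypothetical numbers, not values from the paper.

```python
def mnas_reward(accuracy: float, latency_ms: float, target_ms: float, w: float) -> float:
    """MnasNet-style multi-objective reward: ACC * (LAT / TAR) ** w."""
    return accuracy * (latency_ms / target_ms) ** w


TARGET_MS = 20.0  # hypothetical latency target

# Two hypothetical candidates: B is more accurate but slower than A.
model_a = {"accuracy": 0.70, "latency_ms": 20.0}
model_b = {"accuracy": 0.72, "latency_ms": 25.0}

for w in (-0.07, -0.15):
    reward_a = mnas_reward(model_a["accuracy"], model_a["latency_ms"], TARGET_MS, w)
    reward_b = mnas_reward(model_b["accuracy"], model_b["latency_ms"], TARGET_MS, w)
    winner = "A" if reward_a > reward_b else "B"
    print(f"w={w}: reward_A={reward_a:.4f} reward_B={reward_b:.4f} -> prefers {winner}")
```

With these made-up numbers the slower model B wins under w = -0.07 but loses under w = -0.15, which shows how the more negative exponent demands a larger accuracy gain for the same extra latency.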

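2. NetAdapt for Layer-wise Search
NetAdapt allows individual layers to be fine-tuned in a sequential manner, rather than trying to infer a coarse but global architecture.
The procedure is as follows:
1. Start with the seed network architecture found by platform-aware NAS.
2. For each step:
1) Generate a set of new proposals. Each proposal represents a modification of the architecture that reduces latency by at least δ compared with the previous step.
2) For each proposal, load the pre-trained model from the previous step into the newly proposed architecture, truncating and randomly initializing any missing weights, then fine-tune each proposal for T steps to get a coarse estimate of its accuracy.
3) Select the best proposal according to some metric.
3. Repeat the previous step until the target latency is reached.
In NetAdapt the metric is to minimize the accuracy change, whereas this paper minimizes the ratio between latency change and accuracy change; that is, it picks the proposal that maximizes ΔAcc / |Δlatency|.
The authors use the same proposal generator that NetAdapt used for MobileNetV2, and allow the following two types of proposals: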

Reduce the size of any expansion layer;
Reduce bottleneck in all blocks that share the same bottleneck size - to maintain residual connections.

In the experiments, T = 10000 and δ = 0.01|L|, where L is the latency of the seed model.
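The selection rule inside one NetAdapt step can be sketched as follows. This is a simplified illustration under stated assumptions: the `Proposal` class, the proposal names, and all accuracy and latency numbers are hypothetical, and the fine-tuning for T steps is replaced by accuracy values supplied directly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Proposal:
    name: str
    accuracy: float    # coarse accuracy estimate (would come from T fine-tuning steps)
    latency_ms: float  # measured latency of the proposed architecture


def select_proposal(current: Proposal, proposals: List[Proposal],
                    delta_ms: float) -> Optional[Proposal]:
    """Pick the proposal maximizing delta_acc / |delta_latency| among those
    that reduce latency by at least delta_ms relative to the current model."""
    best, best_ratio = None, float("-inf")
    for p in proposals:
        d_lat = p.latency_ms - current.latency_ms
        if d_lat > -delta_ms:
            continue  # does not reduce latency by at least delta
        ratio = (p.accuracy - current.accuracy) / abs(d_lat)
        if ratio > best_ratio:
            best, best_ratio = p, ratio
    return best


# Hypothetical seed model and proposals.
seed = Proposal("seed", accuracy=0.750, latency_ms=50.0)
delta_ms = 0.01 * seed.latency_ms  # delta = 0.01 * |L|
candidates = [
    Proposal("shrink_expansion_3", accuracy=0.748, latency_ms=49.0),
    Proposal("shrink_bottleneck_40", accuracy=0.744, latency_ms=48.0),
    Proposal("tiny_change", accuracy=0.7499, latency_ms=49.8),  # saves too little
]
best = select_proposal(seed, candidates, delta_ms)
print(best.name if best else "no valid proposal")
```

A full run would repeat this step, using the selected proposal as the new current model, until the latency target is met.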

Network Improvements

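1. Redesigning Expensive Layers
The last few layers of the network are modified to reduce latency.
[Figure: the original last stage compared with the efficient last stage]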
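One reason late layers can be expensive is that the cost of a 1x1 convolution scales with the spatial size of its input. In the paper's efficient last stage, the final 1x1 feature-expansion layer is moved past the average pooling layer, so it runs at 1x1 instead of 7x7 resolution. The rough count below assumes illustrative channel sizes (960 in, 1280 out) and is not a reproduction of the paper's exact layer table.

```python
def pointwise_conv_madds(height: int, width: int, c_in: int, c_out: int) -> int:
    """Multiply-adds of a 1x1 convolution: one dot product of length c_in
    per output channel per spatial position (bias ignored)."""
    return height * width * c_in * c_out


# Illustrative sizes only: the same 1x1 conv before vs. after global pooling.
before_pooling = pointwise_conv_madds(7, 7, 960, 1280)
after_pooling = pointwise_conv_madds(1, 1, 960, 1280)

print(f"at 7x7 resolution: {before_pooling:,} MAdds")
print(f"at 1x1 resolution: {after_pooling:,} MAdds")
print(f"reduction factor:  {before_pooling // after_pooling}x")
```

Under these assumptions the layer becomes 49 times cheaper, simply because it is evaluated at one spatial position instead of 49.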
For the initial set of filters, replacing the nonlinearity with hard swish allows the number of filters to be reduced from 32 to 16 while keeping the same accuracy.

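2. Nonlinearities
The swish nonlinearity
$$\text{swish}(x) = x \cdot \sigma(x)$$
is replaced with
$$\text{h-swish}(x) = x \cdot \frac{\text{ReLU6}(x + 3)}{6}$$
and h-swish is used only in the second half of the model.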
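As a rough check of how closely the hard version tracks the original, the sketch below evaluates swish(x) = x · sigmoid(x) and h-swish(x) = x · ReLU6(x + 3) / 6 on a grid of points. The grid range and step are arbitrary choices made for illustration.

```python
import math


def swish(x: float) -> float:
    """swish(x) = x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def h_swish(x: float) -> float:
    """h-swish(x) = x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


# Evaluate both on [-6, 6] in steps of 0.1 and report the largest gap.
xs = [i / 10.0 for i in range(-60, 61)]
max_gap = max(abs(swish(x) - h_swish(x)) for x in xs)
print(f"max |swish - h_swish| on [-6, 6]: {max_gap:.4f}")
```

For x ≤ -3 the function is exactly zero and for x ≥ 3 it is exactly x, so the approximation differs from swish only in a band around the origin.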

3. Large squeeze-and-excite
The size of the squeeze-and-excite bottleneck is fixed to 1/4 of the number of channels in the expansion layer.
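A squeeze-and-excite block with this 1/4 sizing might be sketched in pure Python as below. This is a toy illustration, not the paper's implementation: the weights are random, the feature map is tiny, integer division is assumed for the bottleneck size (real implementations may round to a hardware-friendly multiple), and hard-sigmoid is used as the gate.

```python
import random


def hard_sigmoid(x: float) -> float:
    """ReLU6(x + 3) / 6."""
    return min(max(x + 3.0, 0.0), 6.0) / 6.0


def se_bottleneck_channels(expansion_channels: int) -> int:
    """SE bottleneck width: 1/4 of the expansion layer's channels (assumed floor)."""
    return expansion_channels // 4


def squeeze_excite(feature_map, w1, b1, w2, b2):
    """feature_map: list of C channels, each a flat list of H*W activations."""
    # Squeeze: global average pool per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excite, step 1: fully connected C -> C/4 followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
              for row, b in zip(w1, b1)]
    # Excite, step 2: fully connected C/4 -> C followed by hard-sigmoid gate.
    gates = [hard_sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
             for row, b in zip(w2, b2)]
    # Scale each channel by its gate in [0, 1].
    return [[g * v for v in ch] for g, ch in zip(gates, feature_map)]


random.seed(0)
C, HW = 16, 4                      # toy sizes: 16 expansion channels, 2x2 map
R = se_bottleneck_channels(C)      # bottleneck width
x = [[random.uniform(-1, 1) for _ in range(HW)] for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(R)]
b1 = [0.0] * R
w2 = [[random.uniform(-1, 1) for _ in range(R)] for _ in range(C)]
b2 = [0.0] * C
y = squeeze_excite(x, w1, b1, w2, b2)
print(f"bottleneck width: {R}, output shape: {len(y)} x {len(y[0])}")
```

The output keeps the input shape, and every activation is scaled by a per-channel gate between 0 and 1.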

Network Architecture
[Figure: MobileNetV3 network architecture]
