【NasNet】《Learning Transferable Architectures for Scalable Image Recognition》

Latest recommended article published on 2024-01-28 21:47:55

bryant_meng

Article link: https://blog.csdn.net/bryant_meng/article/details/81163862

CVPR-2018

1 Background and Motivation

Classification models often require significant architecture engineering.

The authors propose learning the model architecture directly on the dataset of interest.

However, this is very expensive in compute.

So the authors first search for an architecture block on a small dataset (CIFAR-10), then transfer it to a large dataset (ImageNet).

2 Advantages / Contributions

Moves from hand-designing network architectures (human-invented models / engineered architectures / human-designed architectures) to hand-designing the method that searches for architectures
Beats the state of the art on CIFAR-10, ImageNet, and COCO (searched on CIFAR-10, transferred to ImageNet)
Outperforms the lightweight networks MobileNet and ShuffleNet (haha, MobileNet V2 and ShuffleNet V2 later hit back; see 【MobileNet V2】《MobileNetV2：Inverted Residuals and Linear Bottlenecks》 and 【ShuffleNet V2】《ShuffleNet V2：Practical Guidelines for Efficient CNN Architecture Design》)

NASNet search space

Search for the best cell rather than the best whole architecture

faster
more likely to generalize to other problems

3 Method

The design of our search space took much inspiration from LSTMs and the Neural Architecture Search (NAS) cell.

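The structure of NAS is as follows:
[Figure: overview of Neural Architecture Search (NAS)]
1) The authors' improvements over NAS are as follows

2) The main idea is to search for two types of convolutional cells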

Normal Cell
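Reduction Cell (halves the feature map height and width and doubles the channels; constructed the same way as the Normal Cell, except that the first operation applied to the cell's inputs has stride = 2)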
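
To make the halving/doubling rule concrete, here is a minimal sketch that tracks the feature map shape through a stack of cells. The layout (N Normal Cells, a Reduction Cell, N Normal Cells, a Reduction Cell, N Normal Cells) follows the CIFAR-10 architecture figure in the paper; the function name stack_shapes, the value N = 2, and the starting width of 32 channels are assumptions chosen only for illustration.

```python
def stack_shapes(height, width, channels, cells):
    """Track (H, W, C) through a sequence of 'normal' / 'reduction' cells."""
    shapes = [(height, width, channels)]
    for kind in cells:
        if kind == "reduction":
            # Stride-2 first operations halve H and W; the filter count doubles.
            height, width, channels = height // 2, width // 2, channels * 2
        elif kind != "normal":
            raise ValueError(f"unknown cell type: {kind}")
        # A Normal Cell keeps the shape unchanged.
        shapes.append((height, width, channels))
    return shapes


N = 2  # hypothetical number of Normal Cell repeats, for illustration only
cells = (["normal"] * N + ["reduction"]
         + ["normal"] * N + ["reduction"]
         + ["normal"] * N)
shapes = stack_shapes(32, 32, 32, cells)  # 32x32 CIFAR-10 input, assumed 32 channels
print(shapes[-1])
```

Only the two Reduction Cells change the shape, so the resolution drops by a factor of 4 and the channel count grows by a factor of 4 across the stack.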

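Here are the overall architectures used for CIFAR-10 and ImageNet:
[Figure: overall NASNet architectures for CIFAR-10 and ImageNet]
3) Schematic of the search process

As shown, each newly generated feature map is also added to the hidden state set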

4) Controller RNN
[Figure: controller RNN]

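Steps 1 to 5 are each decided by one of 5 softmax classifiers.

A cell repeats steps 1 to 5 B times; in the authors' experiments B = 5 gives good results

Candidate operations for steps 3 and 4:
[Figure: candidate operations for steps 3 and 4]
Candidate operations for step 5: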

element-wise addition
concatenation
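
The five steps can be sketched as follows, based on the paper's description: steps 1 and 2 pick two hidden states, steps 3 and 4 pick an operation for each, and step 5 picks how to combine the two results. This is only a sketch under assumptions: OPS is an illustrative subset (the full candidate list is in the paper's figure), sample_cell is a hypothetical name, and decisions are drawn uniformly at random here instead of from a trained controller RNN.

```python
import random

# Illustrative subset only; the full candidate list is in the paper's figure.
OPS = ["identity", "3x3 conv", "3x3 separable conv", "3x3 max pool", "3x3 avg pool"]
COMBINE = ["add", "concat"]  # step 5: element-wise addition or concatenation


def sample_cell(num_blocks=5, rng=None):
    """Sample one cell as a list of B blocks (uniform sampling, no controller)."""
    rng = rng or random.Random()
    num_hidden = 2  # the hidden state set starts with the two cell inputs
    blocks = []
    for _ in range(num_blocks):
        block = {
            "input_a": rng.randrange(num_hidden),  # step 1: first hidden state
            "input_b": rng.randrange(num_hidden),  # step 2: second hidden state
            "op_a": rng.choice(OPS),               # step 3: op for the first state
            "op_b": rng.choice(OPS),               # step 4: op for the second state
            "combine": rng.choice(COMBINE),        # step 5: how to merge the outputs
        }
        blocks.append(block)
        num_hidden += 1  # the new feature map joins the hidden state set
    return blocks


cell = sample_cell(num_blocks=5, rng=random.Random(0))
for i, block in enumerate(cell):
    print(i, block)
```

Because each block's output is appended to the hidden state set, later blocks can choose from more inputs than earlier ones.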

4 Experiments

The controller RNN is trained with Proximal Policy Optimization (PPO); the search on CIFAR-10 takes 4 days on 500 NVIDIA P100s

NASNet-A
[Figure: NASNet-A architecture]

4.1 Datasets

CIFAR-10
ImageNet
COCO

4.2 CIFAR-10

[Figure: CIFAR-10 results]
Uses cutout data augmentation; the best result is obtained with N = 7 (N as in Figure 2)

4.3 ImageNet

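No residual connections

[Figure: ImageNet results]
Fewer parameters and less computation, yet higher accuracy

Under a constrained computation budget, accuracy is also better than MobileNet and ShuffleNet, which suggests the parameters are used more efficiently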

4.4 COCO

NASNet + Faster RCNN pipeline
[Figure: COCO object detection results]
These results provide further evidence that NASNet provides superior, generic image features that may be transferred across other computer vision tasks.

This gives more precise localization

4.5 Efficiency of architecture search methods

[Figure: reinforcement learning vs. random search]

Reinforcement learning vs. random search, that is,
sampling the decisions from the softmax classifiers vs. sampling the decisions from the uniform distribution

brute-force random search
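
The difference between the two sampling schemes can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names softmax, sample_rl and sample_random are hypothetical, and the logits stand in for whatever one of the controller's softmax classifiers would output for a single decision.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_rl(logits, rng):
    """RL-style: sample one decision from the controller's softmax distribution."""
    probs = softmax(logits)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


def sample_random(num_choices, rng):
    """Random search: sample one decision from the uniform distribution."""
    return rng.randrange(num_choices)


rng = random.Random(0)
logits = [2.0, 0.5, 0.1]  # hypothetical controller outputs for one decision
print("softmax probs:", softmax(logits))
print("RL sample:    ", sample_rl(logits, rng))
print("random sample:", sample_random(len(logits), rng))
```

With all logits equal, the softmax distribution is uniform and the two schemes coincide; training the controller moves probability mass toward decisions that led to better architectures.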

The architectures of NASNet-B and NASNet-C for CIFAR-10 and ImageNet are shown in the figure (NASNet-A performs best).
[Figure: NASNet-B and NASNet-C architectures]