1. Contribution
We propose Residual Multi-Layer Perceptrons (ResMLP): a purely multi-layer perceptron (MLP) based architecture for image classification.
(i) a linear layer in which image patches interact, independently and identically across channels,
and (ii) a two-layer feed-forward network in which channels interact independently per patch.
Figure 1 shows the ResMLP architecture. The pipeline, in brief: the network input, two residual operations (a linear layer and an MLP with a single hidden layer), and finally an average pooling layer and a linear classifier:
- it takes flattened patches as input, projects them with a linear layer, and updates them in turn with two residual operations:
- (i) a simple linear layer that provides interaction between the patches, which is applied to all channels independently;
- (ii) an MLP with a single hidden layer, which is independently applied to all patches.
- At the end of the network, the patches are average pooled, and fed to a linear classifier.
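The two residual operations above can be sketched in numpy. This is a minimal illustration of the shapes and data flow, not the paper's implementation: the dimensions, random weight initializations, and the tanh-approximate GELU are assumptions for the sketch.

```python
import numpy as np

# Illustrative sizes: 196 = 14*14 patches, embedding dim d = 384.
n_patches, d, hidden = 196, 384, 1536
rng = np.random.default_rng(0)

x = rng.standard_normal((n_patches, d))          # patch embeddings

# (i) cross-patch sublayer: one linear map over the patch axis,
#     shared across all d channels, plus a residual connection.
A = rng.standard_normal((n_patches, n_patches)) * 0.01
z = x + A @ x

# (ii) per-patch sublayer: a two-layer MLP applied to each patch
#      independently (same weights for every patch), plus a residual.
W1 = rng.standard_normal((d, hidden)) * 0.01
W2 = rng.standard_normal((hidden, d)) * 0.01
gelu = lambda t: 0.5 * t * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (t + 0.044715 * t**3)))
y = z + gelu(z @ W1) @ W2

print(y.shape)  # (196, 384) -- both sublayers preserve the embedding shape
```

Note that (i) mixes information *across* patches per channel, while (ii) mixes information *across* channels per patch; stacking the two gives full connectivity without attention.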
2. Summary
- Despite their simplicity, Residual Multi-Layer Perceptrons can reach surprisingly good accuracy/complexity trade-offs with ImageNet-1k training only, without requiring normalization based on batch or channel statistics.
- These models benefit significantly from distillation methods.
- Thanks to its design, where patch embeddings simply "communicate" through a linear layer, we can observe what kind of spatial interactions the network learns across layers.
3. Methods
3.1 The overall ResMLP architecture
Each ResMLP layer consists of a linear sublayer followed by a feed-forward sublayer.
The input image is split into non-overlapping patches of size 16×16, giving an N×N grid (N = 224/16 = 14 for a 224×224 image). The flattened patches are passed through a linear layer to obtain N² d-dimensional patch embeddings.
These N²×d embeddings are then fed through the ResMLP layers, which preserve their shape: 224 × 224 × 3 → 14 × 14 × (16 × 16 × 3) → N² × d.
Finally, the N²×d representations are average-pooled into a single d-dimensional vector, which is fed to a linear classifier [d, C] to predict the label; the network is trained with a cross-entropy loss.
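The shape bookkeeping above can be verified with a short numpy walk-through. The weights are zero placeholders and the dimensions (d = 384, C = 1000) are illustrative assumptions, not values from the notes:

```python
import numpy as np

# 224x224 RGB image, 16x16 patches -> N = 224 // 16 = 14, i.e. 196 patches.
img = np.zeros((224, 224, 3))
p, d, num_classes = 16, 384, 1000
N = img.shape[0] // p                              # 14

# Flatten non-overlapping patches into rows: (196, 16*16*3) = (196, 768).
patches = img.reshape(N, p, N, p, 3).transpose(0, 2, 1, 3, 4).reshape(N * N, p * p * 3)

W_embed = np.zeros((p * p * 3, d))                 # linear patch projection
x = patches @ W_embed                              # (196, d) patch embeddings

# ... the ResMLP layers operate on x, keeping the (196, d) shape ...

v = x.mean(axis=0)                                 # average pool -> (d,)
W_cls = np.zeros((d, num_classes))                 # linear classifier [d, C]
logits = v @ W_cls                                 # (num_classes,)
print(patches.shape, x.shape, logits.shape)
```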
3.2 The Residual Multi-Perceptron Layer
The authors replace the Layer Normalization in the Transformer layer with an Affine transformation:

Aff_{α,β}(x) = Diag(α) x + β

where α and β are learnable weight vectors; the operation simply rescales and shifts the input element-wise, without using any batch or channel statistics.
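A minimal numpy sketch of this Aff operation follows; the particular values of α and β are illustrative, and in the real network they would be learned:

```python
import numpy as np

def affine(x, alpha, beta):
    """Element-wise rescale and shift per channel: Diag(alpha) @ x + beta.

    x: (n_patches, d) embeddings; alpha, beta: (d,) learnable vectors.
    Unlike LayerNorm, no mean/variance statistics are computed.
    """
    return alpha * x + beta

x = np.ones((4, 3))
alpha = np.array([2.0, 1.0, 0.5])
beta = np.array([0.0, 1.0, -1.0])
out = affine(x, alpha, beta)
print(out[0])  # [ 2.   2.  -0.5]
```

Because Aff is a fixed per-channel scale and shift at inference time, it can be folded into the adjacent linear layer, so it adds no inference cost.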