Transformer
A column by Phoenixtree_DongZhao (deep learning, image processing). Average article quality score: 92.
Unsupervised Salient Object Detection Paper Reading (1): A Spectral Clustering Voting Method, "Unsupervised Salient Object Detection with Spectral Cluster Voting"
This paper tackles unsupervised salient object detection (SOD) by applying spectral clustering to self-supervised features. (Original post, 2022-06-08)
ConvNets Overtake Transformers Again, ConvNeXt: "A ConvNet for the 2020s"
The core idea is to apply, piece by piece, the design techniques used in Swin Transformer to a traditional ConvNet and ask whether those techniques also work there. The answer turns out to be yes. (Original post, 2022-01-29)
Another Lightweight ViT: "Lite Vision Transformer with Enhanced Self-Attention"
https://arxiv.org/pdf/2112.10809.pdf
Abstract: Despite the impressive representation capacity of vision transformer models, current light-weight vision transformer models still suffer from inconsist… (abstract truncated). (Original post, 2021-12-26)
Paper Quick Read: FAIR's Latest ViT, "Improved Multiscale Vision Transformers"
This paper studies Multiscale Vision Transformers (MViT) as a unified architecture for image and video classification as well as object detection, and proposes an improved version of MViT that incorporates decomposed relative positional embeddings and residual pooling connections. (Original post, 2021-12-22)
[NeurIPS 2021] TokenLearner, Adaptively Learning the Number and Positions of Tokens: "What Can 8 Learned Tokens Do for Images and Videos?"
This paper introduces a new visual representation learning approach that relies on a handful of adaptively learned tokens and applies to both image and video understanding tasks. (Original post, 2021-12-16)
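The token-learning idea above can be sketched in a few lines. This is a simplified NumPy illustration, not the paper's implementation: in TokenLearner the attention maps come from a small learned convolutional head, whereas here the score matrix is just a random stand-in.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def token_learner(feature_map, attn_scores):
    """TokenLearner-style pooling (simplified): each of the S spatial
    attention maps produces one token as a weighted average over all
    H*W input positions.
    feature_map: (H*W, C) features; attn_scores: (H*W, S) raw scores."""
    attn = softmax(attn_scores, axis=0)  # normalize over spatial positions
    return attn.T @ feature_map          # (S, C): S adaptively pooled tokens

# toy example: 16 spatial positions, 8 channels, pool down to 4 tokens
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))
scores = rng.normal(size=(16, 4))        # stand-in for a learned head
tokens = token_learner(feats, scores)
```

The point of the design is that downstream attention layers then operate on only S tokens (e.g., 8 in the paper's title) instead of all H*W positions.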
Full Read of Kaiming He's Latest First-Author Paper: "Masked Autoencoders Are Scalable Vision Learners"
In natural language processing (NLP), self-supervised pre-training (e.g., BERT) has successfully scaled to applications with millions of data points. This paper proposes masked autoencoders (MAE), a scalable self-supervised learner for computer vision. Core idea: mask random patches of the input image and reconstruct the missing pixels. The core method is an asymmetric encoder-decoder architecture; the authors find that masking a high proportion of the input image yields a nontrivial and meaningful self-supervised task. (Original post, 2021-11-25)
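The random-masking step described above can be sketched as follows. This is a minimal NumPy illustration of the idea, not the paper's code; in MAE the encoder then processes only the visible patches, and a lightweight decoder reconstructs pixels for the masked ones.

```python
import numpy as np

def random_masking(patches, mask_ratio=0.75, rng=None):
    """MAE-style random masking: keep a random subset of patches and
    return them with the indices needed to restore the original order.
    patches: (N, D) sequence of N patch embeddings."""
    rng = rng or np.random.default_rng()
    n = patches.shape[0]
    n_keep = int(n * (1 - mask_ratio))
    perm = rng.permutation(n)
    keep_idx = np.sort(perm[:n_keep])   # indices of visible patches
    mask = np.ones(n, dtype=bool)
    mask[keep_idx] = False              # True = masked (to be reconstructed)
    return patches[keep_idx], keep_idx, mask

# a 14x14 grid of patches with 64-dim embeddings, 75% masked as in MAE
patches = np.random.default_rng(1).normal(size=(196, 64))
visible, keep_idx, mask = random_masking(patches, mask_ratio=0.75)
```

Because the encoder sees only 25% of the patches, the heavy part of the network runs on a quarter of the sequence, which is what makes the pre-training scalable.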
Lightweight Vision Transformer: MobileViT
"MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer" (Sachin Mehta and Mohammad Rastegari, Apple)
Abstract: Light-weight convolutional neural networks (CNNs) are the de-facto for mobile vision task… (abstract truncated). (Original post, 2021-10-14)
Paper Reading: "ResMLP: Feedforward networks for image classification with data-efficient training"
Abstract: We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i)… (abstract truncated). (Original post, 2021-10-13)
A Global Filter that Outperforms ViT and MLP-Mixer: "Global Filter Networks for Image Classification" [NeurIPS 2021]
[pdf] [project] [github]
Abstract: Recent advances in self-attention and pure multi-layer perceptron (MLP) models for vision have shown great potential in achieving promising performance with fewer… (abstract truncated). (Original post, 2021-10-12)
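The core operation of a global filter layer can be sketched like this. It is a simplified NumPy version under my own assumptions: in the paper the frequency-domain filter is a learned parameter inside a transformer-style block, while here an identity filter is used just to show the round trip.

```python
import numpy as np

def global_filter(x, filt):
    """GFNet-style global filter (simplified): transform spatial features
    to the frequency domain, multiply elementwise by a complex filter,
    and transform back. Elementwise products in frequency space mix
    information across ALL spatial positions at once.
    x: (H, W, C) real features; filt: (H, W//2+1, C) complex weights."""
    xf = np.fft.rfft2(x, axes=(0, 1))   # 2D real FFT over spatial dims
    xf = xf * filt                      # global mixing in frequency space
    return np.fft.irfft2(xf, s=x.shape[:2], axes=(0, 1))

rng = np.random.default_rng(2)
x = rng.normal(size=(8, 8, 4))
identity = np.ones((8, 8 // 2 + 1, 4), dtype=complex)  # identity filter
y = global_filter(x, identity)         # with an identity filter, y == x
```

This replaces self-attention's quadratic token mixing with an O(HW log HW) FFT, which is the source of the efficiency claim.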
A New Model Challenging ViT and MLP-Mixer, ConvMixer: "Patches Are All You Need?" [Under Review, ICLR 2022]
[OpenReview] [GitHub]
Highlights: 1. The paper is very short (just over four pages) and the model is simple, yet it challenges the accepted explanation of why ViT is effective. 2. It distills what the recently popular ViT, MLP-Mixer, and ResMLP architectures have in common that makes them work so well. Tesla's senior AI director Andrej Karpathy remarked on Twitter that he was blown away by the new ConvMixer architecture… (truncated). (Original post, 2021-10-09)
"Focal Self-attention for Local-Global Interactions in Vision Transformers"
Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Jianfeng Gao (Microsoft Research, Redmond); Xiyang Dai, Bin Xiao, Lu Yuan (Microsoft Cloud + AI)
https://arxiv.org/pdf/2107.0… (URL truncated). (Original post, 2021-09-16)
MyDLNote-Transformer: Swin Transformer, a Hierarchical Vision Transformer Using Shifted Windows
"Swin Transformer: Hierarchical Vision Transformer using Shifted Windows", https://arxiv.org/pdf/2103.14030.pdf; code is available at https://github.com/microsoft/Swin-Transformer.
Abstract: This paper presents a new vision Transformer, called Swin Tra… (abstract truncated). (Original post, 2021-07-07)
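The shifted-window mechanism in Swin can be sketched in NumPy. This is a minimal illustration of the partition-and-shift idea only; the real model also masks attention across the wrapped-around edges after the cyclic shift, a detail omitted here.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows,
    returning (num_windows, ws*ws, C) so attention runs per window."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def shifted_windows(x, ws):
    """Swin-style shifted windows (simplified): cyclically shift the map
    by ws//2 so the next layer's windows straddle the previous layer's
    window boundaries, letting information cross between windows."""
    shifted = np.roll(x, shift=(-(ws // 2), -(ws // 2)), axis=(0, 1))
    return window_partition(shifted, ws)

x = np.arange(8 * 8 * 1, dtype=float).reshape(8, 8, 1)
wins = window_partition(x, 4)     # 4 windows of 16 positions each
shifted = shifted_windows(x, 4)   # same shape, offset by 2 in each axis
```

Alternating regular and shifted windows is what gives Swin global connectivity while keeping each attention computation local and linear in image size.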
MyDLNote-Transformer: Pyramid Vision Transformer, a Convolution-Free General Backbone for Dense Prediction
"Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions", paper: https://arxiv.org/pdf/2102.12122.pdf; code is available at https://github.com/whai362/PVT. Note: an improved Pyramid Vision Transformer exists, "PVTv2: Improved Baseli…" (truncated). (Original post, 2021-07-05)
MyDLNote-Transformer: Local and Global Transformers, "Transformer in Transformer"
https://arxiv.org/pdf/2103.00112v1.pdf
https://github.com/NZ99/transformer_in_transformer_flax
https://github.com/huawei-noah/noah-research/tree/master/TNT
Abstract: Transformer is a type of self-attention-based neural… (abstract truncated). (Original post, 2021-07-04)
MyDLNote-Transformer: Semantic Segmentation, "Segmenter: Transformer for Semantic Segmentation"
Segmenter: Transformer for Semantic Segmentation. (Original post, 2021-07-01)
MyDLNote-Transformer (for Low-Level): Uformer, a U-Shaped Transformer for Image Restoration
Paper reading: image restoration with a Transformer. "Uformer: A General U-Shaped Transformer for Image Restoration", https://arxiv.org/pdf/2106.03106v1.pdf, https://github.com/ZhendongWang6/Uformer.
Abstract: In this paper, we present Uformer, an effective and efficient Transfor… (abstract truncated). (Original post, 2021-06-20)
A Quick Look at Recent Visual Transformer Papers (Attention Free Transformer, CeiT, DynamicViT)
1. When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations. (Original post, 2021-06-10)
Recommended Must-Read Vision Transformer Paper Collection
1. A Survey on Visual Transformer [30 Jan 2021]. (Original post, 2021-06-07)
CVPR 2021 Visual Transformer Paper Collection (with 20 Recommended Must-Read ViT Papers)
A roundup of 43 visual Transformer papers from CVPR 2021, by Amusi (source: the CVer account). Preface: Starting in the second half of 2020, and especially in the first half of 2021, research interest in Visual Transformers reached unprecedented… (truncated). (Original post, 2021-06-07)