The Latest Mamba Research at AAAI 2025

AAAI (Association for the Advancement of Artificial Intelligence) hosts one of the top conferences in artificial intelligence; AAAI 2025 will be held in Philadelphia, Pennsylvania, from February 25 to March 4, 2025.

The final list of accepted papers was announced on December 9. Below is a compilation of the Mamba-related papers, for reference.

SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks

Abstract: Due to the capability of dynamic state space models (SSMs) in capturing long-range dependencies with linear-time computational complexity, Mamba has shown notable performance in NLP tasks. This has inspired the rapid development of Mamba-based vision models, resulting in promising results in visual recognition tasks. However, such models are not capable of distilling features across layers through feature aggregation, interaction, and selection. Moreover, existing cross-layer feature aggregation methods designed for CNNs or ViTs are not practical in Mamba-based models due to high computational costs. Therefore, this paper aims to introduce an efficient cross-layer feature aggregation mechanism for vision backbone networks. Inspired by the Retinal Ganglion Cells (RGCs) in the human visual system, we propose a new sparse cross-layer connection mechanism termed SparX to effectively improve cross-layer feature interaction and reuse. Specifically, we build two different types of network layers: ganglion layers and normal layers. The former has higher connectivity and complexity, enabling multi-layer feature aggregation and interaction in an input-dependent manner. In contrast, the latter has lower connectivity and complexity. By interleaving these two types of layers, we design a new family of vision backbone networks with sparsely cross-connected layers, achieving an excellent trade-off among model size, computational cost, memory cost, and accuracy in comparison to its counterparts. For instance, with fewer parameters, SparX-Mamba-T improves the top-1 accuracy of VMamba-T from 82.5% to 83.5%, while SparX-Swin-T achieves a 1.3% increase in top-1 accuracy compared to Swin-T. Extensive experimental results demonstrate that our new connection mechanism possesses both superior performance and generalization capabilities on various vision tasks.
Link: arXiv:2409.09649
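The abstract describes interleaving two layer types: sparsely connected "ganglion" layers that aggregate features from several earlier layers in an input-dependent way, and cheaper "normal" layers that only transform the current feature map. A minimal NumPy sketch of this interleaving pattern is below; it is a hypothetical simplification for illustration only (the actual SparX blocks are Mamba/Transformer layers with learned aggregation, not the plain matrix products used here):

```python
import numpy as np

def normal_layer(x, w):
    # Low-connectivity layer: transforms only the current feature map.
    return np.maximum(x @ w, 0.0)

def ganglion_layer(x, memory, w_mix, w_out):
    # High-connectivity layer: aggregates the current features with those
    # stored from earlier layers before transforming them.
    stacked = np.concatenate([x] + memory, axis=-1)
    mixed = np.maximum(stacked @ w_mix, 0.0)  # cross-layer aggregation step
    return np.maximum(mixed @ w_out, 0.0)

rng = np.random.default_rng(0)
d = 16
x = rng.standard_normal((4, d))

memory = []            # features exposed to the next ganglion layer
for i in range(4):
    if i % 2 == 1:     # interleave: every other layer is a ganglion layer
        w_mix = rng.standard_normal((d * (len(memory) + 1), d)) * 0.1
        w_out = rng.standard_normal((d, d)) * 0.1
        x = ganglion_layer(x, memory, w_mix, w_out)
        memory = []    # sparse connectivity: clear rather than reuse densely
    else:
        w = rng.standard_normal((d, d)) * 0.1
        x = normal_layer(x, w)
        memory.append(x)

print(x.shape)  # (4, 16)
```

The key design point the paper emphasizes is sparsity: only ganglion layers pay the cost of multi-layer aggregation, which keeps memory and compute far below dense cross-layer schemes such as DenseNet-style connectivity.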

PoseMamba: Monocular 3D Human Pose Estimation with
