[Deep Learning Paper Notes][Attention] Spatial Transformer Networks

Hao_Zhang_Vision

Article link: https://blog.csdn.net/Hao_Zhang_Vision/article/details/53178265

Jaderberg, Max, Karen Simonyan, and Andrew Zisserman. “Spatial transformer networks.” Advances in Neural Information Processing Systems. 2015. (Citations: 116).

1 Motivation

Show, Attend and Tell only allows attention that is constrained to a fixed grid. We want the model to be able to attend to arbitrary parts of the image.

The pooling operation allows a network to be somewhat spatially invariant to the position of features. However, due to the typically small spatial support for max-pooling, this spatial invariance is only realised over a deep hierarchy of max-pooling and convolutions, and the intermediate feature maps in a CNN are not actually invariant to large transformations of the input data.
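To make the "small spatial support" point concrete, here is a minimal sketch in plain Python. It is not from the paper: the toy 4 x 8 image and the helper names `max_pool2d` and `shift_right` are assumptions chosen for illustration. It shows that a single 2 x 2 max-pooling layer absorbs a one-pixel shift only when the feature stays inside the same pooling window, and does not absorb a larger shift.

```python
def max_pool2d(img, k=2):
    """Non-overlapping k x k max-pooling over a list-of-lists image (hypothetical helper)."""
    rows, cols = len(img), len(img[0])
    return [
        [
            max(img[r + dr][c + dc] for dr in range(k) for dc in range(k))
            for c in range(0, cols - k + 1, k)
        ]
        for r in range(0, rows - k + 1, k)
    ]


def shift_right(img, s):
    """Shift every row s pixels to the right, filling with zeros (hypothetical helper)."""
    return [[0] * s + row[: len(row) - s] for row in img]


# Toy 4 x 8 image with a single active "feature" in the top-left corner.
image = [[0] * 8 for _ in range(4)]
image[0][0] = 1

pooled = max_pool2d(image)
pooled_shift1 = max_pool2d(shift_right(image, 1))
pooled_shift2 = max_pool2d(shift_right(image, 2))

# Shift by 1: the feature stays inside the same 2 x 2 window, so the output is unchanged.
print(pooled == pooled_shift1)  # True
# Shift by 2: the feature lands in the neighbouring window, so the output changes.
print(pooled == pooled_shift2)  # False
```

The invariance in the first case depends on where the feature sits relative to the window boundaries. Had it started at column 1, a one-pixel shift would already change the pooled output, which is why invariance to larger transformations only emerges after many stacked pooling layers.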

Our goal is to introduce a spatial transformer module, which intelligently selects features of interest (attention) and transforms them by scaling, cropping, rotations, and non-rigid deformations.
