Fine-Grained Recognition with Automatic and Efficient Part Attention

xiaoyushares

Article link: https://blog.csdn.net/xiaoyushares/article/details/64116882

Paper source: CVPR 2016

Author affiliation: Baidu Research

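The challenge of fine-grained classification lies in small inter-class differences versus large intra-class differences. The key to solving the problem is therefore to localize discriminative regions and extract pose-invariant features. This paper proposes a fully convolutional attention model, Fully Convolutional Attention Networks (FCANs). The model uses a reinforcement learning framework to adaptively select discriminative local regions for different fine-grained domains. The paper's main advantages are the following four points: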

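1) It combines three elements (feature extraction, visual attention, and fine-grained classification) and trains them jointly as an end-to-end model;

2) It uses weakly supervised reinforcement learning and needs no additional part annotation;

3) The fully convolutional network speeds up both training and testing;

4) The greedy reward strategy accelerates convergence.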

The proposed FCANs consist of three components: the feature component, the attention component, and the classification component.

Feature map extraction:

Fully convolutional Part Attention:

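This component localizes different regions by computing multiple part score maps from the base convolutional feature maps. Each score map is produced by two convolutional layers and one spatial softmax layer. The first convolutional layer uses 64 kernels of size 3x3, and the second uses a single 3x3 kernel, which yields a single-channel confidence map. The spatial softmax layer converts the confidence map into probabilities. At test time, the model takes the attention region with the highest probability as the part location.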
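The last two steps of the part attention component, the spatial softmax and the test-time selection of the most probable location, can be sketched in plain Python. This is only a minimal illustration, not the authors' implementation: the two convolutional layers are omitted, and the 3x3 `confidence` map and the function names are made up for the example.

```python
import math

def spatial_softmax(confidence_map):
    """Normalize a 2-D confidence map so that all entries sum to 1."""
    flat = [v for row in confidence_map for v in row]
    m = max(flat)  # subtract the maximum for numerical stability
    exps = [[math.exp(v - m) for v in row] for row in confidence_map]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]

def most_probable_location(prob_map):
    """Return the (row, col) index of the largest probability."""
    positions = (
        (r, c)
        for r in range(len(prob_map))
        for c in range(len(prob_map[0]))
    )
    return max(positions, key=lambda rc: prob_map[rc[0]][rc[1]])

# Hypothetical single-channel confidence map (output of the second conv layer).
confidence = [
    [0.1, 0.2, 0.0],
    [0.3, 2.5, 0.4],
    [0.0, 0.1, 0.2],
]
probs = spatial_softmax(confidence)
location = most_probable_location(probs)  # test-time part location
```

Because the softmax is taken over all spatial positions at once, each part's score map becomes a probability distribution over locations, which is what allows the location to be sampled during training and chosen by argmax at test time.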

Fine-Grained Classification:

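The classification component contains a convolutional network for each part as well as for the whole image. The classification network for each location is a fully convolutional layer followed by a softmax layer. The final prediction is the mean of the scores from all individual classifiers.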
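The final averaging step can be illustrated with a small sketch. The three score vectors below (whole image plus two parts, three classes) are invented numbers, and `average_scores` is a hypothetical helper name; the only thing taken from the text is that the individual classifier scores are averaged with equal weight.

```python
def average_scores(score_lists):
    """Average per-class scores over all individual classifiers."""
    n = len(score_lists)
    num_classes = len(score_lists[0])
    return [sum(s[k] for s in score_lists) / n for k in range(num_classes)]

# Hypothetical softmax outputs for a 3-class problem.
whole_image = [0.6, 0.3, 0.1]
part_1 = [0.5, 0.4, 0.1]
part_2 = [0.7, 0.2, 0.1]

final = average_scores([whole_image, part_1, part_2])
predicted = max(range(len(final)), key=final.__getitem__)  # index of the top class
```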

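The whole attention problem can be viewed as a Markov Decision Process (MDP). During each time step of the MDP, the FCANs work as an agent that performs an action based on the observation and receives a reward. In this paper, the action corresponds to the location of the attention region, the observation corresponds to the input image and the crops of the attention regions, and the reward corresponds to the classification score obtained using the attention region.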
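One MDP step can be sketched as follows. The text above does not spell out the update rule, so this sketch assumes the standard REINFORCE policy-gradient estimator, which may differ from the paper's exact training procedure (including its greedy reward strategy). The four-location `logits`, the fixed `reward` of 1.0, and all function names are hypothetical.

```python
import math
import random

def softmax(logits):
    """Turn confidence values into a probability distribution (the policy)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(probs, rng):
    """Action: sample a (flattened) attention location from the policy."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def reinforce_gradient(probs, action, reward):
    """Gradient of reward * log pi(action) with respect to the logits."""
    return [
        reward * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]

rng = random.Random(0)
logits = [0.0, 1.0, 0.5, -0.5]  # hypothetical confidences for 4 locations
probs = softmax(logits)
action = sample_action(probs, rng)
reward = 1.0  # hypothetical: stands in for the classification-based reward
grad = reinforce_gradient(probs, action, reward)
```

With a positive reward, the gradient raises the logit of the sampled location and lowers the others, so regions that lead to better classification become more likely to be attended to.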