Fine-Grained Classification: Diversification Block (DB) + Gradient-boosting Cross Entropy (GCE) (Part 1)

Article link: https://blog.csdn.net/dazheng121/article/details/124390171

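This article explains how to understand and implement the Diversification Block, a module that suppresses the most salient regions of a feature map; random peak suppression and patch-level suppression are meant to improve the network's generalization. The code example shows a PyTorch implementation of DB and how it operates on feature maps.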

Preface

This post records my process of learning DB and gives a PyTorch implementation of it.

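I. Reference Paper

Fine-grained Recognition: Accounting for Subtle Differences between Similar Classes
After reading the paper, I still find its idea of "suppressing the most salient features to force the network to learn other features" inspiring. Unfortunately, the code for the paper has not been open-sourced.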

II. Introduction to the Diversification Block

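The DB (diversification block) suppresses the most salient regions of a feature map, forcing the network to learn from other parts.
Peak Suppression randomly suppresses peak positions: for the location of the maximum value in the feature map, a Bernoulli draw decides whether it is masked.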

(Does Ppeak = 1 in the paper mean that the peak is always suppressed?)
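To make the peak step concrete, here is a minimal sketch of how I read it. It assumes one Bernoulli draw per feature map; the function name `peak_suppression_mask` and the parameter `p_peak` are my own placeholders, not taken from the paper's code (which is not public). Under this reading, a probability of 1 makes the draw always succeed, so the peak would always be marked.

```python
import torch

def peak_suppression_mask(feature_map: torch.Tensor, p_peak: float = 0.5) -> torch.Tensor:
    """Return a 0/1 mask that marks the peak of a 2-D map with probability p_peak.

    Assumption: a single Bernoulli draw per map decides whether the peak is marked.
    """
    # 1 at every position equal to the global maximum, 0 elsewhere.
    peak = (feature_map == feature_map.max()).to(feature_map.dtype)
    # Draw 1 with probability p_peak, otherwise 0.
    draw = torch.bernoulli(torch.tensor(p_peak, dtype=feature_map.dtype))
    return peak * draw

fm = torch.tensor([[0.1, 0.9], [0.3, 0.2]])
always = peak_suppression_mask(fm, p_peak=1.0)  # the peak is always marked
never = peak_suppression_mask(fm, p_peak=0.0)   # the peak is never marked
```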

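Patch Suppression covers regions that are not peaks but are still worth suppressing: the feature map is first divided into patches, and each patch is then suppressed or kept according to its own Bernoulli draw.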
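The patch step can be sketched in the same spirit. This is only an illustration under my own assumptions: the grid size (`rows`, `cols`), the name `p_patch`, and the choice to let the last patch absorb any leftover rows or columns are all hypothetical, not confirmed by the paper.

```python
import torch

def patch_suppression_mask(feature_map: torch.Tensor, rows: int = 3, cols: int = 4,
                           p_patch: float = 0.5) -> torch.Tensor:
    """Return a 0/1 mask where each patch of a rows x cols grid is marked with probability p_patch."""
    h, w = feature_map.shape
    # One independent Bernoulli draw per patch: 1 means "suppress this patch".
    draws = torch.bernoulli(torch.full((rows, cols), p_patch))
    # Map every pixel row/column to its patch index; the last patch absorbs any remainder.
    row_idx = torch.div(torch.arange(h), max(h // rows, 1), rounding_mode="floor")
    col_idx = torch.div(torch.arange(w), max(w // cols, 1), rounding_mode="floor")
    row_idx = torch.clamp(row_idx, max=rows - 1)
    col_idx = torch.clamp(col_idx, max=cols - 1)
    # Expand the per-patch draws to a per-pixel mask of shape [h, w].
    return draws[row_idx][:, col_idx].to(feature_map.dtype)

fm = torch.rand(6, 8)
all_patches = patch_suppression_mask(fm, p_patch=1.0)  # every patch marked
no_patches = patch_suppression_mask(fm, p_patch=0.0)   # no patch marked
```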

Finally, Activation Suppression decides whether each position is suppressed and how strongly.
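As a rough illustration of this last step, the sketch below assumes that "suppression strength" means multiplying masked activations by a factor smaller than 1 and leaving the rest untouched. The name `alpha` and the value 0.1 are placeholders I chose for the example, not values taken from this post.

```python
import torch

def activation_suppression(feature_map: torch.Tensor, mask: torch.Tensor,
                           alpha: float = 0.1) -> torch.Tensor:
    """Scale activations by alpha where mask == 1 and keep them unchanged elsewhere.

    Assumption: alpha is a hypothetical suppression factor chosen for illustration.
    """
    return torch.where(mask.bool(), alpha * feature_map, feature_map)

fm = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
mask = torch.tensor([[0.0, 1.0], [0.0, 0.0]])
suppressed = activation_suppression(fm, mask)  # only the masked entry is scaled
```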

III. Code Implementation

1. Code reference

https://github.com/JerryMazeyu/fine-grained2019AAAI

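2. Issues (my own understanding)

In the reference code, pk (Ppeak in the paper) is never used, and neither Ppatch nor the Bernoulli sampling described in the paper is reflected.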

3. Implementation

import numpy as np
import torch
from torch import nn

class DiversificationBlock(nn.Module):

    def __init__(self, pk=0.5, r=3, c=4):
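        """
        Implements the diversification block from the paper. Takes a 2-D or 3-D
        feature map and returns a list of NumPy arrays to be used as masks.
        :param pk: probability of random masking in bc' (stored but not used below)
        :param r: number of row blocks in bc''
        :param c: number of column blocks in bc''
        """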
        super(DiversificationBlock, self).__init__()
        self.pk = pk
        self.r = r
        self.c = c

    def forward(self, feature_maps):
        def helperb1(feature_map):
            # Peak mask: 1 at every position equal to the global maximum.
            row, col = torch.where(feature_map == torch.max(feature_map))
            b1 = torch.zeros_like(feature_map)
            b1[row, col] = 1
            return b1

        def from_num_to_block(mat, r, c, num):
            # Patch mask: 1 inside block number `num` (1-based, row-major) of an r x c grid.
            assert len(mat.shape) == 2, "Feature map shape is wrong!"
            row, col = mat.shape
            res = np.zeros((row, col), dtype=np.float32)
            block_r, block_c = int(row / r), int(col / c)
            index = np.arange(r * c) + 1
            index = index.reshape(r, c)
            index_r, index_c = np.argwhere(index == num)[0]
            # The last block in each direction extends to the edge of the map.
            if index_c + 1 == c:
                end_c = col
            else:
                end_c = (index_c + 1) * block_c
            if index_r + 1 == r:
                end_r = row
            else:
                end_r = (index_r + 1) * block_r
            res[index_r * block_r: end_r, index_c * block_c:end_c] = 1
            return res

        if len(feature_maps.shape) == 3:
            resb1 = []
            resb2 = []
            # One [1, H, W] slice per channel.
            feature_maps_list = torch.split(feature_maps, 1)
            for feature_map in feature_maps_list:
                # Drop only the leading channel dimension.
                feature_map = feature_map.squeeze(0)
                tmp = helperb1(feature_map)
                resb1.append(tmp)
                # The block number is fixed to 3 here, not drawn at random.
                tmp1 = from_num_to_block(feature_map, self.r, self.c, 3)
                resb2.append(tmp1)

        elif len(feature_maps.shape) == 2:
            tmp = helperb1(feature_maps)
            tmp1 = from_num_to_block(feature_maps, self.r, self.c, 3)
            resb1 = [tmp]
            resb2 = [tmp1]

        else:
            raise ValueError("feature_maps must be 2-D or 3-D")
        # Union of the peak mask and the patch mask, clipped to {0, 1}.
        res = [np.clip(resb1[x].numpy() + resb2[x], 0, 1) for x in range(len(resb1))]
        return res

if __name__ == '__main__':
    feature_maps = torch.rand([3,3,4])
    print("feature maps is: ", feature_maps)
    db = DiversificationBlock()
    res = db(feature_maps)
    print(res, len(res))

IV. Notes

torch.split(feature_maps, 1)

Splits feature_maps along dim=0 (the default) into chunks of size 1, so a [C, H, W] tensor becomes C tensors of shape [1, H, W].

torch.split(tensor, split_size_or_sections, dim=0)
tensor: the input tensor to split
split_size_or_sections: size of each chunk (int) or list of sizes for each chunk
dim: dimension along which to split
output: a tuple of tensors (<class 'tuple'>)
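A quick check of how `torch.split` behaves on a tensor shaped like the one in the test code above; the variable names are only for illustration.

```python
import torch

feature_maps = torch.rand(3, 3, 4)

# split_size=1 along the default dim=0: one chunk per channel.
chunks = torch.split(feature_maps, 1)
print(type(chunks), len(chunks))   # <class 'tuple'> 3
print(chunks[0].shape)             # torch.Size([1, 3, 4])

# Splitting along another dimension: two chunks of width 2 along dim=2.
by_cols = torch.split(feature_maps, 2, dim=2)
print([tuple(c.shape) for c in by_cols])   # [(3, 3, 2), (3, 3, 2)]
```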

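numpy.squeeze(a, axis=None)

Removes single-dimensional entries from the shape of an array, i.e. drops every dimension whose size is 1; dimensions of any other size are left alone. The code above calls the tensor method feature_map.squeeze(), which behaves the same way on a torch tensor.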
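A small example of the difference between squeezing all size-1 dimensions and squeezing only a chosen one; the tensors here are made up for illustration.

```python
import torch

x = torch.rand(1, 3, 4)
print(x.squeeze().shape)    # torch.Size([3, 4])
print(x.squeeze(0).shape)   # torch.Size([3, 4])

# A map whose height happens to be 1: squeeze() removes that dimension too.
y = torch.rand(1, 1, 4)
print(y.squeeze().shape)    # torch.Size([4])
print(y.squeeze(0).shape)   # torch.Size([1, 4])
```

Passing the dimension explicitly keeps the result 2-D even when the height or width is 1.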

torch.where(feature_map == torch.max(feature_map))

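Finds the coordinates of the maximum value in feature_map. When called with only a condition, torch.where returns a tuple of index tensors (one per dimension) for the positions where the condition is true.

torch.where(condition, x, y) → Tensor
Returns a tensor of elements selected from x or y according to the condition: where the condition holds, the element comes from x; otherwise it comes from y.
torch.max(input, dim)
input is any tensor, and dim is the dimension to reduce: for a 2-D tensor, dim=0 gives the maximum of each column and dim=1 gives the maximum of each row.
The function returns two tensors: the maximum values along dim, and the indices of those maxima. Called without dim, as in the code above, torch.max(input) returns a single tensor holding the global maximum.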
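The three call forms mentioned above can be tried on a small tensor; the values are made up so that the maximum appears twice.

```python
import torch

fm = torch.tensor([[0.2, 0.9, 0.1],
                   [0.9, 0.3, 0.4]])

# Condition-only form: coordinates of every position equal to the global maximum.
rows, cols = torch.where(fm == torch.max(fm))
print(rows, cols)        # tensor([0, 1]) tensor([1, 0])

# torch.max with dim=1: per-row maxima and their column indices.
values, indices = torch.max(fm, dim=1)
print(indices)           # tensor([1, 0])

# Three-argument form: keep entries above 0.5, replace the rest with 0.
picked = torch.where(fm > 0.5, fm, torch.zeros_like(fm))
```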

b1 = torch.zeros_like(feature_map)

torch.zeros_like()
Returns a tensor filled with zeros that has the same shape as the tensor passed in.

index_r, index_c = np.argwhere(index == num)[0]

Gets the coordinates of the first entry of index that equals num.
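Applied to the block grid used in the implementation (r=3, c=4, num=3), the lookup works out as follows. This just replays the same NumPy calls outside the class.

```python
import numpy as np

r, c, num = 3, 4, 3
# Blocks are numbered 1..r*c in row-major order.
index = (np.arange(r * c) + 1).reshape(r, c)
print(index)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]

index_r, index_c = np.argwhere(index == num)[0]
print(index_r, index_c)   # 0 2
```

So with num fixed to 3, the selected block is always the one in the first row and third column of the grid.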

numpy.argwhere(a)

>>> x = np.arange(6).reshape(2,3)
>>> x
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.argwhere(x>1)
array([[0, 2],
       [1, 0],
       [1, 1],
       [1, 2]])