- Three components: the 2048-channel feature map, the prototype (the importance of each channel), and GAP
- feature map * prototype gives the CAM; some CAM values are very large, up to 300+
- If we assume the feature map can localize sub-discriminative regions and only the prototype fails to pick them, the fix is to design multiple groups of prototypes
- If we assume the feature map itself cannot show sub-discriminative regions, the fix is to erase/suppress the strongly discriminative region: erase it on the input image, or suppress it on the feature map
- Or correct the feature map directly, e.g. add a foreground branch and multiply it with the feature map
Anything that gets visualized lies in [0, 1]: does a value of 389 map to a color? No. Both feature maps and CAMs are normalized before visualization.
If a sub-discriminative region is absent from the feature map, no weight w, however large or small, can bring it back. So the feature map must contain the sub-discriminative region before it can possibly appear in the CAM, e.g. by clipping the strongest responses with f[f > 0.7] = 0.7
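A toy numpy sketch of that clipping trick (values are illustrative): capping the strongest responses before normalization raises the relative value of the secondary region, so it survives visualization.

```python
import numpy as np

# One toy feature-map channel: a strong peak (primary discriminative
# region, 1.0) and a weaker peak (sub-discriminative region, 0.4).
f = np.array([[0.05, 0.10, 0.05],
              [0.10, 1.00, 0.40],
              [0.05, 0.10, 0.05]], dtype=np.float32)

def normalize(x):
    return (x - x.min()) / (x.max() - x.min() + 1e-5)

plain = normalize(f)          # weak peak is ~0.37 after normalization

clipped = f.copy()
clipped[clipped > 0.7] = 0.7  # the f[f > 0.7] = 0.7 clip from the note
clipped = normalize(clipped)  # weak peak rises to ~0.54
```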
Multiple prototypes:
self.classifier = nn.Conv2d(2048, (self.num_cls)*8, 1, bias=False)
def forward(self, x, valid_mask):
    N, C, H, W = x.size()
    # backbone forward
    x0 = self.stage0(x)
    x1 = self.stage1(x0)
    x2 = self.stage2(x1).detach()
    x3 = self.stage3(x2)
    x4 = self.stage4(x3)
    cam = self.classifier(x4)  # (N, 20*8, h, w): 8 prototype maps per class
    # fuse the 8 prototype maps of each class by averaging
    batch_size, cc, hh, ww = cam.size()
    cam_multimaps = cam.view(batch_size, 20, 8, hh, ww)
    cam = torch.sum(cam_multimaps, 2) / 8
    score = F.adaptive_avg_pool2d(cam, 1)  # GAP -> (N, 20, 1, 1)
Even a tiny highlighted red spot means the values fed into GAP are already large; maxima of 351 were observed, verified experimentally.
The prototype encodes the importance of each feature map channel.
self.classifier = nn.Conv2d(2048, 20, 1, bias=False)

def forward(self, x, valid_mask):
    N, C, H, W = x.size()
    # backbone forward
    x0 = self.stage0(x)
    x1 = self.stage1(x0)
    x2 = self.stage2(x1).detach()
    x3 = self.stage3(x2)
    x4 = self.stage4(x3)
    cam_20ch = self.classifier(x4)  # 2048 -> 20 channels, one per class
    # No ReLU here: negative responses can partially cancel positives,
    # which tends to enlarge the discriminative region.
    score = F.adaptive_avg_pool2d(cam_20ch, 1).view(N, -1)  # GAP -> (N, 20)

loss_cls = F.multilabel_soft_margin_loss(score, label)
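To make the role of the prototype concrete, here is a toy numpy sketch (4 channels standing in for 2048, invented values): the CAM of a class is the prototype-weighted sum of feature-map channels, which is exactly what the 1x1 convolution computes, and GAP of that CAM is the class score.

```python
import numpy as np

C, H, W = 4, 2, 2  # toy sizes instead of 2048 channels
feat = np.arange(C * H * W, dtype=np.float32).reshape(C, H, W)
proto = np.array([0.5, -0.2, 0.0, 1.0], dtype=np.float32)  # one class's prototype

# CAM = sum_c proto[c] * feat[c]; the 1x1 conv does this at every pixel.
cam = np.tensordot(proto, feat, axes=1)  # (H, W)

# GAP over the CAM gives the classification score for this class.
score = cam.mean()
```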
Semantic segmentation results:
classification (PASCAL VOC 2012 train set)
classification → CRF (PASCAL VOC 2012 train set)
classification → CRF → IRN → DeepLab (PASCAL VOC 2012 val and test sets)
- PASCAL VOC 2012 train set: vanilla CAM with ResNet-50 reaches about 48.6%; "Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation" (SIPE) reaches 58.6% with ResNet-50
- ResNet-38 results are a bit better than ResNet-50
- CRF post-processing is very common and brings roughly a 5% gain; the CRF code is in IRN's cam_to_ir_label.py
CAM visualization only covers the foreground heatmaps; for the final segmentation result, the background region has to be brought in.
Background handling ①: IRN simply fixes the background score at 0.15, which gives the best result; SEAM (Wang Yude's paper) and AdvCAM instead sweep thresholds 0.15, 0.16, 0.17, 0.18, 0.19, ...
AdvCAM/run_sample.py at main · jbeomlee93/AdvCAM · GitHub, around line 106
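A hedged numpy sketch of the fixed-threshold trick (toy CAM values; the real script sweeps the thresholds and keeps whichever gives the best train-set mIoU):

```python
import numpy as np

# Toy normalized CAMs for 2 foreground classes over a 2x2 image.
cams = np.array([[[0.9, 0.05],
                  [0.1, 0.2]],
                 [[0.1, 0.05],
                  [0.5, 0.1]]], dtype=np.float32)

def add_bg_channel(cams, bg_thresh):
    # Concatenate a constant background score as channel 0 (IRN uses 0.15);
    # argmax then assigns background wherever every foreground CAM is below it.
    bg = np.full((1,) + cams.shape[1:], bg_thresh, dtype=cams.dtype)
    return np.concatenate([bg, cams], axis=0).argmax(0)

# Sweep candidate thresholds the way the sweep above describes.
for t in (0.15, 0.16, 0.17, 0.18, 0.19):
    pred = add_bg_channel(cams, t)  # 0 = background, 1..N = classes
```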
Background handling ②: use saliency maps as an auxiliary cue
- Download saliency maps used for background cues.
- GitHub - qjadud1994/DRS: Discriminative Region Suppression for Weakly-Supervised Semantic Segmentation
DRS uses saliency like this:
for idx, dat in tqdm(enumerate(val_loader)):
    img, label, sal_map, gt_map, _ = dat
    logit, cam = model(img, label)

    """ obtain CAMs """
    cam = cam.cpu().detach().numpy()
    gt_map = gt_map.detach().numpy()
    sal_map = sal_map.detach().numpy()

    """ segmentation label generation """
    cam[cam < 0.2] = 0  # object cue
    B, _, H, W = cam.shape
    bg = np.zeros((B, 1, H, W), dtype=np.float32)
    pred_map = np.concatenate([bg, cam], axis=1)  # [B, 21, H, W]
    pred_map[:, 0, :, :] = (1. - sal_map)  # background cue: low saliency -> background
    pred_map = pred_map.argmax(1)

    mIOU.add_batch(pred_map, gt_map)
Background handling ③: the SIPE paper ("Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation") uses the method of "Unlocking the Potential of Ordinary Classifier: Class-specific Adversarial Erasing Framework for Weakly Supervised Semantic Segmentation"
===============================================================
Processing the 20 foreground channels: many papers inject the image-level GT label here, turning the channels of absent classes entirely to 0.
From Wang Yude's code:
e = 1e-5  # small epsilon for numerical stability (not defined in the original snippet)
cam = F.relu(cam)
max_v = torch.max(cam.view(N, C, -1), dim=-1)[0].view(N, C, 1, 1)
min_v = torch.min(cam.view(N, C, -1), dim=-1)[0].view(N, C, 1, 1)
cam = F.relu(cam - min_v - e) / (max_v - min_v + e)  # per-channel min-max normalization
cam = cam * label  # zero out channels of classes absent from the image
cam[:, 0, :, :] = 1 - torch.max(cam[:, 1:, :, :], dim=1)[0]  # background = 1 - max foreground
cam_max = torch.max(cam[:, 1:, :, :], dim=1, keepdim=True)[0]
cam[:, 1:, :, :][cam[:, 1:, :, :] != cam_max] = 0  # keep only the strongest class per pixel
Additionally, the following is applied:
cam[cam < 0.2] = 0
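The background and keep-max steps above can be checked on toy data; the numpy sketch below (batch 1, two foreground classes, a 1x2 grid, invented values) mirrors the in-place masked assignment:

```python
import numpy as np

# cam: (N, 1 + num_fg, H, W); channel 0 is reserved for background.
cam = np.array([[[[0.0, 0.0]],    # background, filled below
                 [[0.8, 0.2]],    # foreground class 1
                 [[0.3, 0.6]]]],  # foreground class 2
               dtype=np.float32)

# Background = 1 - max over the foreground channels.
cam[:, 0] = 1 - cam[:, 1:].max(axis=1)

# Keep only the strongest foreground channel at each pixel, zero the rest.
fg = cam[:, 1:]                    # a view into cam
cam_max = fg.max(axis=1, keepdims=True)
fg[fg != cam_max] = 0              # in-place, so cam is updated too

# Finally, drop weak responses.
cam[cam < 0.2] = 0
```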
===============================================================
Method 1: obtain the CAM directly from the 20×128×128 feature map, unlike Bolei Zhou's original CAM.
In the classification code, the feature map goes from 2048 channels to 20, with no ReLU; GAP is applied directly, giving a logit of shape (20, 1) that is fed straight into F.multilabel_soft_margin_loss, as below.
F.multilabel_soft_margin_loss applies sigmoid internally, mapping values into (0, 1):
loss = F.multilabel_soft_margin_loss(logit, label)
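For reference, the loss can be reproduced by hand; the numpy sketch below (single sample, invented logits) implements the documented formula: mean over classes of binary cross-entropy on sigmoid(logit).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multilabel_soft_margin(logit, label):
    # Mean over classes of BCE on sigmoid(logit), the formula
    # F.multilabel_soft_margin_loss documents.
    p = sigmoid(logit)
    return -np.mean(label * np.log(p) + (1 - label) * np.log(1 - p))

logit = np.array([2.0, -1.0, 0.5])
label = np.array([1.0, 0.0, 1.0])
loss = multilabel_soft_margin(logit, label)  # ~0.305
```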
Method 2:
Both the IRN and SEAM codebases follow the same three steps:
1. train the classification network
2. make_cam
3. eval_cam (check the segmentation mIoU)
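Step 3's metric can be sketched minimally (a hypothetical helper, not IRN's actual eval_cam code): mIoU is computed from a confusion matrix between the predicted and ground-truth label maps.

```python
import numpy as np

def miou(pred, gt, num_cls):
    # Accumulate a confusion matrix, ignoring pixels labelled >= num_cls
    # (e.g. the 255 "ignore" label in PASCAL VOC).
    conf = np.zeros((num_cls, num_cls), dtype=np.int64)
    mask = gt < num_cls
    np.add.at(conf, (gt[mask], pred[mask]), 1)
    inter = np.diag(conf)
    union = conf.sum(0) + conf.sum(1) - inter
    return (inter / np.maximum(union, 1)).mean()

pred = np.array([[0, 1], [1, 1]])
gt   = np.array([[0, 1], [0, 1]])
m = miou(pred, gt, 2)  # IoU = [1/2, 2/3] -> mIoU = 7/12
```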
CAM characteristics: over-activation & under-activation.
For small objects, the CAM covers the whole object but also over-activates some background regions; this makes the prediction spill past the object boundary, giving false detections.
For large objects, the CAM covers only the most salient region: non-detection, i.e. false negatives (FNs).