[Paper Reading -- WSSS] Extracting Class Activation Maps from Non-Discriminative Features as well

Nastu_Ho-小何同学

Last modified: 2023-04-09 11:01:57

Category: Weakly Supervised Semantic Segmentation. Tags: paper reading, machine learning, clustering

First published: 2023-04-09 10:45:22

Article link: https://blog.csdn.net/weixin_41845840/article/details/130039200

Paper link: https://arxiv.org/abs/2303.10334
Code link: https

Method

A plug-and-play method.

all training images → image features

Train a multi-label classifier with the backbone.

image features → clusters

Note: the paper runs clustering separately for each class.

First, threshold the CAM to separate foreground from background.

$f(x)^{i,j}\in\begin{cases}\mathcal{F},&\textrm{if}\operatorname{CAM}_{n}^{i,j}(x)\geq\tau\\ \mathcal{B},&\textrm{otherwise}\end{cases}$
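The thresholding rule above can be sketched in a few lines of NumPy. This is only an illustration: the function name `split_by_cam`, the array layout, and the value `tau=0.5` are assumptions made for the example, not details given in the post.

```python
import numpy as np

def split_by_cam(features, cam, tau=0.5):
    """Split per-pixel features into foreground / background sets.

    features: (H, W, D) feature map f(x).
    cam:      (H, W) CAM of class n, assumed already normalized to [0, 1].
    tau:      threshold; 0.5 is an illustrative value, not taken from the post.
    """
    mask = cam >= tau              # True where CAM_n^{i,j}(x) >= tau
    foreground = features[mask]    # (N_fg, D) features assigned to F
    background = features[~mask]   # (N_bg, D) features assigned to B
    return foreground, background

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 4, 8))
cam = np.zeros((4, 4))
cam[:2, :] = 1.0                   # pretend the top half is activated
fg, bg = split_by_cam(feats, cam, tau=0.5)
```

In this toy case the top two rows end up in the foreground set and the bottom two rows in the background set.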

K-means is then run separately on the foreground features and the background features to obtain the foreground and background cluster centers.

$\mathbf{F}=\{\mathbf{F}_1,\cdots,\mathbf{F}_K\}$, $\mathbf{B}=\{\mathbf{B}_1,\cdots,\mathbf{B}_K\}$
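As a rough sketch of the clustering step, here is a plain Lloyd's K-means in NumPy applied to the two feature sets. The helper `kmeans`, the random stand-in features, and the value `K = 3` are all assumptions for illustration; in practice a library implementation would normally be used, and the features would be collected per class over the training images.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means; returns (k, D) cluster centers."""
    rng = np.random.default_rng(seed)
    # Initialize the centers from k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center: (N, k).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:   # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    return centers

rng = np.random.default_rng(1)
fg_feats = rng.normal(size=(200, 16))   # stand-in for foreground features of one class
bg_feats = rng.normal(size=(300, 16))   # stand-in for background features of one class
K = 3                                   # illustrative number of clusters
F = kmeans(fg_feats, K)                 # foreground cluster centers
B = kmeans(bg_feats, K)                 # background cluster centers
```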
clusters → local prototypes

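Because the CAM may be incomplete or largely inaccurate, the cluster centers obtained by simple thresholding followed by clustering are not necessarily of good quality. The centers are therefore scored against the classifier weights, and only the high-quality ones are kept.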

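For the foreground cluster centers, only those with high confidence are kept, so a fairly high threshold is placed on the prediction score, e.g. 0.9. The selected foreground cluster centers are $\mathbf{F}'=\{\mathbf{F}'_1,\cdots,\mathbf{F}'_{K'_1}\}$.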

$z_i=\dfrac{\exp\left(\mathbf{F}_i\cdot\mathbf{w}_n\right)}{\sum_j\exp\left(\mathbf{F}_i\cdot\mathbf{w}_j\right)}.$

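The background cluster centers are selected in a similar way, but the threshold is not set to a very low value such as 0.1. Because CAM tends to mistake co-occurring background (context) for foreground, the threshold is set to 0.5, which gives the selected background cluster centers $\mathbf{B}'=\{\mathbf{B}'_1,\cdots,\mathbf{B}'_{K'_2}\}$. (Later, subtracting the background map from the foreground map mitigates the false activation of co-occurring background.)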

$z_i=\dfrac{\exp\left(\mathbf B_i\cdot\mathbf w_n\right)}{\sum_j\exp\left(\mathbf B_{i}\cdot\mathbf w_j\right)}.$
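The two selection rules can be sketched as follows. The names `class_scores`, `select_prototypes`, `mu_f` and `mu_b` are made up for this example. The sketch also assumes that a background center is kept when its score falls below the 0.5 threshold; the post gives the threshold value but does not spell out the direction of the comparison.

```python
import numpy as np

def class_scores(centers, weights, n):
    """Softmax score z_i of class n for every cluster center.

    centers: (K, D) cluster centers; weights: (C, D) classifier weights w_j.
    """
    logits = centers @ weights.T                          # (K, C) dot products
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return (exp / exp.sum(axis=1, keepdims=True))[:, n]

def select_prototypes(fg_centers, bg_centers, weights, n, mu_f=0.9, mu_b=0.5):
    z_f = class_scores(fg_centers, weights, n)
    z_b = class_scores(bg_centers, weights, n)
    fg_protos = fg_centers[z_f > mu_f]   # confident foreground centers
    # Assumption: background centers are kept when their score is BELOW mu_b.
    bg_protos = bg_centers[z_b < mu_b]
    return fg_protos, bg_protos

weights = np.array([[10.0, 0.0], [0.0, 10.0]])   # toy weights, C = 2 classes
fg_centers = np.array([[1.0, 0.0], [0.1, 0.0]])
bg_centers = np.array([[0.0, 1.0], [1.0, 0.0]])
fg_protos, bg_protos = select_prototypes(fg_centers, bg_centers, weights, n=0)
```

With these toy numbers, only the first foreground center clears the 0.9 threshold, and only the first background center scores below 0.5 for class 0.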
local prototype → generate LPCAM

$\begin{aligned}FG_n&=\frac{1}{K_1'}\sum_{\mathbf{F}_i'\in\mathbf{F}'}\operatorname{sim}\left(f(x),\mathbf{F}_i'\right),\\ BG_n&=\frac{1}{K_2'}\sum_{\mathbf{B}_i'\in\mathbf{B}'}\operatorname{sim}\left(f(x),\mathbf{B}_i'\right),\end{aligned}$

Here $f(x)$ is the output of the last layer, without any pooling.

This yields a foreground map and a background map. Because they are computed with cosine similarity, their values lie in [-1, 1].
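A minimal NumPy sketch of the two maps, assuming a feature map of shape (H, W, D) and prototypes of shape (K', D). The function name `mean_cosine_map` and the small `eps` guard against zero-norm vectors are additions for the example.

```python
import numpy as np

def mean_cosine_map(features, prototypes, eps=1e-8):
    """Average cosine similarity between each pixel feature and a prototype set.

    features: (H, W, D); prototypes: (K', D). Returns an (H, W) map.
    """
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    sims = f @ p.T              # (H, W, K'): cosine similarity per prototype
    return sims.mean(axis=-1)   # average over the K' prototypes

features = np.array([[[1.0, 0.0], [0.0, 1.0]]])   # toy 1 x 2 feature map, D = 2
fg_protos = np.array([[2.0, 0.0]])
bg_protos = np.array([[0.0, 3.0]])
FG = mean_cosine_map(features, fg_protos)          # foreground map FG_n
BG = mean_cosine_map(features, bg_protos)          # background map BG_n
```

The first pixel points in the same direction as the foreground prototype and the second pixel matches the background prototype, so the two maps light up on different pixels.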

$\begin{aligned}\boldsymbol{A}_n&=FG_n-BG_n,\\ \operatorname{LPCAM}_n(x)&=\dfrac{\operatorname{ReLU}\left(\boldsymbol{A}_n\right)}{\max\left(\operatorname{ReLU}\left(\boldsymbol{A}_n\right)\right)}\end{aligned}$

This gives the CAM for the current class.
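The final step, subtracting the background map from the foreground map and then applying ReLU and max-normalization, can be sketched as below. The function name `lpcam` and the `eps` term (a guard for the case where the map is all zeros) are assumptions added for the example.

```python
import numpy as np

def lpcam(fg_map, bg_map, eps=1e-8):
    """Combine foreground and background maps into a normalized activation map."""
    a = fg_map - bg_map          # A_n = FG_n - BG_n
    a = np.maximum(a, 0.0)       # ReLU
    return a / (a.max() + eps)   # scale so the largest activation is about 1

fg_map = np.array([[0.8, 0.2], [0.5, 0.1]])
bg_map = np.array([[0.0, 0.6], [0.1, 0.1]])
cam_n = lpcam(fg_map, bg_map)
```

In this toy case the pixel where the background map dominates is zeroed out by the ReLU, which is how the subtraction suppresses co-occurring background.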