Less is More: Fewer Interpretable Region via Submodular Subset Selection (ICLR 2024, oral)

exploreandconquer

Last modified: 2024-02-20 01:48:20


Category: Interpretability. Tags: computer vision, deep learning, artificial intelligence

First published: 2024-02-19 00:07:59

Article link: https://blog.csdn.net/Rad1ant_up/article/details/136154621


This paper was published at ICLR 2024 (oral).

Paper link: https://arxiv.org/pdf/2402.09164.pdf

I. Overview

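To explore artificial intelligence and apply it to the real world more soundly and effectively, building transparent and explainable models is essential. In explainable AI, image attribution algorithms are a typical interpretability approach: for a given image they produce saliency maps that explain which image regions are more important to model decisions, which gives a deeper understanding of how the model operates. Image attribution algorithms fall into two groups, white-box methods and black-box methods. The former rely mainly on the model's internal characteristics or decision gradients; the latter apply external perturbations and observe how the model's response changes. (The white-box methods mentioned in the paper are all post-hoc interpretability methods; I have reservations about whether they should really be called white-box.)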

Although attribution algorithms have been studied extensively, two limitations remain:

"Some small attribution regions are not fine-grained enough which may interfere with the optimization orientation. Most advanced attribution algorithms focus on understanding how the global image contributes to the prediction results of the deep model while ignoring the impact of local regions on the results." In short, the saliency map is too coarse and overlooks the contribution of small local regions to the prediction. (This oversight may even cut both ways: positive contributions can be missed and negative contributions misjudged. For example, not every region in the resulting saliency map necessarily contributes positively to the final prediction.)
"It is difficult to effectively attribute the latent cause of the error to samples with incorrect predictions. For incorrectly predicted samples, although the attribution algorithm can assign the correct class, it does not limit the response to the incorrect class, so that the attributed region may still have a high response to the incorrect class." Although the saliency map can reveal the regions that contribute positively to the correct class, nothing limits their response to the incorrect class. A potential danger is therefore that these regions contribute even more to a wrong class.

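To address these problems, the paper proposes a new method that reformulates attribution as a submodular subset selection problem. The authors hypothesize that local regions can achieve better interpretability, and they aim to reach higher interpretability with as few regions as possible.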

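The authors first divide an image into multiple sub-regions and then select a fixed number of them to achieve the best interpretability.
To address attribution regions that are insufficiently fine-grained (insufficient fine-graininess), they propose a "regional search" that keeps expanding the sub-region set.
They propose a new attribution mechanism that reveals from which perspective each region contributes to interpretability. Four aspects are considered: (1) the prediction confidence of the model (confidence); (2) the effectiveness of regional semantics (effectiveness); (3) the consistency of semantics (consistency); (4) the collective effect of the region (collaboration).
They design a submodular function to evaluate the importance of each subset and thereby limit the search for regions with wrong class responses. As a result, for correctly predicted samples the method can obtain higher prediction confidence with only a few regions as input than with all regions as input; for incorrectly predicted samples it can locate the specific cause of the error.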

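The method is evaluated on three datasets: two face datasets, Celeb-A and VGG-Face2, and one fine-grained dataset, CUB-200-2011. The main method compared against is HSIC-Attribution (SOTA), and the metrics are Insertion scores and Deletion scores. Finally, the authors also verify theoretically that the proposed function is indeed submodular.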

II. Method

1. Preliminaries

Let me explain in detail.

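Definition 3.1: Given a finite set $V$, a set function $\mathcal{F}$ maps any subset $S\subseteq V$ to a real number. Take any sets $S_a\subseteq S_b\subseteq V$ and an element $\alpha$ that is in neither $S_a$ nor $S_b$. Following the paper, $\mathcal{F}$ is called a submodular function if it satisfies both of the following conditions (strictly speaking, the second condition alone defines submodularity; the two together describe a monotone submodular function):

Monotonically non-decreasing: $\mathcal{F}\left(S_{b}\cup\{\alpha\}\right)-\mathcal{F}\left(S_{b}\right)\geq0$
Diminishing marginal returns: $\mathcal{F}\left(S_{a}\cup\{\alpha\}\right)-\mathcal{F}\left(S_{a}\right)\geq\mathcal{F}\left(S_{b}\cup\{\alpha\}\right)-\mathcal{F}\left(S_{b}\right)$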

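Monotonic non-decrease is easy to understand: adding elements to a set never lowers the function value. Diminishing marginal returns means that adding an element to a smaller set yields at least as much gain as adding the same element to a larger set. To maximize the marginal gain, we want each newly added element to bring the largest gain to the current set. This in effect determines the order in which elements are added, because the same element yields a different gain depending on when it is added.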
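
To make the two conditions concrete, here is a minimal sketch that checks them by brute force on a toy coverage function. The `COVERAGE` data and all function names are hypothetical and chosen only for illustration; this is not the paper's scoring function.

```python
from itertools import combinations

# Hypothetical toy data: each "region" covers a set of feature ids.
COVERAGE = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
}

def coverage(subset):
    """Set function F(S): number of distinct features covered by S."""
    covered = set()
    for name in subset:
        covered |= COVERAGE[name]
    return len(covered)

def gain(subset, element):
    """Marginal gain F(S ∪ {element}) - F(S)."""
    return coverage(set(subset) | {element}) - coverage(subset)

def is_monotone_submodular(ground):
    """Brute-force check of both conditions over all S_a ⊆ S_b ⊆ V."""
    ground = list(ground)
    subsets = [set(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for s_b in subsets:
        for s_a in subsets:
            if not s_a <= s_b:
                continue
            for alpha in set(ground) - s_b:
                if gain(s_b, alpha) < 0:                   # monotonicity violated
                    return False
                if gain(s_a, alpha) < gain(s_b, alpha):    # diminishing returns violated
                    return False
    return True

print(is_monotone_submodular(COVERAGE))
# Region "b" helps an empty set more than a set that already covers its features.
print(gain(set(), "b"), gain({"a", "c"}, "b"))
```

The last line illustrates the ordering effect described above: the same element is worth 2 when added first and 0 when added after `a` and `c`.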

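Problem Formulation: An image $\mathbf{I}$ is divided into a finite number ($m$) of sub-regions, each produced by a mask. The goal of the attribution problem is to maximize the submodular function $\mathcal{F}(S)$ using a limited number $k$ of sub-regions from the set $V$, that is, $\max_{S\subseteq V,\,|S|\leq k}\mathcal{F}(S)$. In this way the image attribution problem becomes a subset selection problem.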

2. Proposed Method

2.1 Sub-Region Division

Traditional methods usually split the image into patches, which ignores the semantic information of different regions. In contrast, the proposed method employs a sub-region division strategy that is informed and guided by an a priori saliency map. Specifically, the original image is first split into $N\times N$ patch regions. Next, an existing image attribution algorithm is used to compute a saliency map $\mathcal{A}\in \mathbb{R}^{w \times h}$ for the corresponding class of image $\mathbf{I}$, and $\mathcal{A}$ is resized to $N\times N$. Each value in the resized $\mathcal{A}$ then represents the importance of one patch.

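Let each sub-region contain $d$ patches of the original image. Since there are $m$ sub-regions in total, $d=N\times N/m$. The patches are then taken in order of importance and grouped, $d$ at a time, to form the sub-region set: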

$V=\left\{\mathbf{I}_{1}^{M},\mathbf{I}_{2}^{M},\cdots,\mathbf{I}_{m}^{M}\right\}$

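(Note: $V$ is the set of sub-regions, and $\mathbf{I}_{i}^{M}$ is the $i$-th sub-region. Each sub-region $\mathbf{I}_{i}^{M}$ is composed of $d$ different patches and has the same size $w\times h$ as the original image; adding all the sub-regions together gives back the original image $\mathbf{I}$.)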
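
The division step can be sketched roughly as follows. This is an illustrative NumPy version under several assumptions: the function name `divide_sub_regions` is hypothetical, the resize to $N\times N$ is approximated by averaging the saliency inside each patch, and the image sides are assumed divisible by $N$ (and $N\times N$ by $m$).

```python
import numpy as np

def divide_sub_regions(image, saliency, n=4, m=4):
    """Split `image` (h, w, c) into m sub-regions of d = n*n/m patches each,
    ordered by a prior saliency map of shape (h, w)."""
    h, w = saliency.shape
    ph, pw = h // n, w // n
    # Assumed stand-in for the resize step: average saliency per patch.
    patch_scores = saliency.reshape(n, ph, n, pw).mean(axis=(1, 3))
    # Patch indices sorted from most to least important.
    order = np.argsort(-patch_scores.ravel(), kind="stable")
    d = (n * n) // m
    regions = []
    for i in range(m):
        mask = np.zeros((h, w), dtype=image.dtype)
        for idx in order[i * d:(i + 1) * d]:
            r, c = divmod(int(idx), n)
            mask[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw] = 1
        # Each sub-region keeps the full image size; unselected pixels are 0.
        regions.append(image * mask[..., None])
    return regions

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
saliency = rng.random((8, 8))
V = divide_sub_regions(image, saliency, n=4, m=4)
# The sub-regions partition the image, so they sum back to the original.
print(len(V), V[0].shape, np.allclose(sum(V), image))
```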

2.2 Submodular Function Design

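In this section the authors first score sub-regions from four perspectives: the confidence, effectiveness, consistency, and collaboration (collective effect) mentioned in the overview.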

(1) Confidence Score

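The authors use evidence theory from Evidential Deep Learning (EDL) to quantify the uncertainty of a sample's prediction. For a $K$-class classification task and a given sample $\mathbf{x}$, the network is optimized with the following loss function: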

$\mathcal{L}_{EDL}=\sum\limits_{k_c=1}^K\mathbf{y}_{k_c}\left(\log S_{\mathrm{Dir}}-\log\left(\mathbf{e}_{k_c}+1\right)\right)$

From this, the confidence score of a sample $\mathbf{x}$ predicted by the network can be expressed as:

$s_\mathrm{conf.}\left(\mathbf{x}\right)=1-u=1-\frac{K}{\sum_{k_c=1}^K\left(\mathbf{e}_{k_c}+1\right)}$

(Note: in the later experiments, $\mathbf{x}$ can be the original image $\mathbf{I}$ or, for comparison, an image that keeps only a few sub-regions with all other regions masked to 0.)
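
As a small numeric illustration of the confidence formula, the sketch below assumes the network outputs a non-negative evidence vector $\mathbf{e}$ of length $K$ and reads $S_{\mathrm{Dir}}$ as the Dirichlet strength $\sum_{k_c}(\mathbf{e}_{k_c}+1)$, which is the usual EDL convention but is not spelled out in this post. The function name is hypothetical.

```python
import numpy as np

def confidence_score(evidence):
    """s_conf = 1 - u, with uncertainty u = K / sum_k (e_k + 1).
    `evidence` is a non-negative vector of length K."""
    evidence = np.asarray(evidence, dtype=float)
    k = evidence.size
    dirichlet_strength = np.sum(evidence + 1.0)  # assumed meaning of S_Dir
    return 1.0 - k / dirichlet_strength

no_evidence = confidence_score([0.0, 0.0, 0.0])      # u = 3/3, so confidence is 0
strong_evidence = confidence_score([27.0, 0.0, 0.0])  # u = 3/30, so confidence is 0.9
print(no_evidence, strong_evidence)
```

With no evidence at all the uncertainty is maximal and the confidence is zero; confidence approaches 1 only as the total evidence grows.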

(2) Effectiveness Score

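Different sub-regions may carry the same semantics, and we want to obtain as much valuable information as possible from as few sub-regions as possible. The sub-regions in the selected subset $S\subseteq V$ should therefore be as diverse as possible, avoiding too many sub-regions that carry the same semantic information.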

For an element $\alpha$ (that is, some sub-region $\mathbf{I}_{i}^{M}$) and a subset $S$, the effectiveness score of $\alpha$ is computed as:

$s_e\left(\alpha\mid S\right)=\min_{s_i\in S}\operatorname{dist}\left(F\left(\alpha\right),F\left(s_i\right)\right)$

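Here $F(\cdot)$ denotes a pre-trained feature extractor, and dist is set to the cosine distance (one minus the cosine similarity), so a larger value means less similar features.

The effectiveness score of a set $S$ is:

$s_\mathrm{eff.}\left(S\right)=\sum_{s_i\in S}\min_{s_j\in S,s_i\neq s_j}\operatorname{dist}\left(F\left(s_i\right),F\left(s_j\right)\right)$

That is, for each element of $S$, compute the minimum cosine distance between its feature $F(s_i)$ and the features of all other elements, then sum these minima. (An element $s_i$ is simply a sub-region containing $d$ image patches.)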

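Put differently, if the sub-regions in the selected $S$ carry semantic information that differs from one another, i.e. they are diverse, the effectiveness score is high; if different sub-regions in $S$ carry the same semantic information, the score is low. This promotes both the diversity and the quality of the sub-regions in the subset.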

For example, in the figure the yellow and green boxes denote the sub-region sets $S_1$ and $S_2$. Because $S_2$ carries more diverse semantic information, it receives the higher effectiveness score.
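
The set-level effectiveness score can be sketched as below, assuming dist is the cosine distance (one minus cosine similarity) and that features have already been extracted. The 2-D vectors are made-up stand-ins for $F(s_i)$, and the function names are hypothetical.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def effectiveness_score(features):
    """s_eff(S): for each element, distance to its nearest other element, summed."""
    total = 0.0
    for i, f_i in enumerate(features):
        others = [cosine_distance(f_i, f_j)
                  for j, f_j in enumerate(features) if j != i]
        if others:  # a single-element set has no neighbours to compare with
            total += min(others)
    return total

# Two of the three "sub-regions" carry identical semantics.
redundant = [np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
# All three point in different directions.
diverse = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, 0.0])]

score_redundant = effectiveness_score(redundant)
score_diverse = effectiveness_score(diverse)
print(score_redundant, score_diverse)
```

The redundant set is penalized because each duplicated element has a nearest neighbour at distance 0, which is the behaviour the text describes for sets with repeated semantics.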

(3) Consistency Score

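Because the effectiveness score tends to gather more semantic information into the subset, a consistency score is used to keep the selected sub-regions consistent with the target and to avoid collecting semantics unrelated to the target class. For this part, I follow the original paper directly: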

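The goal is to make $F\left(\sum_{\mathbf{I}^M\in S}\mathbf{I}^M\right)$ as consistent as possible with the target semantic feature $f_{s}$, which can be taken as $f_{s}=F\left(\mathbf{I}\right)$ with $F(\cdot)$ the pre-trained feature extractor, or obtained directly from the classifier's fully connected layer.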

This ensures that the selected sub-region set is aligned with the specific semantic target.
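
The consistency score itself is not written out above. A plausible form, assuming it is the cosine similarity between the feature of the selected sub-regions and $\boldsymbol{f}_s$ (which would mirror the collaboration score in the next subsection, and should be checked against the paper), is:

$s_{\mathrm{cons.}}\left(S,\boldsymbol{f}_s\right)=\frac{F\left(\sum_{\mathbf{I}^M\in S}\mathbf{I}^M\right)\cdot\boldsymbol{f}_s}{\left\|F\left(\sum_{\mathbf{I}^M\in S}\mathbf{I}^M\right)\right\|\left\|\boldsymbol{f}_s\right\|}$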

(4) Collaboration Score

Collaboration score: some elements have a weak individual effect but are important within the group.

$s_\mathrm{colla.}(S,\mathbf{I},\boldsymbol{f}_s)=1-\frac{F\left(\mathbf{I}-\sum_{\mathbf{I}^M\in S}\mathbf{I}^M\right)\cdot\boldsymbol{f}_s}{\left\|F\left(\mathbf{I}-\sum_{\mathbf{I}^M\in S}\mathbf{I}^M\right)\right\|\left\|\boldsymbol{f}_s\right\|}$

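The smaller the second term, the higher the collaboration score. That is, once the selected sub-regions $\sum_{\mathbf{I}^M\in S}\mathbf{I}^M$ are removed, the less the remaining area relates to the features of the target object, the better $\sum_{\mathbf{I}^M\in S}\mathbf{I}^M$ represents the relevant features.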

And $\sum_{\mathbf{I}^M\in S}\mathbf{I}^M$ is what represents the collective effect.
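
A rough sketch of the collaboration score follows. The feature extractor here is a deliberately trivial, hypothetical stand-in (it just flattens the pixels), and a small `eps` is added to avoid dividing by zero when nothing remains of the image; neither choice comes from the paper.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def toy_extractor(image):
    """Hypothetical stand-in for the pre-trained extractor F: flatten pixels."""
    return image.reshape(-1)

def collaboration_score(selected_regions, image, extractor):
    """1 - cos(F(I - sum of selected sub-regions), f_s), with f_s = F(I)."""
    target = extractor(image)
    remainder = image - sum(selected_regions)  # what is left after removing S
    return 1.0 - cosine_similarity(extractor(remainder), target)

image = np.array([[3.0, 0.0],
                  [0.0, 4.0]])
region_a = np.array([[3.0, 0.0], [0.0, 0.0]])  # weaker part of the "object"
region_b = np.array([[0.0, 0.0], [0.0, 4.0]])  # stronger part of the "object"

score_a = collaboration_score([region_a], image, toy_extractor)
score_b = collaboration_score([region_b], image, toy_extractor)
print(score_a, score_b)
```

Removing `region_b` leaves a remainder that resembles the full image less, so selecting it earns the higher collaboration score.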

Submodular Function

$\mathcal{F}(S)=\lambda_1s_{\mathrm{conf.}}\left(\sum_{\mathbf{I}^M\in S}\mathbf{I}^M\right)+\lambda_2s_{\mathrm{eff.}}\left(S\right)+\lambda_3s_{\mathrm{cons.}}\left(S,\boldsymbol{f}_s\right)+\lambda_4s_{\mathrm{colla.}}\left(S,\mathbf{I},\boldsymbol{f}_s\right)$

For simplicity, all the weighting hyperparameters $\lambda_1,\ldots,\lambda_4$ default to 1.

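The authors prove that $\mathcal{F}(S)$ defined this way satisfies the two conditions for a submodular function, namely monotonic non-decrease and diminishing marginal returns. See the supplementary material of the paper for the proof.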

2.3 Greedy Search Algorithm

First, collect all $m$ sub-regions to obtain the ground set $V$.

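Next, find the best subset $S$ of $V$, with $S$ initialized to the empty set. The search is greedy: at each step, add to $S$ the element $\alpha$ with the largest marginal gain in the submodular function $\mathcal{F}(S)$, and stop once $k$ elements have been selected.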

Moreover, it can be shown that if $\mathcal{F}(\cdot)$ is a submodular function, the $S$ found by greedy search and the optimal $S^*$ satisfy:

$\mathcal{F}(S)\geq\left(1-\frac1{\mathrm{e}}\right)\mathcal{F}(S^*)$
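
The greedy procedure can be sketched generically as follows. The scoring function used here is a toy coverage count over made-up data, not the paper's $\mathcal{F}$; `greedy_select`, `COVERAGE`, and `coverage` are hypothetical names.

```python
def greedy_select(ground_set, score_fn, k):
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    selected = []
    remaining = list(ground_set)
    for _ in range(min(k, len(remaining))):
        base = score_fn(selected)
        best = max(remaining, key=lambda a: score_fn(selected + [a]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data: each sub-region "covers" some feature ids.
COVERAGE = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}

def coverage(subset):
    """Toy monotone submodular score: number of distinct features covered."""
    covered = set()
    for name in subset:
        covered |= COVERAGE[name]
    return len(covered)

chosen = greedy_select(COVERAGE, coverage, k=2)
print(chosen, coverage(chosen))
```

Each step re-evaluates the score for every remaining candidate, so selecting $k$ of $m$ elements costs on the order of $k\cdot m$ score evaluations. In the attribution setting every evaluation involves forward passes through the network, which is the main cost of this kind of search.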

III. Experiments and Results

1. Experimental Setup

Datasets

Celeb-A, VGG-Face2, CUB-200-2011

Evaluation Metric

Deletion and Insertion AUC scores.

"We use Deletion and Insertion AUC scores to form the evaluation metric, which measures the decrease (or increase) in class probability as important pixels (given by the saliency map) are gradually removed (or increased) from the image."

A lower Deletion AUC is better and a higher Insertion AUC is better (both indicate a fast response).
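
The idea behind the two metrics can be sketched on a toy example. Here the "image" consists of four items, and `toy_probability` is a hypothetical stand-in for the classifier that simply sums fixed per-item weights; a real evaluation would mask pixels and query the model instead.

```python
import numpy as np

def trapezoid_area(y, x):
    """Area under a curve by the trapezoidal rule."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def insertion_deletion_auc(order, score_fn):
    """`order` ranks item indices from most to least important (a permutation
    of 0..n-1). `score_fn(kept)` returns the class probability given a boolean
    mask of the items that are visible."""
    order = np.asarray(order, dtype=int)
    n = order.size
    ins_curve, del_curve = [], []
    for step in range(n + 1):
        revealed = np.zeros(n, dtype=bool)
        revealed[order[:step]] = True
        ins_curve.append(score_fn(revealed))    # insert the top `step` items
        del_curve.append(score_fn(~revealed))   # delete the top `step` items
    x = np.linspace(0.0, 1.0, n + 1)
    return trapezoid_area(ins_curve, x), trapezoid_area(del_curve, x)

WEIGHTS = np.array([0.5, 0.3, 0.15, 0.05])  # hypothetical per-item contributions

def toy_probability(kept):
    return float(WEIGHTS[kept].sum())

good = insertion_deletion_auc([0, 1, 2, 3], toy_probability)  # correct ranking
bad = insertion_deletion_auc([3, 2, 1, 0], toy_probability)   # reversed ranking
print(good, bad)
```

A ranking that puts the truly important items first makes the insertion curve rise quickly and the deletion curve fall quickly, which is why it scores a higher Insertion AUC and a lower Deletion AUC than the reversed ranking.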

Baselines

White-box: Saliency, Grad-CAM, Grad-CAM++, Score-CAM;

Black-box: LIME, Kernel SHAP, RISE, HSIC-Attribution.

2. Faithfulness Analysis

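When combined with the proposed method, every attribution method improves;

It can also be seen that the effect of the proposed method depends on the underlying attribution algorithm: more advanced algorithms give better results, and HSIC+ours performs best.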

3. Discover the Cause of Incorrect Predictions

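Table 2 reports, for incorrect predictions, the highest confidence reached when different proportions of the image are provided (the maximum of the red line in Fig. 3). For example, 0-25% means that at most 25% of the original image is provided, in which case k = 0.25m. In every interval the attribution methods improve when combined with the proposed method, and the Insertion scores also increase, which shows that the method yields more "salient" saliency maps.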

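The bottom row, "Patch 10×10", does not rely on any attribution method; instead, each 10×10 patch is treated as one element (each sub-region contains a single 10×10 patch). This turns out to work even better.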

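Another interesting point, also visible in Fig. 3: for some incorrectly predicted samples, providing only a small part of the image as selected by this method can correct the prediction. (Revealing more of the image actually causes the prediction to fail.)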

4. Ablation Study

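In addition, the effectiveness of using a prior attribution map is verified:

Using HSIC-Attribution as the prior gives the best results.

It can also be seen that the larger the patch, the greater the effect of deleting it; the smaller the patch, the greater the effect of adding it.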
