Clustering Effect of (Linearized) Adversarial Robust Models

Daft shiner

Category: Paper Sharing. Tags: deep learning

Article link: https://blog.csdn.net/weixin_46782905/article/details/121736715

2021.12.6, paper no. 3 (NeurIPS 2021), semi-close reading
Paper: Clustering Effect of (Linearized) Adversarial Robust Models
Code: none

Contributions

To the best of our knowledge, we are the first to systematically analyze the statistical regularity of adversarially robust models (through adversarial training) compared to non-robust models (through standard training) on their linearized sub-networks.
We present an intriguing phenomenon of hierarchical clustering effect in robust models, and provide a novel yet insightful understanding of adversarial robustness. The clustering effect aligned with class hierarchy demonstrates more semantic and representative feature extraction capacity of robust models, which benefits a lot in various tasks.
Based on the observations, we propose a plugged-in hierarchical clustering training strategy to generally enhance adversarial robustness and investigate some intriguing adversarial attack findings. Besides adversarial-related study, we further explore some downstream tasks with the understanding of hierarchical clustering, e.g., domain adaptation with subpopulation shift. Experimental results show that the clustering effect and hierarchical classification learned by robust models benefit the task as well.

Methodology

The original DNN can be written as Eq. 4 of the paper, and the network with its nonlinear layers removed can be written as Eq. 5.

Algorithm 1 is the procedure for extracting the linear weight matrix.
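As a rough illustration of what "collapsing a network into one linear weight matrix" means, here is a minimal NumPy sketch. It is not the paper's Algorithm 1. It assumes a toy, bias-free, fully connected network whose nonlinearities have already been removed, so that the whole network reduces to a single matrix product. The function name `collapse_linear_layers` and the layer shapes are made up for illustration.

```python
import numpy as np

def collapse_linear_layers(weights):
    """Multiply per-layer matrices into one D_input x D_output matrix.

    Assumption: each layer computes x @ W_l with no bias, and every
    nonlinearity has been removed, so the network is x @ (W_1 @ ... @ W_L).
    """
    W = weights[0]
    for W_l in weights[1:]:
        W = W @ W_l
    return W

# Hypothetical toy network: 8 input features, two hidden layers, 4 classes.
rng = np.random.default_rng(0)
layers = [
    rng.standard_normal((8, 16)),
    rng.standard_normal((16, 16)),
    rng.standard_normal((16, 4)),
]
W = collapse_linear_layers(layers)

# Sanity check: the layer-by-layer forward pass equals one product with W.
x = rng.standard_normal((3, 8))
out_layerwise = x
for W_l in layers:
    out_layerwise = out_layerwise @ W_l
out_collapsed = x @ W
```

Under this convention, column `W[:, i]` is the weight vector associated with class $i$, which is the $W_i$ used in the correlation matrix.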

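Given the linear weight matrix $W$ above, $W$ is normalized into a $D_{output} \times D_{output}$ matrix via $C_{i,j}=\frac{W_{i}^{T}}{||W_{i}^{T}||_2} \times \frac{W_{j}}{||W_{j}||_2}$, i.e., the cosine similarity between the weight vectors of class $i$ and class $j$. The closer $C_{i,j}$ is to 1, the more correlated the weights of class $i$ and class $j$ are in the linear weight space; the further from 1, the less correlated they are.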
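The formula for $C_{i,j}$ is a cosine similarity between class weight vectors, so it can be computed for all pairs at once. The sketch below assumes $W$ is stored as a `D_input x D_output` NumPy array with one column per class. The function name `class_correlation` and the random `W` are placeholders, not taken from the paper.

```python
import numpy as np

def class_correlation(W, eps=1e-12):
    """Cosine similarity between the columns of W (D_input x D_output).

    Returns a D_output x D_output matrix C with C[i, j] equal to
    W_i^T W_j / (||W_i||_2 * ||W_j||_2).
    """
    norms = np.linalg.norm(W, axis=0, keepdims=True)  # shape (1, D_output)
    W_unit = W / np.maximum(norms, eps)               # guard against zero columns
    return W_unit.T @ W_unit

# Placeholder weights: 32 input dimensions, 10 classes.
rng = np.random.default_rng(0)
W = rng.standard_normal((32, 10))
C = class_correlation(W)
```

Each diagonal entry of `C` is 1 (a class compared with itself), and `C` is symmetric, so only the off-diagonal block pattern carries information.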
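Figure 1 visualizes this matrix. Classes that both belong to the non-animal group, or both belong to the animal group, are more similar to each other, while similarity across the two superclasses is low.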

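Figure 2 shows the correlation matrix $C$ for ResNet18 under standard training and adversarial training on different datasets. After adversarial training, $C$ clearly tends to group classes of the same superclass together. (In the paper's words: "the similar block clustering effect is observed on robust models and aligns well with class hierarchy".)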
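One simple way to put a number on the block structure described here is to compare the average similarity inside a superclass with the average similarity across superclasses. This is a hypothetical diagnostic for illustration only, not a metric from the paper. The function `block_similarity`, the toy matrix, and the superclass labels are all invented.

```python
import numpy as np

def block_similarity(C, superclass):
    """Return (mean within-superclass, mean cross-superclass) similarity.

    Diagonal entries (a class compared with itself) are excluded from
    the within-superclass mean.
    """
    superclass = np.asarray(superclass)
    same = superclass[:, None] == superclass[None, :]
    off_diag = ~np.eye(len(superclass), dtype=bool)
    intra = C[same & off_diag].mean()
    inter = C[~same].mean()
    return intra, inter

# Toy 4-class example: classes 0 and 1 share one superclass, 2 and 3 the other.
superclass = [0, 0, 1, 1]
C = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.6],
    [0.0, 0.1, 0.6, 1.0],
])
intra, inter = block_similarity(C, superclass)
```

A large gap between `intra` and `inter` corresponds to the visibly blocky matrices, while values close to each other correspond to a matrix with no clear class hierarchy.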
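Table 1 reports the accuracy of the linear models. (I do wonder whether a study at this accuracy level is really meaningful, since the models are simply not accurate enough. It feels like an ablation is needed to rule out the effect of accuracy on $C$. That said, purely linear models are genuinely hard to train, and here they also have to be adversarially trained.)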

Figure 3 shows the $C$ matrices of these linear models under standard training and adversarial training.

Building on this hierarchical clustering phenomenon, the authors design a penalty term that strengthens the clustering effect.

Figure 4 shows the effect of the feature distance matrix designed by the authors.