2021.12.6 Paper 3 (NeurIPS 2021), semi-close reading
Paper link: Clustering Effect of (Linearized) Adversarial Robust Models
Code link: none
Contributions
- To the best of our knowledge, we are the first to systematically analyze the statistical regularity of adversarially robust models (through adversarial training) compared to non-robust models (through standard training) on their linearized sub-networks
- We present an intriguing phenomenon of hierarchical clustering effect in robust models, and provide a novel yet insightful understanding of adversarial robustness. The clustering effect aligned with class hierarchy demonstrates more semantic and representative feature extraction capacity of robust models, which benefits a lot in various tasks
- Based on the observations, we propose a plugged-in hierarchical clustering training strategy to generally enhance adversarial robustness and investigate some intriguing adversarial attack findings. Besides adversarial-related study, we further explore some downstream tasks with the understanding of hierarchical clustering, e.g., domain adaption with subpopulation shift. Experimental results show that the clustering effect and hierarchical classification learned by robust model benefits the task as well.
Methodology
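The original DNN can be written as Eq. 4; the network with its nonlinear layers removed can be written as Eq. 5.
Algorithm 1 is the procedure for extracting the linear weight matrix.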
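As a rough illustration of what is left once the nonlinear layers are removed (a sketch under simplifying assumptions, not the paper's Algorithm 1): if every remaining layer is a bias-free linear map, the whole network collapses into a single matrix, which can be recovered either by multiplying the layer matrices or by feeding unit basis vectors through the network. The toy layer shapes and the helper names `matmul`, `forward`, and `extract_linear_weight` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def forward(layers, x):
    """Apply bias-free linear layers, in order, to an input vector x."""
    for w in layers:
        x = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
    return x


def extract_linear_weight(layers, d_input):
    """Recover the collapsed matrix (d_output x d_input) by probing.

    Column k of the result is the network output for the k-th unit basis vector.
    """
    columns = []
    for k in range(d_input):
        e_k = [1.0 if i == k else 0.0 for i in range(d_input)]
        columns.append(forward(layers, e_k))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


# Toy two-layer linear network (assumed shapes): 3 inputs -> 2 hidden -> 2 outputs.
layers = [
    [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]],  # first layer, 2 x 3
    [[1.0, 1.0], [2.0, 0.0]],             # second layer, 2 x 2
]
W = extract_linear_weight(layers, d_input=3)
W_product = matmul(layers[1], layers[0])  # second layer times first layer
```

Both routes give the same matrix here. Probing only needs forward passes, so it does not require each layer to be available as an explicit matrix.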
With the linear weight matrix $W$ obtained above, $W$ is normalized via $C_{i,j}=\frac{W_{i}^{T}}{\|W_{i}^{T}\|_2} \times \frac{W_{j}}{\|W_{j}\|_2}$ into a $D_{output} \times D_{output}$ matrix. The closer $C_{i,j}$ is to 1, the more correlated the weights of class $i$ and class $j$ are in the linear weight space; the further from 1, the less correlated they are.
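The normalization above is a cosine similarity between per-class weight vectors. A minimal sketch in plain Python, assuming each class's weight vector is stored as one row of a list (the function name `correlation_matrix` and the toy weights are hypothetical, not from the paper):

```python
import math


def correlation_matrix(class_weights):
    """Return C where C[i][j] is the cosine similarity of class i and class j weights."""
    norms = [math.sqrt(sum(v * v for v in w)) for w in class_weights]
    return [
        [
            sum(a * b for a, b in zip(wi, wj)) / (ni * nj)
            for wj, nj in zip(class_weights, norms)
        ]
        for wi, ni in zip(class_weights, norms)
    ]


# Hypothetical weights for 3 classes in a 2-dimensional input space.
class_weights = [
    [3.0, 4.0],   # class 0
    [6.0, 8.0],   # class 1: same direction as class 0
    [-4.0, 3.0],  # class 2: orthogonal to class 0
]
C = correlation_matrix(class_weights)
```

In this toy case the entry for classes 0 and 1 is 1 (parallel weights) and the entry for classes 0 and 2 is 0 (orthogonal weights). The result is always a square matrix with one row and one column per class, and its diagonal is 1.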
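Figure 1 visualizes this: classes that both belong to the non-animal group, or both to the animal group, are more similar to each other, while similarity across the two groups is very low.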
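Figure 2 shows the correlation matrix $C$ of ResNet18 on different datasets under standard training and adversarial training. After adversarial training, the correlation matrix clearly tends to group classes of the same superclass together. (In the paper's words: the similar block clustering effect is observed on robust models and aligns well with class hierarchy)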
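One way to put a number on such a block pattern is to compare the average correlation inside a superclass with the average across superclasses. This is only an illustrative summary statistic, not a metric taken from the paper; the function name `block_means`, the toy matrix, and the superclass labels are all hypothetical.

```python
def block_means(C, superclass):
    """Average the off-diagonal entries of C within and across superclasses."""
    within, across = [], []
    n = len(C)
    for i in range(n):
        for j in range(i + 1, n):
            same = superclass[i] == superclass[j]
            (within if same else across).append(C[i][j])
    return sum(within) / len(within), sum(across) / len(across)


# Toy 4-class correlation matrix (made-up values): two superclasses of two classes each.
C = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.6],
    [0.0, 0.1, 0.6, 1.0],
]
superclass = ["animal", "animal", "vehicle", "vehicle"]
within_mean, across_mean = block_means(C, superclass)
```

A large gap between the two averages corresponds to visible diagonal blocks in the heat map; a matrix without block structure would give similar values for both.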
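Table 1 reports the accuracy of the linear models. (I am curious whether the study is really meaningful at this accuracy, since the models themselves are not accurate enough; it feels like an ablation is needed to rule out the effect of accuracy on $C$. That said, purely linear models are genuinely hard to train, and here they also have to be adversarially trained.)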
Figure 3 shows the $C$ matrices of these linear models under standard training and adversarial training.
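Building on this hierarchical clustering phenomenon, the authors design a penalty function that strengthens the clustering effect.
Figure 4 shows the effect of the feature distance matrix designed by the authors.
The paper also runs experiments on Domain Adaption. I have not worked in that area and do not understand it well, so I leave it out rather than risk explaining it wrongly; see the paper for details.
Table 3 verifies that the proposed method can also improve adversarial robustness.
Figure 7 additionally shows the attack confusion matrix.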
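Personal summary
The paper has plenty of experiments, but it did not read as smoothly as the previous two. That may be because I am unfamiliar with Domain Adaption, and because the datasets it uses are all custom-built, so I have no intuitive feel for them. My remaining doubt is about the linear-model part: training a purely linear model directly is very far from the real setting. (This is only my own humble opinion; additions and corrections are welcome if I have misunderstood anything.)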