[CVPR 2022] HCSC: hierarchical contrastive selective coding

连理o

Published 2023-03-12 14:39:18

Article link: https://blog.csdn.net/weixin_42437114/article/details/129355037

Introduction

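Existing self-supervised contrastive learning methods fall into two families: instance-wise contrastive learning and prototypical contrastive learning. Both characterize the semantic structure of a dataset as, at most, a flat set of cluster centers, so neither can capture the hierarchical semantic structure present in image datasets.
To address this, the authors propose Hierarchical Contrastive Selective Coding (HCSC). It uses hierarchical prototypes to let the model learn the implicit hierarchical structure of the dataset and to select higher-quality positive and negative pairs, i.e., positive pairs with similar semantics and negative pairs with distinct semantics.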

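The authors build the hierarchical prototypes with a hierarchical K-means algorithm. First, K-means clustering on the training-set image features $Z$ yields the first-layer prototypes; then K-means clustering on the prototypes of layer $l - 1$ yields the prototypes of layer $l$. Prototypes in adjacent layers are connected by edges according to the cluster assignments, which forms a tree structure.
Here $M_l$ is the number of prototypes in layer $l$ and $L$ is the number of layers in the hierarchy; both are hyperparameters. The authors pretrain on ImageNet with clustering hyperparameters $L = 3$ and $(M_1,M_2,M_3)=(3000,2000,1000)$, and clusters with fewer than 10 samples are discarded.
Because the image features keep changing during training, the hierarchical prototypes have to be updated during training as well. As a trade-off between accuracy and computational cost, the authors update the hierarchical prototypes at the start of every epoch.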
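The bottom-up clustering described above can be sketched in a few lines of pure Python. This is a minimal illustration under assumptions, not the authors' implementation: the function names (`kmeans`, `hierarchical_kmeans`), the plain Lloyd iteration, the choice to apply the minimum-cluster-size filter only at the first layer, and the toy sizes `(8, 4, 2)` are all hypothetical. A real run on ImageNet features would need a GPU clustering library.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def nearest(p, centers):
    # Index of the center closest to point p.
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def kmeans(points, k, iters=20, min_size=1, seed=0):
    """Plain Lloyd's K-means; clusters with fewer than min_size members are dropped."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, min(k, len(points)))]
    for _ in range(iters):
        assign = [nearest(p, centers) for p in points]
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = mean(members)
    assign = [nearest(p, centers) for p in points]
    counts = [assign.count(j) for j in range(len(centers))]
    centers = [c for c, n in zip(centers, counts) if n >= min_size]
    # Re-assign every point to the surviving centers.
    return centers, [nearest(p, centers) for p in points]

def hierarchical_kmeans(features, sizes, min_size=1, seed=0):
    """Layer 1 clusters the features; each later layer clusters the previous prototypes.

    parents[l][i] is the index (in layer l+1) of the parent of prototype i in layer l,
    which gives the tree edges between adjacent layers.
    """
    prototypes, parents = [], []
    points, instance_assign = features, None
    for level, k in enumerate(sizes):
        # Assumption: the small-cluster filter is applied only to the first layer.
        centers, assign = kmeans(points, k, min_size=min_size if level == 0 else 1,
                                 seed=seed + level)
        prototypes.append(centers)
        if level == 0:
            instance_assign = assign
        else:
            parents.append(assign)
        points = centers
    return prototypes, parents, instance_assign

# Toy data: four 2-D blobs of 30 points each.
rng = random.Random(0)
blobs = [(0, 0), (0, 5), (5, 0), (5, 5)]
features = [[cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)]
            for cx, cy in blobs for _ in range(30)]
prototypes, parents, instance_assign = hierarchical_kmeans(features, sizes=(8, 4, 2), min_size=5)
print([len(p) for p in prototypes])
```

In a training loop this function would be called once at the start of each epoch on freshly extracted features, matching the update schedule described above.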

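In instance-wise contrastive learning, positives can be obtained by data augmentation, but existing methods cannot guarantee that the chosen negatives are truly semantically dissimilar to the anchor, which hurts contrastive learning. To address this, the authors use the hierarchical prototypes to select negatives whose semantics differ from the anchor's.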

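First, a distance from an image representation $z$ to a prototype $c\in C$ is defined, where $Z_c$ is the set of all image representations in cluster $c$ and $\epsilon=10$.
With this distance, negatives can be selected at every layer of the hierarchy. At layer $l$, the cluster closest to $z$ is denoted $c^l(z)$.
The farther a negative candidate $z_j\in\mathcal N$ is from $c^l(z)$, the higher its probability of being selected as a negative, which makes it more likely that the selected negatives are semantically dissimilar to $z$.
Given these probabilities, the layer-$l$ negative set $\mathcal N_{\text{select}}^l(z)$ is drawn by Bernoulli sampling.
Selecting negatives at all $L$ layers yields more diverse negatives; the resulting $L$ negative sample sets $\{\mathcal N_{\text{select}}^l(z)\}_{l=1}^L$ together form the final set of negatives.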
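To make the per-layer Bernoulli selection concrete, here is a small sketch. It assumes a specific form that the text above only describes qualitatively: similarity is a plain dot product, and the selection probability of a candidate $z_j$ is one minus its softmax weight on $c^l(z)$ among the layer-$l$ prototypes. The distance definition involving $Z_c$ and $\epsilon$ is not reproduced, and all function names are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def nearest_prototype(z, prototypes):
    # Index of the prototype most similar to z; stands in for c^l(z).
    return max(range(len(prototypes)), key=lambda i: dot(z, prototypes[i]))

def selection_probs(z, candidates, prototypes):
    # Assumed form: p(z_j) = 1 - softmax_i(z_j . c_i) evaluated at c^l(z).
    k = nearest_prototype(z, prototypes)
    return [1.0 - softmax([dot(zj, c) for c in prototypes])[k] for zj in candidates]

def select_negatives(z, candidates, hierarchy, rng):
    # One Bernoulli draw per candidate and per layer; returns L lists of candidate indices.
    selected = []
    for prototypes in hierarchy:
        probs = selection_probs(z, candidates, prototypes)
        selected.append([j for j, p in enumerate(probs) if rng.random() < p])
    return selected

# Toy example: a two-layer hierarchy of unit vectors in 2-D.
hierarchy = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],  # layer 1
    [[1.0, 0.0], [-1.0, 0.0]],                           # layer 2
]
z = [1.0, 0.0]
candidates = [[0.99, 0.14], [-1.0, 0.0], [0.0, 1.0]]  # near, far, orthogonal
probs = selection_probs(z, candidates, hierarchy[0])
negatives = select_negatives(z, candidates, hierarchy, random.Random(0))
print(probs, negatives)
```

Under this assumed form, the candidate pointing away from $c^l(z)$ gets a much higher selection probability than the one next to it, which is the behavior the text describes.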
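The resulting loss is called instance-wise contrastive selective coding (ICSC).
In it, $p_d$ is the data distribution, $z'$ is the positive sample obtained by data augmentation, and the temperature is $\tau=0.2$.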
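For orientation, an InfoNCE-style loss that is consistent with the symbols named here ($p_d$, $z'$, $\tau$, and the selected negative sets $\mathcal N_{\text{select}}^l(z)$) could be written as below. This is a reconstruction under assumptions (dot-product similarity, uniform averaging over the $L$ layers) and should be checked against the paper rather than read as its exact equation.

$$
\mathcal L_{\text{ICSC}} = \mathbb E_{z\sim p_d}\left[-\frac{1}{L}\sum_{l=1}^{L}\log\frac{\exp(z\cdot z'/\tau)}{\exp(z\cdot z'/\tau)+\sum_{z_j\in\mathcal N_{\text{select}}^l(z)}\exp(z\cdot z_j/\tau)}\right]
$$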

Prototypical contrastive learning is handled with a similar idea: the hierarchical prototypes are used to select negative clusters whose semantics differ from the anchor's.

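As before, a distance from an image representation $z$ to a prototype $c\in C$ is defined first, where $Z_c$ is the set of all image representations in cluster $c$.
With this distance, negative clusters can be selected at every layer of the hierarchy. At layer $l$, the prototype closest to $z$ is denoted $c^l(z)$.
The pair $(z,c^l(z))$ is treated as a positive pair, and the remaining prototypes form the candidate negative clusters $\mathcal N^l$. Each candidate prototype $c_j$ is assigned a selection probability
that depends on $\text{Parent}(c^l(z))$, the parent node of $c^l(z)$; all prototypes in the top layer of the hierarchy have selection probability 1. Given these probabilities, the set of negative clusters is drawn by Bernoulli sampling.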
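The parent-based selection of negative clusters can be sketched as follows. The exact probability is not given in the text, so this example assumes one plausible form: for a candidate $c_j$ in layer $l$, the probability is one minus its softmax weight on $\text{Parent}(c^l(z))$ among the prototypes of layer $l+1$, with dot-product similarity. The `parents` layout and every function name are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def negative_cluster_probs(pos_idx, layer, hierarchy, parents):
    """Selection probability for every prototype in `layer` except the positive one.

    hierarchy[l] is the list of prototypes in layer l (0-based);
    parents[l][i] is the index in layer l+1 of the parent of prototype i in layer l.
    """
    protos = hierarchy[layer]
    candidates = [j for j in range(len(protos)) if j != pos_idx]
    if layer == len(hierarchy) - 1:
        # Top layer: every candidate is selected with probability 1.
        return {j: 1.0 for j in candidates}
    upper = hierarchy[layer + 1]
    parent = parents[layer][pos_idx]
    # Assumed form: 1 - softmax weight of c_j on Parent(c^l(z)).
    return {j: 1.0 - softmax([dot(protos[j], u) for u in upper])[parent]
            for j in candidates}

def sample_negative_clusters(probs, rng):
    # Independent Bernoulli draw for each candidate prototype.
    return [j for j, p in probs.items() if rng.random() < p]

# Toy two-layer hierarchy: prototypes 0 and 1 share a parent, as do 2 and 3.
hierarchy = [
    [[1.0, 0.0], [0.8, 0.6], [-1.0, 0.0], [-0.8, -0.6]],
    [[0.95, 0.32], [-0.95, -0.32]],
]
parents = [[0, 0, 1, 1]]

bottom_probs = negative_cluster_probs(0, 0, hierarchy, parents)
top_probs = negative_cluster_probs(0, 1, hierarchy, parents)
top_selected = sample_negative_clusters(top_probs, random.Random(0))
print(bottom_probs, top_probs, top_selected)
```

With this form, a sibling of the positive prototype (same parent, hence related coarse semantics) is rarely chosen as a negative, while prototypes under a different parent are chosen with high probability.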
The resulting loss is called prototypical contrastive selective coding (PCSC).
In it, $\tau_c$ is a cluster-specific temperature parameter that can be determined adaptively from clustering statistics.
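The sketch below shows how a prototype-level contrastive loss with cluster-specific temperatures might be evaluated for a single anchor. Several parts are assumptions: the loss is taken to be InfoNCE-style with the positive prototype $c^l(z)$ against the selected negative prototypes, averaged over layers, and `cluster_temperature` uses one common concentration-style statistic (mean distance to the center divided by $\log(|Z_c| + \epsilon)$). Whether HCSC uses exactly this statistic is not confirmed by the text, and all names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cluster_temperature(members, center, eps=10.0):
    # Assumed statistic: tight or large clusters get a smaller temperature.
    # Returns 0 if every member coincides with the center, which callers must avoid.
    n = len(members)
    total = sum(math.dist(m, center) for m in members)
    return total / (n * math.log(n + eps))

def pcsc_layer_loss(z, pos, pos_temp, negatives, neg_temps):
    # -log( exp(z.c+/t+) / (exp(z.c+/t+) + sum_j exp(z.c_j/t_j)) ), computed stably.
    pos_logit = dot(z, pos) / pos_temp
    logits = [pos_logit] + [dot(z, c) / t for c, t in zip(negatives, neg_temps)]
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - pos_logit

def pcsc_loss(z, layers):
    # Average the per-layer losses; each entry is (pos, pos_temp, negatives, neg_temps).
    return sum(pcsc_layer_loss(z, *layer) for layer in layers) / len(layers)

temp = cluster_temperature([[0.0, 0.0], [2.0, 0.0]], [1.0, 0.0])

z = [1.0, 0.0]
far_negative = pcsc_loss(z, [([0.9, 0.1], 0.5, [[-1.0, 0.0]], [0.5])])
close_negative = pcsc_loss(z, [([0.9, 0.1], 0.5, [[0.8, 0.2]], [0.5])])
no_negative = pcsc_loss(z, [([0.9, 0.1], 0.5, [], [])])
print(temp, far_negative, close_negative, no_negative)
```

A negative prototype close to the anchor produces a larger loss than a distant one, and the loss vanishes when no negative cluster is selected.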

Instance-wise contrastive learning mainly exploits local instance-level structures, whereas prototypical contrastive learning mainly builds global semantic structures.
The authors therefore use both loss functions together.