Introduction
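- The authors propose Hyperbolic Contrastive Learning (HCL), which performs self-supervised contrastive learning in hyperbolic space, and further extend HCL to Supervised Hyperbolic Contrastive Learning. In addition, the authors propose Robust Hyperbolic Contrastive Learning (RHCL), which combines HCL with adversarial training.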
Method
Hyperbolic Contrastive Learning
- In the HCL loss, $i\in I=\{1,\dots,2N\}$ indexes the samples in the multiviewed batch (i.e., the batch after data augmentation), $j(i)$ is the index of the other augmented image obtained from the same source image as $i$, $A(i)=I\setminus\{i\}$, $z_i=\exp_0^c(\mathrm{Proj}(\mathrm{Enc}(\tilde x_i)))$, $\tilde x_i$ is the augmented image, and $D$ is the distance between two points in hyperbolic space.
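The definitions above can be turned into a small numerical sketch. It assumes the Poincaré ball model with curvature $-c$, an InfoNCE-style loss that uses $-D/\tau$ as the similarity, and a mean over the batch. The temperature `tau`, the helper names, and the toy feature vectors are illustrative assumptions, not taken from the paper.

```python
import math

def exp_map0(v, c):
    """Exponential map at the origin of the Poincare ball (curvature -c)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]

def poincare_dist(x, y, c):
    """Geodesic distance D(x, y) on the Poincare ball (curvature -c)."""
    sq = lambda u: sum(a * a for a in u)
    diff = sq([a - b for a, b in zip(x, y)])
    denom = (1.0 - c * sq(x)) * (1.0 - c * sq(y))
    return math.acosh(1.0 + 2.0 * c * diff / denom) / math.sqrt(c)

def hcl_loss(z, partner, c, tau=0.5):
    """Assumed form: mean_i of
    -log( exp(-D(z_i, z_j(i))/tau) / sum_{a in A(i)} exp(-D(z_i, z_a)/tau) ),
    where partner[i] plays the role of j(i) and A(i) is every index except i.
    """
    total = 0.0
    for i, zi in enumerate(z):
        logits = {a: -poincare_dist(zi, za, c) / tau
                  for a, za in enumerate(z) if a != i}
        log_denom = math.log(sum(math.exp(l) for l in logits.values()))
        total -= logits[partner[i]] - log_denom
    return total / len(z)

# Toy batch: indices 0/1 are two views of one image, 2/3 of another.
# The vectors stand in for Proj(Enc(x)) and are made up for illustration.
c = 0.1
feats = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]]
z = [exp_map0(v, c) for v in feats]
loss = hcl_loss(z, partner=[1, 0, 3, 2], c=c)
```

Pairing each view with its true partner gives a lower loss than pairing it with a view of the other image, which is the behaviour the contrastive objective is meant to reward.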
Supervised Hyperbolic Contrastive Learning
Adversarial Robustness of HCL
- Robust Contrastive Learning (RoCL) combines self-supervised learning with adversarial training. In its construction of the adversarial example $\tilde x^{adv}$, $\tilde x$ is the augmented anchor point, $\tilde x^+$ and $\tilde x^-$ are the positive and negative samples, $B(\tilde x,\epsilon)$ is the $\ell_\infty$ norm-ball of radius $\epsilon$ around $\tilde x$, and $\Pi$ is the projection function onto that norm-ball. The loss function is
$$\mathcal{L}^{\text{self}}\left(\tilde{\mathrm{x}},\left\{\tilde{\mathrm{x}}^{+}, \tilde{\mathrm{x}}^{adv}\right\},\left\{\tilde{\mathrm{x}}^{-}\right\}\right)+\lambda\,\mathcal{L}^{\text{self}}\left(\tilde{\mathrm{x}}^{adv}, \tilde{\mathrm{x}}^{+},\left\{\tilde{\mathrm{x}}^{-}\right\}\right)$$
- Correspondingly, the Robust Hyperbolic Contrastive loss is
$$\mathcal{L}_{hyp}^{\text{self}}\left(\tilde{\mathrm{x}},\left\{\tilde{\mathrm{x}}^{+}, \tilde{\mathrm{x}}^{adv}\right\},\left\{\tilde{\mathrm{x}}^{-}\right\}\right)+\lambda\,\mathcal{L}_{hyp}^{\text{self}}\left(\tilde{\mathrm{x}}^{adv}, \tilde{\mathrm{x}}^{+},\left\{\tilde{\mathrm{x}}^{-}\right\}\right)$$
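The pieces of this section can be sketched in plain Python: the projection $\Pi$ onto the $\ell_\infty$ norm-ball $B(\tilde x,\epsilon)$, one attack step, and the two-term robust loss. The signed-gradient ascent step is the usual PGD form and is an assumption here, since the notes do not reproduce the exact update rule. `loss_fn`, `alpha`, and all function names are hypothetical placeholders.

```python
def project_linf(x_adv, x, eps):
    """Pi: project x_adv onto the l-infinity ball B(x, eps).

    For the l-infinity norm this is a per-coordinate clip into
    [x_i - eps, x_i + eps].
    """
    return [min(max(a, xi - eps), xi + eps) for a, xi in zip(x_adv, x)]

def signed_step(x_adv, grad, x, eps, alpha):
    """One signed-gradient ascent step followed by the projection.

    Assumed PGD-style update; grad is the gradient of the contrastive
    loss with respect to x_adv, computed elsewhere.
    """
    sign = lambda g: (g > 0) - (g < 0)
    stepped = [a + alpha * sign(g) for a, g in zip(x_adv, grad)]
    return project_linf(stepped, x, eps)

def robust_loss(loss_fn, x, x_pos, x_adv, x_negs, lam):
    """L(x, {x+, x_adv}, {x-}) + lam * L(x_adv, x+, {x-}).

    loss_fn(anchor, positives, negatives) is a placeholder for either
    the Euclidean or the hyperbolic self-supervised contrastive loss.
    """
    return (loss_fn(x, [x_pos, x_adv], x_negs)
            + lam * loss_fn(x_adv, [x_pos], x_negs))
```

Passing the loss in as a callable mirrors how the two formulas above differ only in whether $\mathcal{L}^{\text{self}}$ or $\mathcal{L}_{hyp}^{\text{self}}$ is used.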
Experiments and Results
- Self-supervised Learning Results. When projecting the data from Euclidean space to hyperbolic space, we set the curvature to $c=0.1$, except for CIFAR-10, where $c=0.6$.
- Supervised Classification. We tune the curvature $c$ over the values 0.1 and 0.2.
- Adversarial Robustness. $\mathcal A_{nat}$ is the accuracy on clean images and $\ell_\infty$ denotes the adversarial attack. (The metric in the $\ell_\infty$ column is the robust accuracy (29.94) against the target $\ell_\infty$ attacks.)
- Ablation Studies
(1) Curvature
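(2) Normalization. In both SimCLR and SCL, the projector is followed by a normalization layer that projects the vectors onto the unit sphere. The SCL authors point out that this normalization layer helps the model perform hard-example mining implicitly and plays a very important role. The authors find in their experiments that normalizing before the exponential map is essential (The accuracy of CIFAR-10 drops from 87.98 to 85.37 without normalization. The performance of CIFAR-100 drops from 55.89 to 46.44.)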
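A small sketch of what normalizing before the exponential map does geometrically, assuming the Poincaré ball model with curvature $-c$. After L2 normalization every vector has norm 1, so every embedding lands at the same radius $\tanh(\sqrt{c})/\sqrt{c}$ inside the ball. Without normalization, the radius depends on the raw feature norm. The helper names and the sample vectors are illustrative assumptions.

```python
import math

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def l2_normalize(v):
    """Project v onto the unit sphere."""
    n = l2_norm(v)
    return [x / n for x in v]

def exp_map0(v, c):
    """Exponential map at the origin of the Poincare ball (curvature -c)."""
    n = l2_norm(v)
    if n == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * n) / (math.sqrt(c) * n)
    return [scale * x for x in v]

c = 0.1
raw = [[3.0, 4.0], [0.1, -0.2], [-50.0, 7.0]]  # made-up projector outputs

# With normalization: all embeddings share one radius inside the ball.
radii_norm = [l2_norm(exp_map0(l2_normalize(v), c)) for v in raw]

# Without normalization: the radius varies with the raw feature norm.
radii_raw = [l2_norm(exp_map0(v, c)) for v in raw]
```

This only illustrates the geometry. It does not by itself explain the accuracy gap reported above.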