Method

two-view contrastive learning

$$\mathcal{L}_{contrast}=-\mathbb{E}_{S}\left[\log\frac{h_{\theta}(x)}{h_{\theta}(x)+\sum_{i=1}^{k}h_{\theta}(y_{i})}\right]$$

$$\mathcal{L}_{contrast}^{V_{1},V_{2}}=-\mathbb{E}_{\{v_{1}^{1},v_{2}^{1},v_{2}^{2},\ldots,v_{2}^{k+1}\}}\left[\log\frac{h_{\theta}(\{v_{1}^{1},v_{2}^{1}\})}{\sum_{j=1}^{k+1}h_{\theta}(\{v_{1}^{1},v_{2}^{j}\})}\right]$$

Critic $h_{\theta}(\cdot)$

$h_{\theta}(\cdot)$ is a neural network. It uses two encoders, $f_{\theta_{1}}(\cdot)$ and $f_{\theta_{2}}(\cdot)$, to encode the input samples $v_{1}$ and $v_{2}$ respectively, and then computes the cosine similarity of the resulting representations:
$$h_{\theta}(\{v_{1},v_{2}\})=\exp\left(\frac{f_{\theta_{1}}(v_{1})\cdot f_{\theta_{2}}(v_{2})}{\|f_{\theta_{1}}(v_{1})\|\cdot\|f_{\theta_{2}}(v_{2})\|}\cdot\frac{1}{\tau}\right)$$
$\tau$ is a hyperparameter that adjusts the dynamic range of the scores.
$\mathcal{L}_{contrast}^{V_{1},V_{2}}$ treats $V_{1}$ as the anchor and enumerates over $V_{2}$. Symmetrically, $\mathcal{L}_{contrast}^{V_{2},V_{1}}$ treats $V_{2}$ as the anchor and enumerates over $V_{1}$. The two-view loss is the sum of the two:
$$\mathcal{L}(V_{1},V_{2})=\mathcal{L}_{contrast}^{V_{1},V_{2}}+\mathcal{L}_{contrast}^{V_{2},V_{1}}$$
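The critic and the symmetric two-view loss can be sketched in plain Python as follows. This is a minimal illustration, not the original implementation: it assumes the encoders have already been applied, so the inputs are lists of representation vectors, and it assumes the negatives for each anchor are simply the other samples in the same batch. The function names and the default `tau=0.07` are hypothetical choices made for the example.

```python
import math

def critic(z1, z2, tau=0.07):
    """h({v1, v2}) = exp(cosine_similarity(z1, z2) / tau) on encoded vectors."""
    dot = sum(a * b for a, b in zip(z1, z2))
    norm1 = math.sqrt(sum(a * a for a in z1))
    norm2 = math.sqrt(sum(b * b for b in z2))
    return math.exp(dot / (norm1 * norm2) / tau)

def contrast_loss(anchors, others, tau=0.07):
    """Batch mean of -log( h(a_i, o_i) / sum_j h(a_i, o_j) ).

    Assumption: others[i] is the positive for anchors[i], and every other
    row of `others` serves as a negative for that anchor.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        scores = [critic(anchor, other, tau) for other in others]
        total += -math.log(scores[i] / sum(scores))
    return total / len(anchors)

def two_view_loss(z_v1, z_v2, tau=0.07):
    """L(V1, V2): V1 as anchor plus V2 as anchor."""
    return contrast_loss(z_v1, z_v2, tau) + contrast_loss(z_v2, z_v1, tau)
```

When the two views of each sample have well-aligned representations, the loss is close to zero; when positives are mismatched, it grows roughly in proportion to $1/\tau$.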

Connection to mutual information (MI)

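Minimizing the contrastive loss maximizes a lower bound on the mutual information between the representations $z_{i}$ and $z_{j}$ of the two views, where $k$ is the number of negative samples:
$$I(z_{i};z_{j})\geq\log(k)-\mathcal{L}_{contrast}$$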
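To make the bound concrete, the short sketch below evaluates $\log(k)-\mathcal{L}_{contrast}$ for a given loss value. The function name and the numbers in the usage lines are hypothetical and chosen only for illustration; the result is in nats because the natural logarithm is used.

```python
import math

def mi_lower_bound(contrast_loss, k):
    """Lower bound on I(z_i; z_j) in nats: log(k) - L_contrast."""
    return math.log(k) - contrast_loss

# Hypothetical values: 4096 negatives and a measured loss of 6.0.
bound = mi_lower_bound(6.0, 4096)
print(f"I(z_i; z_j) >= {bound:.4f} nats")
```

Because the contrastive loss is non-negative, the bound can never exceed $\log(k)$, which is one reason a larger number of negatives allows a tighter estimate.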

Extending contrastive learning to more than two views

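In core view mode, one view $V_{1}$ is chosen as the core and contrasted with each of the other views. With four views this gives three pairwise objectives:
$$\mathcal{L}_{C}=\sum_{j=2}^{4}\mathcal{L}(V_{1},V_{j})$$

In full graph mode, every pair of views is contrasted, which gives six pairwise objectives for four views:
$$\mathcal{L}_{F}=\sum_{1\leq i<j\leq 4}\mathcal{L}(V_{i},V_{j})$$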
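The difference between $\mathcal{L}_{C}$ and the all-pairs sum $\mathcal{L}_{F}$ is only in which view pairs are enumerated. The sketch below lists those pairs, assuming views are numbered from 1 and that the sum in $\mathcal{L}_{F}$ ranges over all pairs with $1\leq i<j\leq 4$. The helper names and the `pair_loss` callback are hypothetical; `pair_loss(i, j)` stands in for the two-view loss $\mathcal{L}(V_{i},V_{j})$.

```python
from itertools import combinations

def core_view_pairs(num_views, core=1):
    """Pairs (core, j) for every other view j, as in L_C."""
    return [(core, j) for j in range(1, num_views + 1) if j != core]

def full_graph_pairs(num_views):
    """All pairs (i, j) with 1 <= i < j <= num_views, as in L_F."""
    return list(combinations(range(1, num_views + 1), 2))

def multiview_loss(pairs, pair_loss):
    """Sum the two-view loss over the chosen view pairs."""
    return sum(pair_loss(i, j) for i, j in pairs)

print(core_view_pairs(4))   # 3 pairs
print(full_graph_pairs(4))  # 6 pairs
```

The number of pairs grows linearly with the number of views in the first case and quadratically in the second, so the all-pairs objective costs more per step.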

Memory bank

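The memory bank is a buffer that stores intermediate latent representations. When we retrieve $k$ negative samples, we read them from the memory bank instead of recomputing their representations. The memory bank is updated dynamically during training: each time a new representation of a sample is computed, its stored entry is updated as a moving average:
$$rep=(1-momentum)\cdot old\_rep+momentum\cdot new\_rep$$
momentum is typically set to 0.5. The advantage of the memory bank is that representations for a large number of negative samples are available quickly; the drawback is that the stored representations may be slightly stale.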
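A minimal memory bank with this moving-average update could look like the sketch below. It is an illustration under assumptions, not a reference implementation: the class name, the random initialization of the stored entries, and uniform sampling of negatives are all hypothetical choices.

```python
import random

class MemoryBank:
    """Stores one latent representation per training sample."""

    def __init__(self, num_samples, dim, momentum=0.5, seed=0):
        rng = random.Random(seed)
        self.momentum = momentum
        # Assumption: entries start as random vectors until first updated.
        self.reps = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                     for _ in range(num_samples)]

    def update(self, index, new_rep):
        """rep = (1 - momentum) * old_rep + momentum * new_rep."""
        m = self.momentum
        self.reps[index] = [(1 - m) * old + m * new
                            for old, new in zip(self.reps[index], new_rep)]

    def sample_negatives(self, k, exclude, rng=random):
        """Return k stored representations, never the anchor's own entry."""
        candidates = [i for i in range(len(self.reps)) if i != exclude]
        return [self.reps[i] for i in rng.sample(candidates, k)]
```

Reading negatives is just a lookup, so no encoder forward pass is needed for them; the cost is that an entry reflects the encoder as it was when that sample was last seen.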
