Learning Robust Representations via Multi-View Information Bottleneck

Paper: Learning Robust Representations via Multi-View Information Bottleneck

Summary

  • The paper treats the information shared by two views as the useful part of a representation, and the view-specific information the two views do not share as superfluous. By letting the two views learn from each other, it obtains representations that are both rich in label information and robust (see the decomposition sketch below).
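
    As a sketch of this decomposition (assuming z_1 is computed from v_1 alone, so that I(z_1; v_2 | v_1) = 0), the information a representation carries about its own view splits into a shared, predictive part and a superfluous part:

    I(z_1; v_1) = I(z_1; v_2) + I(z_1; v_1 \mid v_2)

    Maximizing the first term while minimizing the second yields the two losses in the Method section.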

Problem Statement

  • The paper builds on the information bottleneck principle to construct a loss function under which the learned representation retains more label information while remaining robust to view-specific nuisances.

Method

  • A theoretical derivation yields two loss functions: one learns the representation by maximizing the information shared between the views, and one removes superfluous, view-specific information.
    (figure omitted: the two loss functions; the combined objective is reconstructed below from the training code)
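
Consistent with the _compute_loss implementation in the Code section, the averaged objective that the trainer minimizes can be written as follows (a sketch; beta is annealed by the scheduler):

    L = -I(z_1; z_2) + \beta \, D_{\mathrm{SKL}}\left( p(z_1 \mid v_1) \,\|\, p(z_2 \mid v_2) \right)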

Evaluation

  • Multi-view datasets: the two views are extracted directly from the data.
  • Single-view datasets: two views are constructed via data augmentation, and learning then proceeds on the augmented pair (a minimal augmentation sketch follows).
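
A minimal sketch of the single-view case, assuming torchvision is available; the specific transforms here are illustrative, not the paper's exact augmentation pipeline:

import torch
from torchvision import transforms

# Illustrative stochastic augmentation pipeline (an assumption, not from the paper)
augment = transforms.Compose([
    transforms.RandomResizedCrop(28, scale=(0.8, 1.0)),
    transforms.RandomRotation(15),
])

def make_views(x):
    # Apply the same stochastic augmentation twice to obtain two correlated views
    return augment(x), augment(x)

x = torch.rand(16, 1, 28, 28)  # dummy batch of single-view images
v1, v2 = make_views(x)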

Conclusion

  • By simultaneously increasing label information and reducing superfluous information, better representations can be learned.

Notes

  • Information plane: the plane whose axes are I(x; z) and I(z; y), used to visualize the trade-off between compressing the input and preserving label information under the information bottleneck. (figure omitted)

References

  • Learning Representations by Maximizing Mutual Information Across Views.
  • Learning Deep Representations by Mutual Information Estimation and Maximization.
  • How Do Humans Sketch Objects?
  • Contrastive Multiview Coding.
  • On Mutual Information Maximization for Representation Learning.
  • Multi-View Learning Overview: Recent Progress and New Challenges.

Code

  • Mutual information estimation between two variables:
import torch
import torch.nn as nn
from torch.nn.functional import softplus

# Auxiliary network for mutual information estimation
class MIEstimator(nn.Module):
    def __init__(self, size1, size2):
        super(MIEstimator, self).__init__()

        # Vanilla MLP scoring joint (positive) vs. shuffled (negative) pairs
        self.net = nn.Sequential(
            nn.Linear(size1 + size2, 1024),
            nn.ReLU(True),
            nn.Linear(1024, 1024),
            nn.ReLU(True),
            nn.Linear(1024, 1),
        )

    # Returns the JSD-based gradient estimator and an energy-based MI estimate
    def forward(self, x1, x2):
        pos = self.net(torch.cat([x1, x2], 1))  # Positive samples: aligned pairs
        neg = self.net(torch.cat([torch.roll(x1, 1, 0), x2], 1))  # Negative samples: x1 rolled by one within the batch
        return -softplus(-pos).mean() - softplus(neg).mean(), pos.mean() - neg.exp().mean() + 1
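
A quick usage sketch (the batch size and feature dimensions are illustrative):

z1 = torch.randn(128, 64)  # batch of representations from view 1
z2 = torch.randn(128, 64)  # batch of representations from view 2

estimator = MIEstimator(size1=64, size2=64)
mi_gradient, mi_estimation = estimator(z1, z2)
# Maximize mi_gradient during training; log mi_estimation as the MI estimate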
  • The multi-view information bottleneck (MIB) representation-learning model:
from training.multiview_infomax import MVInfoMaxTrainer
from utils.schedulers import ExponentialScheduler
###############
# MIB Trainer #
###############
class MIBTrainer(MVInfoMaxTrainer):
    def __init__(self, beta_start_value=1e-3, beta_end_value=1,
                 beta_n_iterations=100000, beta_start_iteration=50000, **params):
        # The network architectures and initialization procedure are analogous to Multi-View InfoMax
        super(MIBTrainer, self).__init__(**params)

        # Definition of the scheduler to update the value of the regularization coefficient beta over time
        self.beta_scheduler = ExponentialScheduler(start_value=beta_start_value, end_value=beta_end_value,
                                                   n_iterations=beta_n_iterations, start_iteration=beta_start_iteration)

    def _compute_loss(self, data):
        # Read the two views v1 and v2 and ignore the label y
        v1, v2, _ = data

        # Encode a batch of data
        p_z1_given_v1 = self.encoder_v1(v1)
        p_z2_given_v2 = self.encoder_v2(v2)

        # Sample from the posteriors with reparametrization
        z1 = p_z1_given_v1.rsample()
        z2 = p_z2_given_v2.rsample()

        # Mutual information estimation
        mi_gradient, mi_estimation = self.mi_estimator(z1, z2)
        mi_gradient = mi_gradient.mean()
        mi_estimation = mi_estimation.mean()

        # Symmetrized Kullback-Leibler divergence
        kl_1_2 = p_z1_given_v1.log_prob(z1) - p_z2_given_v2.log_prob(z1)
        kl_2_1 = p_z2_given_v2.log_prob(z2) - p_z1_given_v1.log_prob(z2)
        skl = (kl_1_2 + kl_2_1).mean() / 2.

        # Update the value of beta according to the policy
        beta = self.beta_scheduler(self.iterations)

        # Logging the components
        self._add_loss_item('loss/I_z1_z2', mi_estimation.item())
        self._add_loss_item('loss/SKL_z1_z2', skl.item())
        self._add_loss_item('loss/beta', beta)

        # Computing the loss function
        loss = - mi_gradient + beta * skl

        return loss
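
For concreteness, a self-contained sketch of the symmetrized KL term above, assuming each encoder outputs a diagonal-Gaussian posterior (the shapes and parameters here are illustrative):

import torch
from torch.distributions import Normal, Independent

# Hypothetical diagonal-Gaussian posteriors: batch of 128, latent size 64
p_z1_given_v1 = Independent(Normal(torch.zeros(128, 64), torch.ones(128, 64)), 1)
p_z2_given_v2 = Independent(Normal(torch.randn(128, 64), torch.ones(128, 64)), 1)

z1 = p_z1_given_v1.rsample()  # reparametrized samples
z2 = p_z2_given_v2.rsample()

# Monte Carlo estimate of the symmetrized KL divergence, as in _compute_loss
kl_1_2 = p_z1_given_v1.log_prob(z1) - p_z2_given_v2.log_prob(z1)
kl_2_1 = p_z2_given_v2.log_prob(z2) - p_z1_given_v1.log_prob(z2)
skl = (kl_1_2 + kl_2_1).mean() / 2.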