IMPROVING THE GENERALIZATION OF ADVERSARIAL TRAINING WITH DOMAIN ADAPTATION: Paper Notes

3 Adversarial training with domain adaptation

In this work, instead of focusing on a better sampling strategy to obtain representative adversarial data from the adversarial domain, we are especially concerned with the problem of how to train with clean data and adversarial examples from the efficient FGSM, so that the adversarially trained model is strong in generalization for different adversaries and has a low computational cost during the training.

We propose an Adversarial Training with Domain Adaptation (ATDA) method to defend against adversarial attacks, and expect the learned models to generalize well across various adversarial examples. Our motivation is to treat adversarial training on FGSM as a domain adaptation task with a limited number of target domain samples, where the target domain denotes the adversarial domain. We combine standard adversarial training with a domain adaptor, which minimizes the domain gap between clean examples and adversarial examples. In this way, our adversarially trained model is not only effective on adversarial examples crafted by FGSM but also generalizes well to other adversaries.
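The paragraph above combines two ingredients: the usual supervised losses on clean and FGSM examples, plus a domain-adaptation term that penalizes the gap between the two domains. A minimal sketch of how such a combined objective might look (the trade-off weight `lam` is a hypothetical parameter for illustration, not specified in this excerpt):

```python
def atda_objective(ce_clean, ce_adv, domain_gap, lam=1.0):
    """Sketch of a combined ATDA-style objective: cross-entropy on
    clean examples, cross-entropy on FGSM adversarial examples, and
    a weighted penalty on the clean-vs-adversarial domain gap.
    `lam` is a hypothetical trade-off weight (an assumption here)."""
    return ce_clean + ce_adv + lam * domain_gap
```

In a training loop, `domain_gap` would be computed from the logits of the current batch (e.g., the moment-alignment loss described in Section 3.1), so minimizing the total objective simultaneously fits both domains and pulls their representations together.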

3.1 Domain adaptation on logit space
3.1.1 Unsupervised domain adaptation

Suppose we draw clean training examples $\{x_i\}$ ($x_i \in \mathbb{R}^d$) with labels $\{y_i\}$ from the clean data domain $\mathcal{D}$; the corresponding adversarial examples $\{x_i^{adv}\}$ ($x_i^{adv} \in \mathbb{R}^d$) come from the adversarial data domain $\mathcal{A}$. The adversarial examples are obtained by sampling $(x_i, y_{true})$ from $\mathcal{D}$, computing small perturbations on $x_i$ to generate adversarial examples, and outputting $(x_i^{adv}, y_{true})$.
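Since the paper generates its adversarial training examples with FGSM, the perturbation step above can be sketched concretely. FGSM moves each input dimension by a fixed budget $\epsilon$ in the direction of the sign of the input gradient of the loss. The toy logistic model below is an illustration (the weights and $\epsilon$ are made-up values), chosen because its input gradient has a simple closed form:

```python
import numpy as np

def fgsm_perturb(x, grad, eps=0.1):
    """FGSM: shift every input dimension by eps in the direction
    that increases the loss (sign of the input gradient), then
    clip back to a valid input range."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)

# Toy example: logistic model p = sigmoid(w.x) with cross-entropy
# loss for true label y; the input gradient is (p - y) * w.
w = np.array([0.8, -0.5, 0.3])
x = np.array([0.5, 0.5, 0.5])
y = 1.0
p = 1.0 / (1.0 + np.exp(-w @ x))
grad = (p - y) * w            # dL/dx for sigmoid + cross-entropy
x_adv = fgsm_perturb(x, grad, eps=0.1)
```

The resulting $x^{adv}$ lies at $L_\infty$ distance exactly $\epsilon$ from $x$ (unless clipping intervenes) and lowers the model's confidence in the true label, which is the sense in which the pair $(x_i^{adv}, y_{true})$ belongs to the adversarial domain $\mathcal{A}$.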

It is known that there is a large shift between the distributions of clean data and adversarial data in the high-level representation space. Assume that in the logit space, data from either the clean domain or the adversarial domain follow a multivariate normal distribution, i.e., $\mathcal{D} \sim \mathcal{N}(\mu_{\mathcal{D}}, \Sigma_{\mathcal{D}})$, $\mathcal{A} \sim \mathcal{N}(\mu_{\mathcal{A}}, \Sigma_{\mathcal{A}})$. Our goal is to learn a logit representation that minimizes this shift by aligning the covariance matrices and mean vectors of the clean distribution and the adversarial distribution.
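Aligning the first two moments of the two logit distributions can be sketched as a CORAL-style covariance term plus a mean-matching term. The exact norms and normalization constants below are illustrative assumptions, not necessarily the paper's formulation:

```python
import numpy as np

def domain_adaptation_loss(logits_clean, logits_adv):
    """Penalize the gap between clean and adversarial logit
    distributions by matching their mean vectors and covariance
    matrices (a CORAL-like moment-alignment sketch).
    Both inputs have shape (batch, num_classes)."""
    mu_d = logits_clean.mean(axis=0)            # clean-domain mean
    mu_a = logits_adv.mean(axis=0)              # adversarial-domain mean
    cov_d = np.cov(logits_clean, rowvar=False)  # clean-domain covariance
    cov_a = np.cov(logits_adv, rowvar=False)    # adversarial covariance
    k = logits_clean.shape[1]
    l_cov = np.abs(cov_d - cov_a).sum() / (k * k)  # covariance alignment
    l_mean = np.abs(mu_d - mu_a).sum() / k         # mean alignment
    return l_cov + l_mean
```

The loss is zero exactly when the two batches share the same empirical mean and covariance, so driving it down during training pulls the adversarial logits toward the clean-domain distribution.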
