3 Adversarial training with domain adaptation
In this work, instead of focusing on a better sampling strategy to obtain representative adversarial data from the adversarial domain, we are concerned with how to train on clean data together with adversarial examples produced by the efficient FGSM, so that the adversarially trained model generalizes well to different adversaries while keeping the training cost low.
We propose an Adversarial Training with Domain Adaptation (ATDA) method to defend against adversarial attacks, and expect the learned models to generalize well to various adversarial examples. Our motivation is to treat adversarial training on FGSM as a domain adaptation task with a limited number of target domain samples, where the target domain denotes the adversarial domain. We combine standard adversarial training with a domain adaptor, which minimizes the domain gap between clean examples and adversarial examples. In this way, our adversarially trained model is not only effective on adversarial examples crafted by FGSM but also generalizes well to other adversaries.
3.1 Domain adaptation on logit space
3.1.1 Unsupervised domain adaptation
Suppose we are given clean training examples $\{x_i\}$ ($x_i \in \mathbb{R}^d$) with labels $\{y_i\}$ from the clean data domain $\mathcal{D}$, and the corresponding adversarial examples $\{x_i^{adv}\}$ ($x_i^{adv} \in \mathbb{R}^d$) from the adversarial data domain $\mathcal{A}$. The adversarial examples are obtained by sampling $(x_i, y_{true})$ from $\mathcal{D}$, computing a small perturbation on $x_i$ to generate the adversarial example, and outputting $(x_i^{adv}, y_{true})$.
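As a concrete illustration of this sampling-and-perturbation procedure, the sketch below crafts an FGSM example $x^{adv} = x + \epsilon \cdot \mathrm{sign}(\nabla_x L(x, y_{true}))$ for a binary logistic classifier, where the input gradient can be written in closed form. The logistic model and its weights are an assumption made only to keep the example self-contained; the paper applies FGSM to deep networks via backpropagation.

```python
import numpy as np

def fgsm(x, y, w, eps):
    """Craft an FGSM adversarial example for a binary logistic model
    (illustrative stand-in for a deep network).

    Loss: L(x, y) = -log sigmoid(y * w.x) with y in {-1, +1}.
    FGSM: x_adv = x + eps * sign(dL/dx).
    """
    margin = y * np.dot(w, x)
    # Closed-form input gradient: dL/dx = -y * sigmoid(-margin) * w
    grad = -y * (1.0 / (1.0 + np.exp(margin))) * w
    return x + eps * np.sign(grad)

# Sample (x, y_true) from the clean domain, output (x_adv, y_true).
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.5])
x_adv = fgsm(x, 1.0, w, eps=0.1)  # each coordinate moves by +/- eps
```

Because the perturbation is `eps * sign(gradient)`, every coordinate of $x^{adv}$ differs from $x$ by exactly $\pm\epsilon$, i.e., FGSM takes a single step to the boundary of the $\ell_\infty$ ball of radius $\epsilon$.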
It is known that there is a large shift between the distributions of clean data and adversarial data in the high-level representation space. Assume that in the logit space, data from both the clean domain and the adversarial domain follow multivariate normal distributions, i.e., $\mathcal{D} \sim \mathcal{N}(\mu_{\mathcal{D}}, \Sigma_{\mathcal{D}})$ and $\mathcal{A} \sim \mathcal{N}(\mu_{\mathcal{A}}, \Sigma_{\mathcal{A}})$. Our goal is to learn a logit representation that minimizes this shift by aligning the covariance matrices and mean vectors of the clean distribution and the adversarial distribution.
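The alignment of mean vectors and covariance matrices can be sketched as a penalty on batches of clean and adversarial logits, in the spirit of CORAL-style moment matching. This is a minimal sketch under the stated Gaussian assumption; the exact loss and weighting used by ATDA are defined later and may differ.

```python
import numpy as np

def logit_alignment_loss(logits_clean, logits_adv):
    """Distribution-shift penalty between clean and adversarial logits,
    comparing first moments (means) and second moments (covariances).

    logits_clean, logits_adv: arrays of shape (batch, num_classes).
    Returns ||mu_D - mu_A||^2 + ||Sigma_D - Sigma_A||_F^2.
    """
    mu_d = logits_clean.mean(axis=0)          # mean vector of D in logit space
    mu_a = logits_adv.mean(axis=0)            # mean vector of A in logit space
    cov_d = np.cov(logits_clean, rowvar=False)  # covariance of D
    cov_a = np.cov(logits_adv, rowvar=False)    # covariance of A
    mean_gap = np.linalg.norm(mu_d - mu_a) ** 2
    cov_gap = np.linalg.norm(cov_d - cov_a, ord='fro') ** 2
    return mean_gap + cov_gap
```

Minimizing this quantity with respect to the network parameters pulls the two logit distributions toward the same mean and covariance; the loss is zero exactly when both moments match.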