Adversarially-Trained Deep Nets Transfer Better

Machine Learning

What is the practical impact?

We discovered a novel way to use adversarially-trained deep neural networks (DNNs) in the context of transfer learning to quickly achieve higher accuracy on image classification tasks — even when limited training data is available. You can read our paper at https://arxiv.org/abs/2007.05869.

What is adversarial training?

Adversarial training modifies the typical DNN training procedure by adding an adversarial perturbation δᵢ on each input image xᵢ with the objective of generating DNNs robust to adversarial attacks. In particular, for a DNN with m input images, a loss function ℓ(), a model-predicted response h() parametrized by θ, and the true label for the iᵗʰ image yᵢ, the optimization objective of adversarial training is described mathematically in Equation 1. For more details, please see Section 3: “Explaining the Adversarial Training Process” in our paper.

Equation 1: Optimization objective in adversarial training
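
The equation image from the original post is not reproduced here. For reference, a standard way to write this objective, using the variables defined above and a perturbation budget ε on each δᵢ, is the min-max formulation below; see Section 3 of our paper for the exact statement.

```latex
\min_{\theta} \; \frac{1}{m} \sum_{i=1}^{m} \; \max_{\|\delta_i\| \le \varepsilon} \; \ell\big(h(x_i + \delta_i;\, \theta),\, y_i\big)
```

In practice, the inner maximization over each δᵢ is usually approximated with a few steps of projected gradient descent (PGD). The sketch below shows one common ℓ2-bounded PGD variant in PyTorch; the step size, number of steps, and ε value are illustrative defaults rather than the exact settings used in our paper.

```python
import torch
import torch.nn.functional as F

def l2_pgd_perturbation(model, x, y, eps=3.0, alpha=0.5, steps=7):
    """Approximate the inner maximization of Equation 1 with l2-bounded PGD."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend along the normalized gradient direction.
        grad_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = delta.detach() + alpha * grad / grad_norm
        # Project back onto the l2 ball of radius eps around the clean image.
        delta_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = (delta * (eps / delta_norm).clamp(max=1.0)).requires_grad_(True)
    return delta.detach()

# Adversarial training then takes a normal optimizer step on the loss at
# x + delta (the outer minimization over theta in Equation 1).
```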

What is transfer learning in the context of DNNs?

Transfer learning typically trains DNNs on a rich dataset like ImageNet and then re-trains (fine-tunes) some of the last layers on the target dataset. Empirically, this method yields higher accuracy than training DNNs from scratch, as first shown by Yosinski et al. in How transferable are features in deep neural networks? [2]. This higher accuracy is also attained much more quickly than when training from scratch.

What motivated us?

Two key insights are explained in Tsipras et al.'s Robustness may be at odds with accuracy [1]:

  1. Adversarially-trained DNNs typically have lower test accuracy than naturally-trained models. However,
  2. Adversarially-trained DNNs tend to have more humanly-aligned features, as we explain in the "Why does this work?" section below. Thus, we wondered:

Is it possible that these humanly-aligned representations will give adversarially-trained models an advantage over naturally-trained ones when they’re fine-tuned to new datasets?

What is our experiment and what are our results?

We comprehensively study the test accuracy obtained from naturally fine-tuning ResNet-50 models originally trained on ImageNet either adversarially or naturally. After fine-tuning 14,400 models over 6 different variables, we concluded that adversarially-trained DNNs learn faster and with less data than naturally-trained ones. In some cases, we see a 4x reduction in training images and a 10x speedup in training to reach the same accuracy benchmark as a naturally-trained source model.

Figures 1(a) and 2(a) show the test accuracy on each of the target datasets as a function of the number of training images and training epochs used for fine-tuning, respectively. Figures 1(b) and 2(b) show the test accuracy delta, defined as the robust minus the natural test accuracy on the target datasets. Both (a) and (b) are consistently colored by dataset, the shaded area shows the 95% confidence interval, and the points are the mean over multiple random seeds. For more details, please see Section 5: “Results and Discussions” in our paper.

Figure 1: Adversarially-trained models transfer better with less data than naturally-trained ones

Figure 2: Adversarially-trained models learn faster than naturally-trained ones

Has anyone else studied this?

To the best of our knowledge, there are only two papers that study this phenomenon directly:

  1. In Adversarially Robust Transfer Learning [3], Shafahi et al. found that adversarially-trained models transferred worse than naturally-trained ones. However, this seemingly contradictory conclusion can be explained by the fact that they used a much larger robustness level than we did: they use ε=5 while we use ε=3.

  2. Microsoft Research and MIT came to a similar conclusion in their very recent paper Do Adversarially Robust ImageNet Models Transfer Better? [4] and a related blog post by Salman et al. They focus on the effects of different network architectures, fixed-feature transfer, and a comparison of adversarial robustness to texture robustness.

Why does this work?

Figure 3: Image of a German Shepherd dog

To give some insight into why robust transfer learning works, we can look to human perception. When we perceive objects in real life, we infer the label of an object from its semantic properties. For instance, when given the task of identifying the object in Figure 3, most people recognize that the tall ears, elongated snout, and brown fur pattern suggest that the object is a German Shepherd dog. What makes this method of cognition powerful is that it allows humans to learn to recognize a vast number of objects. After learning to spot German Shepherds, one may have an enhanced ability to recognize wolves, since the two share visual properties.

Consider feeding the dog image in Figure 3 into a naturally-trained DNN and into an adversarially-trained DNN. How does each network internally represent this image? This can be visualized in Figure 4, which shows the sensitivity of the loss function to small changes in the pixels of the input image of the dog. The loss function measures how "good" our model's predictions are. The left image shows a highly irregular and non-smooth representation, while the right one shows a representation that is humanly recognizable as a dog. Intuitively, we can say that the adversarially-trained model sees the forest (i.e., features such as ears, eyes, arms, etc.), while the naturally-trained model is lost in the trees (i.e., the image pixels).

Figure 4: Adversarially-trained models contain humanly-aligned representations. These images were created using the sensitivity of the loss function to small changes in the pixels of the input image for the naturally-trained (left) and adversarially-trained (right) DNN models.
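
For intuition, a sensitivity map of this kind can be computed as the gradient of the loss with respect to the input pixels. The snippet below is a minimal sketch assuming a standard PyTorch classifier; the exact visualization pipeline used for Figure 4 may differ.

```python
import torch
import torch.nn.functional as F

def loss_sensitivity_map(model, image, label):
    """Gradient of the loss w.r.t. input pixels: a per-pixel sensitivity map."""
    model.eval()
    image = image.clone().detach().requires_grad_(True)  # shape (1, 3, H, W)
    loss = F.cross_entropy(model(image), label)          # label: shape (1,)
    loss.backward()
    # Pixels with large-magnitude gradients are the ones the loss is most
    # sensitive to; for robust models these tend to align with object features.
    return image.grad.detach().abs()
```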

We can also investigate this phenomenon from the perspective of influence functions, as described in Koh and Liang's paper Understanding black-box predictions via influence functions [5], which essentially asks:

Which images from the training dataset are most helpful for classifying this input image?

Since adversarially-trained models contain humanly-aligned features, we expected the most influential images for an input image to actually look similar to that input image. We obtained results that are consistent with this theory.

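As a point of reference, the influence of a training example z on the loss at a test example z_test is defined in [5] as

```latex
\mathcal{I}_{\mathrm{up,loss}}(z, z_{\mathrm{test}})
  = -\nabla_{\theta} L(z_{\mathrm{test}}, \hat{\theta})^{\top}
     H_{\hat{\theta}}^{-1}\,
     \nabla_{\theta} L(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_i, \hat{\theta}),
```

where θ̂ denotes the trained parameters and H is the Hessian of the empirical training loss (in practice, the Hessian-inverse vector products are approximated rather than computed exactly). The training images with the largest influence on a given test image are the ones visualized in Figures 5 and 6.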

Figure 5: Adversarially-trained source models have more intuitive influential images in the target dataset. The top row contains the test images, one from each of the 10 categories. The remaining rows contain the most influential images in the training set for the pre-trained robust and pre-trained natural models, respectively.

Figure 5 shows a collection of test images for a transferred DNN, along with the most influential training images in the target dataset for both adversarially- and naturally-trained source models. It is clear that the robust models produce influential images that appear more similar to the test image than those of the naturally-trained source models. Figure 6 extends the notion shown in Figure 5 to quantitatively measure how much better the adversarially-trained DNN is compared to the naturally-trained DNN. In particular, it shows that highly influential images get classified correctly more often in robust models. These results show that similar-looking images get grouped together in the feature space of adversarially-trained source models, indicating that adversarially-trained models have internal feature representations better aligned with human cognition. For more details, please see Section 6: "Interpreting Representations using Influence Functions" in our paper.

Figure 6: The classes of the top influential images for adversarially-trained source models match the classes of the target images more often than those for naturally-trained ones. (a) shows the proportion of top-1 through top-100 influential images that match the target when using training images as targets, while (b) shows the same using testing images as targets.

How can you use this?

In contrast to many other complicated and non-standard training procedures, our adversarial transfer learning procedure is easy to implement.

The first step is to acquire an adversarially-trained ImageNet model, either from a third party or by training from scratch with adversarial SGD (for more details, please see Section 3: "Explaining the Adversarial Training Process" in our paper). We recommend using the models from the robustness library linked in the README.txt. Next, re-initialize the last fully-connected layer and then fine-tune the model on the target dataset as described in Section 4 of our paper.

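As a rough illustration of this procedure, here is a minimal PyTorch sketch. The checkpoint filename, the number of target classes, the layers chosen for freezing, and the optimizer settings are placeholders to adapt; the exact way to load a robust checkpoint depends on where it came from (the robustness library, for example, ships its own loading helpers).

```python
import torch
import torch.nn as nn
from torchvision import models

# 1) Load a ResNet-50 and restore adversarially pre-trained ImageNet weights.
#    (Adjust the loading step to match your checkpoint's key layout.)
model = models.resnet50(num_classes=1000)
state_dict = torch.load("robust_resnet50_imagenet.pt", map_location="cpu")
model.load_state_dict(state_dict)

# 2) Re-initialize the last fully-connected layer for the target dataset.
num_target_classes = 10  # placeholder: use your target dataset's class count
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

# 3) Optionally freeze earlier layers when training data is scarce,
#    fine-tuning only the last residual block and the new classifier head.
for name, param in model.named_parameters():
    if not (name.startswith("layer4") or name.startswith("fc")):
        param.requires_grad = False

# 4) Fine-tune naturally (no adversarial perturbations) on the target dataset.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=0.01, momentum=0.9, weight_decay=1e-4,
)
criterion = nn.CrossEntropyLoss()
# for epoch in range(num_epochs):
#     for images, labels in target_loader:  # target_loader: your DataLoader
#         optimizer.zero_grad()
#         loss = criterion(model(images), labels)
#         loss.backward()
#         optimizer.step()
```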

Lastly, we’d like to share some tips and tricks learned as we fine-tuned more than 14,000 models:

  • If you have less training data, fine-tune fewer layers to avoid overfitting
  • Consider using our hyperparameters as a starting point and performing a grid search to improve accuracy by ~1–3% (see the sketch after this list)
  • Adversarially-trained models using an ℓ2-norm perturbation constraint worked best for us
  • Keep in mind that this method should work fairly well even in the low data regime

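As an illustration of the grid-search tip above, here is a hypothetical sketch. fine_tune_and_evaluate is a placeholder for your own fine-tuning and validation routine, and the grid values below are examples only, not the hyperparameters from our paper.

```python
from itertools import product

# Example grid only; in practice, start from the paper's hyperparameters.
learning_rates = [0.1, 0.03, 0.01, 0.003]
weight_decays = [1e-4, 5e-4]

best_acc, best_config = 0.0, None
for lr, wd in product(learning_rates, weight_decays):
    # fine_tune_and_evaluate is a hypothetical helper that fine-tunes the
    # robust source model with these settings and returns validation accuracy.
    acc = fine_tune_and_evaluate(lr=lr, weight_decay=wd)
    if acc > best_acc:
        best_acc, best_config = acc, (lr, wd)

print(f"Best: lr={best_config[0]}, weight_decay={best_config[1]} ({best_acc:.2%})")
```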

To see the code associated with our research, check out our GitHub.

What's next?

From a practical standpoint, we're interested in finding ways to further increase the transfer learning accuracy and/or reduce the computational expense required to train the source model adversarially. Different source models and architectures could have a big impact on the behavior of the transfer learning process. Also, while ImageNet has become a common source dataset in transfer learning, it has many issues that might be at odds with transfer learning, including overlapping labels in images, insufficient dataset size, low-resolution photos, and outdated images (by today's standards). Some datasets have attempted to address these concerns. In 2019, Tencent released a publicly available dataset containing 18 million images and 11 thousand classes, making it the largest annotated image dataset in the world.

From a theoretical standpoint, we should explore why this phenomenon occurs in the first place. Even though we’ve looked at robust transfer learning behavior through the lens of influence functions, we cannot definitively explain why adversarially-trained models transfer better. It’s possible that this phenomenon is related to representation-based learning and/or semi-supervised learning. Perhaps we can leverage some of the theory in these related fields to gain an even deeper understanding that explains why adversarially-trained DNNs transfer better.

理论上讲 ,我们应该首先探讨为什么会出现这种现象。 即使我们已经通过影响函数的角度研究了稳健的转移学习行为,我们也不能确切地解释为什么经过对抗训练的模型可以更好地转移。 这种现象可能与基于表示的学习和/或半监督学习有关。 也许我们可以利用这些相关领域中的某些理论来获得更深入的理解,这可以解释为什么经过对抗训练的DNN可以更好地进行传递。

Acknowledgments

I’d like to thank my blog co-author Evan Kravitz, as well as helpful comments and edits from Benjamin Erichson, Rajiv Khanna, and Michael Mahoney.

Sources

[1] Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, and Aleksander Madry. Robustness may be at odds with accuracy. ICLR (2019). Relevance to our work: Adversarially-trained models underperform naturally-trained ones when they're evaluated on the same source dataset (i.e. not transferred). However, "the representations learned by robust models tend to align better with salient data characteristics and human perception".

[2] Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. How transferable are features in deep neural networks? NeurIPS (2014). Relevance to our work: Landmark transfer learning paper finding that "initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset".

[3] Ali Shafahi, Parsa Saadatpanah, Chen Zhu, Amin Ghiasi, and Christoph Studer. Adversarially Robust Transfer Learning. ICLR (2020). Relevance to our work: First paper studying transfer learning of adversarially-trained DNNs, showing that DNNs trained with a high tolerance for adversarial perturbations (i.e. ε=5) do not transfer as well as naturally-trained models [Table 3].

[4] Hadi Salman, Andrew Ilyas, Logan Engstrom, Ashish Kapoor, and Aleksander Madry. Do Adversarially Robust ImageNet Models Transfer Better? arXiv (2020). Relevance to our work: Most recent work further validating our conclusion that adversarially-trained models transfer better, analyzing different network architecture widths, fine-tuning of the entire DNN, and fixed-feature transfer.

[5] Pang Wei Koh and Percy Liang. Understanding black-box predictions via influence functions. ICML (2017). Relevance to our work: Provides the influence-function framework we used to better understand why adversarially-trained models transfer better.

Translated from: https://medium.com/towards-artificial-intelligence/adversarially-trained-deep-nets-transfer-better-af54c82580f6
