On the Robustness of Semantic Segmentation Models to Adversarial Attacks: Paper Notes

Abstract

In this paper, we present, using two large-scale datasets, the first rigorous evaluation of adversarial attacks on modern semantic segmentation models. We analyse the effect of different network architectures, model capacity and multiscale processing, and show that many of the observations made on the classification task do not always transfer to this more complex task. Furthermore, we show how mean-field inference in deep structured models and multiscale processing naturally implement recently proposed adversarial defenses. Our observations will aid future efforts in understanding and defending against adversarial examples. Moreover, in the shorter term, we show which segmentation models should currently be preferred in safety-critical applications due to their inherent robustness.

1 Introduction

6 Multiscale Processing and the Transferability of Adversarial Examples

6.1 Multiscale Processing

The Deeplab v2 network processes the image at three different resolutions (50%, 75% and 100%), with weights shared across the scale branches. The result from each scale is upsampled to a common resolution and then max-pooled, such that the most confident prediction at each pixel from each of the scale branches is chosen [15]. The network is trained in this multiscale manner, although it is also possible to perform this multiscale ensembling as a post-processing step at test time only [14, 19, 42, 67].
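As a rough PyTorch-style sketch of this scale ensembling (not the authors' implementation; `model`, the tensor shapes and the assumption of a model returning per-pixel class logits are illustrative), the fusion step looks roughly like:

```python
import torch
import torch.nn.functional as F

def multiscale_predict(model, image, scales=(0.5, 0.75, 1.0)):
    """Run the (shared-weight) model on resized copies of the input, upsample each
    prediction to the common (original) resolution, and max-pool the class scores so
    that the most confident prediction per pixel across scale branches is kept."""
    _, _, h, w = image.shape
    fused = None
    for s in scales:
        resized = F.interpolate(image, scale_factor=s, mode='bilinear',
                                align_corners=False)
        logits = model(resized)                      # (N, C, s*h, s*w) class scores
        logits = F.interpolate(logits, size=(h, w), mode='bilinear',
                               align_corners=False)  # back to the common resolution
        fused = logits if fused is None else torch.max(fused, logits)
    return fused.argmax(dim=1)                        # per-pixel class labels
```

Replacing `torch.max` with a mean over the scale branches gives the average-pooling variant mentioned in Sec. 6.3.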

We hypothesise that adversarial attacks performed at one scale do not transfer well to another scale, since CNNs are not invariant to scale or to a range of other transformations [24, 35]. Although adversarial examples can be produced from inputs at multiple scales, such examples may be less effective at any single scale, which makes networks that process their input at multiple scales more robust. In Sec. 6.2 we study the transferability of adversarial perturbations generated at one scale and evaluated at another. In Sec. 6.3 we study the robustness of multiscale networks and the transferability of their perturbations. Thereafter, we relate our findings to concurrent work.

6.2 Transferability of Adversarial Examples Across Different Scales

Table 1 shows the results of the FGSM and Iterative FGSM ll attacks. The diagonal shows "white-box" attacks, where the adversarial example is generated from the network being attacked. As expected, these attacks usually cause the largest drop in performance. The off-diagonal entries show the transferability of perturbations produced by other networks. FGSM attacks transfer to other networks far better than Iterative FGSM ll, confirming observations made in the context of image classification [41].
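For reference, a minimal sketch of the two attacks evaluated in Tab. 1, written for dense per-pixel prediction (hypothetical helper names; assumes inputs in [0, 1] and a model returning per-pixel class logits, not the paper's code):

```python
import torch
import torch.nn.functional as F

def fgsm(model, image, label, eps):
    """Single-step FGSM: perturb the input by eps in the direction that
    increases the per-pixel cross-entropy loss."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    return (image + eps * image.grad.sign()).clamp(0, 1).detach()

def iterative_fgsm_ll(model, image, eps, steps=10, alpha=1.0 / 255):
    """Iterative FGSM ll: repeatedly step towards the least-likely class
    predicted for the clean image, projecting back into an eps l_inf ball."""
    with torch.no_grad():
        ll_target = model(image).argmin(dim=1)          # least-likely class per pixel
    x_adv = image.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), ll_target)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv - alpha * x_adv.grad.sign()   # move towards the target class
            x_adv = image + (x_adv - image).clamp(-eps, eps)
            x_adv = x_adv.clamp(0, 1)
        x_adv = x_adv.detach()
    return x_adv
```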

[Table 1: transferability of FGSM and Iterative FGSM ll perturbations across networks and input scales]

Attacks generated from 50%-resolution inputs transfer poorly to Deeplab v2 at other scales and to other architectures, and vice versa. This can be seen by examining the columns and rows of Tab. 1 respectively. All of the other models, FCN (VGG and ResNet) and Deeplab v2 VGG, were trained at 100% resolution, and Tab. 1 shows that perturbations transfer best between the multiscale and 100%-resolution Deeplab v2 models. This supports the hypothesis that adversarial attacks produced at one scale are less effective at another, since CNNs are not scale invariant (the network activations change substantially).
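The off-diagonal entries discussed here correspond to a black-box transfer evaluation. A sketch of that protocol, reusing the hypothetical `fgsm()` helper from the sketch above:

```python
import torch

def transfer_attack_eval(source_model, target_model, image, label, eps):
    """Black-box transfer: craft the perturbation against source_model, then
    measure how the same perturbed image degrades target_model
    (one off-diagonal entry of Tab. 1)."""
    x_adv = fgsm(source_model, image, label, eps)   # attack crafted on the source network
    with torch.no_grad():
        clean_pred = target_model(image).argmax(dim=1)
        adv_pred = target_model(x_adv).argmax(dim=1)
    return clean_pred, adv_pred   # compare each against the ground truth with mean IoU
```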

6.3 Multiscale Networks and Adversarial Examples

The multiscale version of Deeplab v2 is the most robust to white-box attacks (Tab. 1, Fig. 2) and to perturbations produced by the single-scale networks. Moreover, the attacks it produces also transfer best to other networks, as shown by the bold entries. This is presumably because attacks generated from this model are produced from multiple input resolutions simultaneously. For the Iterative FGSM ll attack, only perturbations from the multiscale version of Deeplab v2 transfer well to other networks, achieving IoU ratios similar to those of white-box attacks. However, this only holds when attacking Deeplab at other scales. Although perturbations from multiscale Deeplab v2 transfer better to the FCNs than those produced from single-scale inputs, they remain far from the effectiveness of a white-box attack (an IoU ratio of 15.2% on FCN-VGG and 26.4% on FCN ResNet).
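The IoU ratio quoted above is, as we understand it, the mean IoU obtained under attack expressed as a percentage of the mean IoU on clean inputs. A sketch of that metric (illustrative helper names; integer label maps with classes below `num_classes` assumed):

```python
import numpy as np

def mean_iou(preds, labels, num_classes, ignore_index=255):
    """Mean intersection-over-union accumulated over a list of
    (prediction, ground-truth) label maps, given as integer numpy arrays."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for pred, label in zip(preds, labels):
        mask = label != ignore_index
        conf += np.bincount(num_classes * label[mask] + pred[mask],
                            minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(0) + conf.sum(1) - inter
    iou = np.where(union > 0, inter / np.maximum(union, 1), np.nan)
    return float(np.nanmean(iou))

def iou_ratio(adv_miou, clean_miou):
    """IoU ratio (%): performance under attack relative to clean performance."""
    return 100.0 * adv_miou / clean_miou
```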

Adversarial perturbations generated from multiscale inputs to FCN8 (which was likewise trained only at a single scale) behave similarly: FCN8 with multiscale inputs is more robust to white-box attacks, and its perturbations transfer better to other networks. This suggests that the observations in Tab. 1 are not a property of how the network was trained, but rather of the fact that CNNs are not scale invariant. Furthermore, an alternative to max-pooling the predictions at each scale is to average them; average-pooling produces similar results to max-pooling. Details of these experiments, along with results using different attacks and $l_{\infty}$ norms ($\epsilon$ values), are presented in the supplementary material.

6.4 Transformations of Adversarial Examples

As noted by Lu et al. [46], adversarial examples do not transfer well across different scales and transformations. Since CNNs are not invariant to many types of transformation, including scale [24], adversarial examples that undergo them will not be as malicious, because the activations of the network change greatly compared to the original input. Although we have shown that networks are more susceptible to multiscale black-box perturbations, there may be other transformations that are harder to model. This effectively makes it more challenging to produce physical adversarial examples in the real world [47], which can be processed from a wide range of viewpoints and camera distortions.

6.5 Relation to Other Defense Mechanisms

Our observations relate to the "random resizing" defense of [61], proposed in concurrent work. Here, the input image is randomly resized and then classified. This defense exploits (but does not attribute its efficacy to) the fact that CNNs are not scale invariant and that the adversarial examples were only generated at the original scale. We hypothesise that this defense could be defeated by creating adversarial attacks from multiple scales, as done in this work and concurrently in [3].
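A rough sketch of the random-resizing idea as described here, adapted to segmentation by mapping the prediction back to the input resolution (not the implementation of [61]; the names and scale range are assumptions):

```python
import random
import torch.nn.functional as F

def random_resize_defense(model, image, scale_range=(0.8, 1.2)):
    """Randomly rescale the input before inference. Because CNNs are not
    scale invariant, a perturbation crafted at the original scale loses
    some of its effect after resizing."""
    s = random.uniform(*scale_range)
    resized = F.interpolate(image, scale_factor=s, mode='bilinear',
                            align_corners=False)
    logits = model(resized)
    # map the prediction back to the original resolution for dense labelling
    return F.interpolate(logits, size=image.shape[-2:], mode='bilinear',
                         align_corners=False)
```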

Adversarial attacks are a major concern in deep learning, as they can cause misclassification and undermine the reliability of deep learning models. In recent years, researchers have proposed several techniques to improve the robustness of models against such attacks, including:

1. Adversarial training: generate adversarial examples during training and use them to augment the training data, so the model learns to be more robust to adversarial attacks (see the sketch below).
2. Defensive distillation: train a second model to mimic the behavior of the original model and use it to make predictions, making it more difficult for an adversary to generate adversarial examples that fool the model.
3. Feature squeezing: convert the input data to a lower dimensionality, making it more difficult for an adversary to generate adversarial examples.
4. Gradient masking: add noise to the gradients during training to prevent an adversary from estimating them accurately and generating adversarial examples.
5. Adversarial detection: train a separate model to detect adversarial examples and reject them before they can be used to fool the main model.
6. Model compression: reduce the complexity of the model, making it more difficult for an adversary to generate adversarial examples.

Improving the robustness of deep learning models against adversarial attacks remains an active area of research, and new techniques and approaches continue to be developed.
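Of the defenses listed above, adversarial training is the one most directly connected to the attacks studied in this paper. A minimal sketch of one such training step, reusing the hypothetical `fgsm()` helper from Sec. 6.2 (the equal clean/adversarial loss weighting is an illustrative choice):

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, image, label, eps=8 / 255):
    """One training step on a mix of clean and FGSM-perturbed inputs, so that
    the model sees adversarial examples during training."""
    model.eval()                               # craft the attack on the current weights
    x_adv = fgsm(model, image, label, eps)     # fgsm() as sketched in Sec. 6.2
    model.train()
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(image), label) \
        + 0.5 * F.cross_entropy(model(x_adv), label)
    loss.backward()
    optimizer.step()
    return loss.item()
```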