Can a Neural Network Be Fooled?

Well, not that I want to mock AI any further, given we may all be slaves to the mighty Skynet one day, but I came across a method to generate adversarial examples using the Fast Gradient Sign Method (FGSM) attack, described in Explaining and Harnessing Adversarial Examples by Goodfellow et al. We are going to look at the first, and one of the most famous, attacks.


Let's begin with Adversarial Examples

Machine learning models, including state-of-the-art neural networks, are vulnerable to adversarial examples. That is, these machine learning models misclassify examples that are only slightly different from correctly classified examples drawn from the data distribution.


These examples are created with the purpose of making a network misclassify. They are indistinguishable to the human eye, but a neural network fails to classify them correctly. As you can see in the figure above, we started with a picture of a Labrador classified with a confidence of 41%. Upon adding some noise, our prediction moved to Saluki. This noise is known as a perturbation. We are going to use the Fast Gradient Sign Method (FGSM).


But First, What is the Fast Gradient Sign Method (FGSM)?

The fast gradient sign method works by using the gradients of the neural network to create an adversarial example. For an input image, the method uses the gradient of the loss with respect to the input image to create a new image that maximizes the loss. This new image is called the adversarial image. We can summarize it with the following:


adv_x = x + ε · sign(∇_x J(θ, x, y))

Here x is the original image, ε is a variable that tunes the size of the perturbation, θ are the model parameters, y is the true label, and ∇_x J(θ, x, y) is the gradient of the loss with respect to the input image.

We take the gradient with respect to the input because we need to create the image that maximizes the loss. We can see how much each pixel contributes to the loss and use that to create the perturbation we add to it. This is pretty fast, since we are not training our model. We are just using the gradient to create a new image and predicting again to see the result.


So Let's start and try fooling one.

We will try to fool the MobileNetV2 model, pretrained on ImageNet. Let's move ahead with some important modules while implementing this. For the source code you can refer here or here.

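A minimal sketch of this setup, assuming TensorFlow/Keras and its bundled ImageNet-pretrained MobileNetV2 (the helper names preprocess and get_imagenet_label are illustrative, not fixed by the original post), might look like this:

```python
import tensorflow as tf

# Load MobileNetV2 pretrained on ImageNet; we only query it, so freeze the weights.
pretrained_model = tf.keras.applications.MobileNetV2(include_top=True,
                                                     weights='imagenet')
pretrained_model.trainable = False

# Helper to turn the model's output probabilities into a human-readable label.
decode_predictions = tf.keras.applications.mobilenet_v2.decode_predictions

def preprocess(image):
    # Resize to the 224x224 input MobileNetV2 expects and scale pixels to [-1, 1].
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, (224, 224))
    image = tf.keras.applications.mobilenet_v2.preprocess_input(image)
    return image[None, ...]  # add a batch dimension

def get_imagenet_label(probs):
    # Return (label, confidence) of the top-1 prediction.
    _, label, confidence = decode_predictions(probs, top=1)[0][0]
    return label, confidence
```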

Creating the Adversarial Image

We need a perturbation to add to the image in order to create one. To get the perturbation, we will use gradients. You can refer to the code below:


Implementing the discussed strategy to get the perturbation
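A minimal sketch of this step, assuming the TensorFlow/Keras setup above (the function name create_adversarial_pattern is illustrative), could look like:

```python
loss_object = tf.keras.losses.CategoricalCrossentropy()

def create_adversarial_pattern(input_image, input_label):
    # Watch the input image so the tape can differentiate the loss w.r.t. it.
    with tf.GradientTape() as tape:
        tape.watch(input_image)
        prediction = pretrained_model(input_image)
        loss = loss_object(input_label, prediction)

    # Gradient of the loss w.r.t. the input image ...
    gradient = tape.gradient(loss, input_image)
    # ... keeping only its sign gives the FGSM perturbation direction.
    signed_grad = tf.sign(gradient)
    return signed_grad
```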

Now let's add these and see the result for ourselves

The code below adds the perturbation to the image with various values of epsilon. Read through the code and its comments.

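A rough sketch of that loop, reusing the hypothetical helpers above and assuming image is a preprocessed input tensor with its one-hot label label, might be:

```python
import matplotlib.pyplot as plt

# Sign of the gradient for our (assumed) preprocessed image and one-hot label.
perturbations = create_adversarial_pattern(image, label)

def display_images(adv_image, description):
    # Predict on the adversarial image and undo the [-1, 1] scaling for display.
    class_name, confidence = get_imagenet_label(pretrained_model.predict(adv_image))
    plt.figure()
    plt.imshow(adv_image[0] * 0.5 + 0.5)
    plt.title('{}\n{} : {:.2f}% confidence'.format(description,
                                                   class_name,
                                                   confidence * 100))
    plt.show()

# epsilon = 0 is the unmodified image; larger values make the noise more visible.
epsilons = [0, 0.01, 0.1, 0.15]

for eps in epsilons:
    # adv_x = x + epsilon * sign(gradient), clipped back to the valid [-1, 1] range.
    adv_x = image + eps * perturbations
    adv_x = tf.clip_by_value(adv_x, -1, 1)
    display_images(adv_x, 'Epsilon = {:0.3f}'.format(eps))
```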

We will get something like the following after running the above piece of code. Remember, this does not require any training.


[Image: output of the display image function for each epsilon value]

Okay, That was quick…

So this was quick and easy to learn and implement. The whole idea was to understand the concept of creating attacks. You can read about the various other attacks and implement them.


If you have any issues, feel free to drop us a mail or a response, and we will be glad to help you out.


Translated from: https://medium.com/thenoobengineer/can-a-neural-network-be-fooled-c75ed37ceb8

