Neural networks are black boxes: peering inside the black box to trick a neural network

This article looks at why neural networks are regarded as black-box models and shows how they can be tricked using specific techniques. It is a translation of the original article, and its aim is to build an understanding of how neural networks work internally.


Neural networks get a bad reputation for being black boxes. And while it certainly takes creativity to understand their decision making, they are really not as opaque as people would have you believe.


In this tutorial, I’ll show you how to use backpropagation to change the input so as to classify it as whatever you would like.

Follow along using this colab.


(This work was co-written with Alfredo Canziani ahead of an upcoming video)


Humans as black boxes

Let’s consider the case of humans. If I show you the following input:


[Image: an ambiguous handwritten digit]

there’s a good chance you have no idea whether this is a 5 or a 6. In fact, I believe that I could even make a case for convincing you that this might also be an 8.


Now, if you asked a human what they would have to do to make this look more like a 5, they might visually do something like this:

[Image: the digit edited by a human to look more like a 5]

And if I wanted you to make this more into an 8, you might do something like this:


[Image: the digit edited by a human to look more like an 8]

Now, the answer to this question is not easy to explain in a few if statements or by looking at a few coefficients (yes, I’m looking at you, regression). Unfortunately, with certain types of inputs (images, sound, video, etc…) explainability certainly becomes much harder, but not impossible.

Asking the neural network

How would a neural network answer the same questions I posed above? Well, to answer that, we can use gradient ascent.

Here’s how the neural network thinks we would need to modify the input to make it more into a 5.


[Image: the gradients the network suggests for turning the input into a 5]

There are two interesting results from this. First, the black areas are where the network thinks we need to remove pixel density. Second, the yellow areas are where it thinks we need to add more pixel density.

We can take a step in that gradient direction by adding the gradients to the original image. We could of course repeat this procedure over and over again to eventually morph the input into the prediction we are hoping for.

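As a rough sketch of that loop (my own illustration, not code from the article; it assumes a trained classifier model that maps a flattened MNIST image to class scores, a starting image x, a target_class, and a hypothetical step size eta):

import torch

def morph_towards(model, x, target_class, steps=50, eta=0.1):
    # repeatedly nudge the image in the direction that increases the
    # predicted probability of target_class (gradient ascent on the input)
    x = x.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        prob = model(x.view(1, -1)).softmax(dim=-1)[0, target_class]
        prob.backward()                      # d(prob)/d(pixels) lands in x.grad
        with torch.no_grad():
            x = (x + eta * x.grad).detach()  # take a step in the gradient direction
    return x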

[Image: the input after taking a step in the gradient direction]

You can see that the black patch at the bottom left of the image is very similar to what a human might think to do as well.


[Image: Human adds black on the left corner. Network suggests the same.]

What about making the input look more like an 8? Here’s how the network thinks you would have to change the input.


[Image: the gradients the network suggests for turning the input into an 8]

The notable things, here again, are that there is a black mass at the bottom left and a bright mass around the middle. If we add this with the input we get the following result:


[Image: the input after adding the gradients for the 8 class]

In this case, I’m not particularly convinced that we’ve turned this 5 into an 8. However, we’ve made it less of a 5, and the argument to convince you this is an 8 would certainly be easier to win using the image on the right instead of the image on the left.

Gradients are your guides

In regression analysis, we look at coefficients to tell us about what we’ve learned. In a random forest, we can look at decision nodes.


In neural networks, it comes down to how creative we are at using gradients. To classify this digit, we generated a distribution over possible predictions.


This is what we call the forward pass.


[Image: During the forward pass we calculate a probability distribution over outputs]

In code it looks like this (follow along using this colab):


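The notebook screenshot is not reproduced here, so here is a minimal sketch of the same idea (assuming a classifier model such as the LitClassifier defined later, and a single MNIST image tensor x):

import torch

# forward pass: flatten the image and get one logit per digit class
logits = model(x.view(1, -1))           # shape: [1, 10]

# turn the logits into a probability distribution over the 10 classes
probs = torch.softmax(logits, dim=-1)   # rows sum to 1
print(probs)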

Now imagine that we wanted to trick the network into predicting “5” for the input x. Then the way to do this is to give it an image (x), calculate the predictions for the image and then maximize the probability of predicting the label “5”.

To do this we can use gradient ascent to calculate the gradient of the prediction at the 6th index (i.e. label = 5), call it p, with respect to the input x.

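In symbols (my notation, not taken from the original figure), writing p_5 for the predicted probability of the label “5”, x for the input image, and eta for a small step size:

\nabla_x p_5 = \frac{\partial p_5}{\partial x}, \qquad x \leftarrow x + \eta \, \nabla_x p_5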

To do this in code we feed the input x as a parameter to the neural network, pick the 6th prediction (because we have labels: 0, 1, 2, 3, 4, 5, …) and the 6th index means the label “5”.

Visually this looks like:


[Image: Gradient of the prediction of a “5” with respect to the input]

And in code:

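The code screenshot is not reproduced here; a minimal sketch under the same assumptions as above (a model, an image x, and the target label “5” at index 5) could be:

import torch
from torch import nn

# wrap the input as a parameter so autograd tracks gradients w.r.t. the pixels
x_param = nn.Parameter(x)

# forward pass: probability the model assigns to the label "5" (index 5)
prob_5 = model(x_param.view(1, -1)).softmax(dim=-1)[0, 5]

# backward pass: d(prob_5)/dx is accumulated into x_param.grad
prob_5.backward()

# one gradient-ascent step on the image itself
with torch.no_grad():
    x_step = x_param + 0.1 * x_param.grad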

When we call .backward(), the process that happens can be visualized by the previous animation.

Now that we calculated the gradients, we can visualize and plot them:

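A minimal way to do this (again a sketch, assuming the x and x_param.grad from the snippet above and matplotlib):

from matplotlib import pyplot as plt

# reshape the flat gradient back into a 28x28 image and show it next to the input
grad_img = x_param.grad.detach().view(28, 28).cpu()
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2)
ax1.imshow(x.detach().view(28, 28).cpu())
ax1.set_title('input')
ax2.imshow(grad_img)
ax2.set_title('d(prob_5)/dx')
plt.show()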

[Images: the input and its plotted gradient]

The above gradient looks like random noise because the network has not yet been trained… However, once we do train the network, the gradients will be more informative:


[Image: the gradients after training the network]

Automating this via Callbacks

This is a hugely helpful tool for illuminating what happens inside your network as it trains. In this case, we would want to automate this process so that it happens automatically during training.

For this, we’ll use PyTorch Lightning to implement our neural network:


import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class LitClassifier(pl.LightningModule):

    def __init__(self):
        super().__init__()
        # a single linear layer mapping the flattened 28 * 28 image to 10 logits
        self.l1 = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return torch.relu(self.l1(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        result = pl.TrainResult(loss)

        # enable the auto confused logit callback by tracking the last batch and logits
        self.last_batch = batch
        self.last_logits = y_hat.detach()

        result.log('train_loss', loss, on_epoch=True)
        return result

    def validation_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        result = pl.EvalResult(checkpoint_on=loss)
        result.log('val_loss', loss)
        return result

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.005)

The complicated code to automatically plot what we described here can be abstracted out into a Callback in Lightning. A callback is a small program that is called at the parts of training you might care about.

In this case, when a training batch is processed, we want to generate these images in case some of the inputs are confused.


import torch
from pytorch_lightning import Callback
from torch import nn




class ConfusedLogitCallback(Callback):


    def __init__(
            self,
            top_k,
            projection_factor=3,
            min_logit_value=5.0,
            logging_batch_interval=20,
            max_logit_difference=0.1
    ):
        super().__init__()
        self.top_k = top_k
        self.projection_factor = projection_factor
        self.max_logit_difference = max_logit_difference
        self.logging_batch_interval = logging_batch_interval
        self.min_logit_value = min_logit_value


    def on_train_batch_end(self, trainer, pl_module, batch, batch_idx, dataloader_idx):
        # show images only every 20 batches
        if (trainer.batch_idx + 1) % self.logging_batch_interval != 0:
            return


        # pick the last batch and logits
        x, y = batch
        try:
            logits = pl_module.last_logits
        except AttributeError as e:
            m = """please track the last_logits in the training_step like so:
                def training_step(...):
                    self.last_logits = your_logits
            """
            raise AttributeError(m)


        # only check when it has opinions (ie: the logit > 5)
        if logits.max() > self.min_logit_value:
            # pick the top two confused probs
            (values, idxs) = torch.topk(logits, k=2, dim=1)


            # care about only the ones that are at most eps close to each other
            eps = self.max_logit_difference
            mask = (values[:, 0] - values[:, 1]).abs() < eps


            if mask.sum() > 0:
                # pull out the ones we care about
                confusing_x = x[mask, ...]
                confusing_y = y[mask]


                mask_idxs = idxs[mask]


                pl_module.eval()
                self._plot(confusing_x, confusing_y, trainer, pl_module, mask_idxs)
                pl_module.train()


    def _plot(self, confusing_x, confusing_y, trainer, model, mask_idxs):
        from matplotlib import pyplot as plt


        confusing_x = confusing_x[:self.top_k]
        confusing_y = confusing_y[:self.top_k]


        x_param_a = nn.Parameter(confusing_x)
        x_param_b = nn.Parameter(confusing_x)


        batch_size, c, w, h = confusing_x.size()
        for logit_i, x_param in enumerate((x_param_a, x_param_b)):
            x_param = x_param.to(model.device)
            logits = model(x_param.view(batch_size, -1))
            logits[:, mask_idxs[:, logit_i]].sum().backward()


        # reshape grads
        grad_a = x_param_a.grad.view(batch_size, w, h)
        grad_b = x_param_b.grad.view(batch_size, w, h)


        for img_i in range(len(confusing_x)):
            x = confusing_x[img_i].squeeze(0).cpu()
            y = confusing_y[img_i].cpu()
            ga = grad_a[img_i].cpu()
            gb = grad_b[img_i].cpu()


            mask_idx = mask_idxs[img_i].cpu()


            fig, axarr = plt.subplots(nrows=2, ncols=3, figsize=(15, 10))
            self.__draw_sample(fig, axarr, 0, 0, x, f'True: {y}')
            self.__draw_sample(fig, axarr, 0, 1, ga, f'd{mask_idx[0]}-logit/dx')
            self.__draw_sample(fig, axarr, 0, 2, gb, f'd{mask_idx[1]}-logit/dx')
            self.__draw_sample(fig, axarr, 1, 1, ga * 2 + x, f'd{mask_idx[0]}-logit/dx')
            self.__draw_sample(fig, axarr, 1, 2, gb * 2 + x, f'd{mask_idx[1]}-logit/dx')


            trainer.logger.experiment.add_figure('confusing_imgs', fig, global_step=trainer.global_step)


    @staticmethod
    def __draw_sample(fig, axarr, row_idx, col_idx, img, title):
        im = axarr[row_idx, col_idx].imshow(img)
        fig.colorbar(im, ax=axarr[row_idx, col_idx])
        axarr[row_idx, col_idx].set_title(title, fontsize=20)

But… we’ve made it even easier with pytorch-lightning-bolts which you can simply install


pip install pytorch-lightning-bolts

and import the callback into your training code


from pl_bolts.callbacks.vision import ConfusedLogitCallback
from pytorch_lightning import Trainer

trainer = Trainer(callbacks=[ConfusedLogitCallback(1)])

Putting it all together

Finally, we can train our model and automatically generate images when logits are “confused”:

import os

from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from torchvision.datasets import MNIST

# data
dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())
train, val = random_split(dataset, [55000, 5000])

# model
model = LitClassifier()

# attach callback
trainer = Trainer(callbacks=[ConfusedLogitCallback(1)])

# train!
trainer.fit(model, DataLoader(train, batch_size=64), DataLoader(val, batch_size=64))

and TensorBoard will automatically show images that look like this:

[Images: examples of the confused-logit figures logged to TensorBoard]
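To view them, point TensorBoard at Lightning’s default log directory (assuming you kept the default logger), for example with tensorboard --logdir lightning_logs, and look under the Images tab.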

Summary


In summary: you learned how to look inside the black box using PyTorch, learned the intuition, wrote a callback in PyTorch Lightning, and automatically got your TensorBoard instance to plot questionable predictions.

Try it yourself with PyTorch Lightning and PyTorch Lightning Bolts.


(This article was written ahead of an upcoming video where I (William) and Alfredo Canziani show you how to code this from scratch.)

Translated from: https://towardsdatascience.com/peering-inside-the-blackbox-how-to-trick-a-neural-network-757c90a88a73
