Is a Neural Network Better Than Ash at Detecting Team Rocket? If So, How?

Our whole existence is a never-ending riddle. Are we the only ones in the universe? What’s the point of life? Is a neural network better than Ash at recognizing Team Rocket? The first two are non-trivial questions that keep many scientists and philosophers up at night. The last one, however, does not let me sleep. In this article, I’ll attempt to answer it.

These days I’ve taken some of my lockdown time to watch the first season of the Pokemon show (it is not like I need an excuse anyway). As I watched Ash and friends embark on their adventures capturing pocket monsters and becoming the very best (like no one ever was), I couldn’t help noticing that they never recognize Team Rocket when they wear any of their iconic costumes. I mean, come on people, Team Rocket is always there, on every step of your journey, and you’re telling me that you can’t notice them? That’s weird. But sure, that’s ok; Pokemon world rules, I guess.

But then, I thought, “Hmmm, wait a second, what about a neural network? Could a neural network be better than Ash and crew at identifying Team Rocket?” Well, probably. But I don’t have much to do these days, so let’s see it in action. Besides, sometimes the journey is better than the destination.

Before continuing, for those of you who have no idea who Team Rocket is, it is a trio — consisting of Jessie, James, and Meowth — who plays the main antagonists of the Pokemon anime. Its main goal is to steal Pikachu from Ash. For this project, I’m just considering Jessie and James.

Figure 1. Team Rocket: James, Jessie, and Meowth © 2020 Pokémon

In this article, I discuss the findings of my experiment in which I used two models to identify Team Rocket differently. The first one is an object detection model trained in Google Cloud AutoML to detect Jessie and James in an image. To use the model, I deployed it in a TensorFlow.js application. From there, we will be able to detect the nemesis team. The second model is a convolutional neural network (CNN) image classifier trained on TensorFlow to identify either Jessie or James.

However, just predicting these targets is a bit boring. To make things more interesting, I also wanted to know why the image classifier predicts the way it does. In other words, I’m interested in seeing what the networks see and why the image classifier classifies an image as either Jessie or James. So, I extended the TensorFlow.js app to also use the image classifier to plot the activation maps of the CNN layers at prediction time. That way, we will be able to determine the features the network uses to compute its prediction.

In the article, I will explain all the steps I took to train the models and build the web application. Let’s start.

Figure 2. Truly villainous © 2020 Pokémon

The Data

The dataset I’m using for the problem is very, very small. It consists of 87 images of Jessie and 71 of James. To train an object detection model on Google Cloud AutoML, Google recommends at least 50 images per label, so I’m good. For the image classifier, I’m using TensorFlow’s ImageDataGenerator. This generator applies a transformation to the dataset on each epoch and trains with this newly transformed data. Therefore, I’m using way more than just 158 images.

Caveat: I fully understand that 158 images are a joke for such a problem and that the model’s performance won’t be impressive. However, keep in mind this is a fun and experimental project, not something I intend to publish on NeurIPS or ICLR. Besides, I didn’t want to spend hours looking for images of Team Rocket.

The Problem

As I said, the main problem here is that Ash can’t recognize Team Rocket. Want to see the evidence? Check this out:

Even though they wave big “R” flags, Ash is completely clueless. Wow. What about this one?

Yep, it’s Team Rocket. Lastly, this one:

So, Ash isn’t the only one with the problem.

The Object Detection Model

The first of the two models I created is an object detector trained using Google Cloud’s AutoML Vision Object Detection. This service allows you to easily (and boringly) annotate an object detection dataset and train a model within a couple of clicks. The trained model can be exported and optimized for several inference engines such as TensorFlow.js, TensorFlow Lite, and TensorFlow.

The service is not free. However, Google provides a one-time credit voucher that should be enough for a small model.

Creating and annotating the dataset

The first step before training an AutoML Object Detection model is uploading the dataset and annotating it. To upload the data, access the dataset management page by searching for “dataset” in the Cloud Console search bar. Then, create a new dataset of type “object detection” (Figure 3) and upload the images to a bucket.

Figure 3. Create a new dataset in AutoML

Once uploaded and processed, go to the IMAGES tab to start the dreary process of annotating the dataset. In this context, annotating a dataset consists of drawing the bounding boxes (Figure 4) of the target object(s) on top of the image — not that fun.

Figure 4. Repeat this process 100+ times

With the dataset labeled, click the TRAIN tab to train the model.

Training the model

Training the model is done within three clicks. From the current screen, go to the TRAIN tab and click on TRAIN NEW MODEL. Then name the model, select “edge” (because we want to download it), choose the optimization target (I’m using higher accuracy) and the node hour budget, followed by clicking START TRAINING. Using the default node hour budget, and with my dataset’s size, the training took around four hours.

Figure 5. Define and train a model

Evaluating the model

The training has ended; welcome back. To evaluate the model’s metrics, go to the EVALUATE tab. There you will find the precision and recall scores and the option to calculate them at different threshold levels. My model achieved precision and recall of 100% and 93.75%, respectively (Figure 6). However, in practice, as we will soon see, the model is a bit worse. But that’s ok! Considering the size of the training set, I wouldn’t expect the model to be that great.

Figure 6. Model’s scores.

As for the last step, go to the TEST & USE tab to export the TensorFlow.js model (Figure 7). The exported directory should contain a model.json file (the model topology), a dict.txt file (the labels), and several .bin files (the weights). Now, let’s build an app around the model.

Figure 7. Export the model

Deploying the model in TensorFlow.js

To detect Team Rocket using the object detection model and to run the image classification model to present the activation maps, I built a web app that uses TensorFlow.js to load the models and predict with them. In this part, I’ll show how I loaded the object detection model, predicted with it, and drew the bounding boxes. Let’s start with the HTML:

In the head tags, I’m loading TensorFlow.js and AutoML Edge API, a package that loads and runs models produced with AutoML Edge. In the body, there is an input element used by the user to upload an image and two image tags. One displays the original image, and the second one displays the image with the bounding boxes. Then we call the JS script:

At the top of the script, we declare the model variable and its options. The options object specifies the score threshold, the Intersection over Union (IoU) threshold, and the max number of objects to return. Following it are the context of the canvas where we’ll draw the detections, the size of the output image, and the color of the detection overlay.

The first function, setupODCanvas(), sets up the object detection canvas. The second one, drawBoundingBoxes(), is responsible for drawing the bounding boxes. Then comes processInput(), the function that predicts. Here we are using the onchange event of the image input element to predict when the user selects an image. Once triggered, we get the image and use it as an argument to odModel.detect(), the method that detects the objects. After detecting, we draw the image and the bounding boxes on the canvas.

To run the web app, start a local web server in the project’s root directory. You can easily create one using $ python3 -m http.server 8080 or with http-server, a command-line tool for creating HTTP servers. After starting the server, go to the address it shows, e.g., http://127.0.0.1:8080/, to access the app. Note that the code assumes port 8080 (Python’s http.server defaults to 8000, hence the explicit argument).

Is the model capable of detecting Team Rocket?

Empirically speaking, yes! It’s capable of detecting Team Rocket better than Ash. But, like Ash, it also fails every now and then. Let’s see. Below are some detections from scenes of the clips presented above.

Figure 8. Detected Jessie
Figure 9. Detected James

Good job, neural network! Those are indeed Jessie and James. The following ones are from the second video.

Figure 10. Another detected Jessie
Figure 11. Another detected James

Again, success.

Figure 12. Jessie!
Figure 13. James!

These were some of the positive cases where the model worked as planned. But unfortunately, there were others where it failed. I noticed that the model didn’t perform well (at least with the same confidence threshold used in the others) in situations where Team Rocket looks too goofy, the images lack detail, or they are seen from afar. For example:

Figure 14. A non-detected Jessie
Figure 15. A non-detected James

The network has issues detecting them in images that do not display their most distinguishable feature: the hair color. For example,

Figure 16. That’s Jessie, not James.

In the image above, you barely see Jessie’s hair. Honestly, if I didn’t know the context, I wouldn’t be able to tell that’s her either. It’s similar in cases where they’ve dyed their hair:

Figure 17. That’s not an easy one, anyway.

The last of the issues I want to mention is that in none of my tests was the network able to detect both members in one picture. Instead, it detects both of them under one label. This inconvenience, alongside the false positives, is, in my opinion, the biggest flaw of the experiment. I hope to fix it after adding more training data.

Figure 18. Technically, it is somehow correct.

So, to summarize, is a neural network better than Ash at detecting Team Rocket? I say yes. Now comes a follow-up question. How is the network identifying them? What does it see before saying, “yes, this is Jessie”? In the second part of the experiment, I’m addressing this.

Interpreting the Activation Maps

In layman’s terms, a convolutional neural network (CNN) learns to discern a category by looking at its most prominent visual features. For example, a CNN might learn that a banana is a banana because of its banana-ish shape and yellow color. For the second part of the experiment, I wanted to study the visual features a CNN extracts from an image of Jessie and James. In other words, I’m interested to see the network’s activation maps, which are the outputs of its convolutional and max pooling layers.

To achieve this goal, I trained, in TensorFlow, a CNN image classifier that predicts whether an image is of Jessie or James. That way, I’ll know why the network believes it is seeing either Jessie or James. With the network trained, I expanded the previous TensorFlow.js app to load the model and plot on screen the activation maps of every layer and each filter. It looks like this:

Figure 19. The classification output and object detection part of the app
Figure 20. The activation map of filter 11 of the first layer

Training the model

The image classifier model is a pretty standard CNN. It has three convolutional layers, three max pooling layers, one dropout layer, a dense layer with the ReLU activation function, and, as the last one, another dense layer of two units with a softmax activation function. This last bit means that the output is a vector of length 2, where each value is the likelihood of the image being of James (label 0) or Jessie (label 1). Below is a diagram of the model, followed by its summary.

Figure 21. Diagram created with NN-SVG
Figure 22. The CNN summary
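
In Keras, the described architecture could be sketched as follows; the filter counts, kernel sizes, input shape, and dropout rate are assumptions of mine (Figure 22 holds the real values), and a Flatten layer is implied between the pooling and dense layers:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation='relu',
                           input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    # Two units and softmax: the likelihoods of James (0) and Jessie (1).
    tf.keras.layers.Dense(2, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```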

Unlike the previous model, for training this one, I have a richer dataset. Well, kind of: instead of using just the 158 images, I’m using TensorFlow’s image generator to augment the dataset by transforming the existing images. The transformations I’m applying rotate the image between [-45, 45] degrees, shift it horizontally and vertically, flip it horizontally, and zoom it. Moreover, 20% of the generated images are used as the validation set. To evaluate the model, I’m using TensorBoard. Below is the complete training script:

The first function, create_data_generators(), creates the generators (of course it does that, hahaha). train(), whose parameters are both generators, trains the model. Note that we are using the TensorBoard callback on model.fit() to log the model's training information. Once the training finishes, we save the model.

Regarding the model’s performance, after 25 epochs, it achieved a training accuracy and loss value of 0.8594 and 0.3367, respectively, and validation accuracy and loss value of 0.8065 and 0.2541. Below are two screenshots from TensorBoard presenting the accuracy and loss values.

Figure 23: The training and validation accuracy.
Figure 24: The training and validation loss.

To launch TensorBoard, execute in the terminal:

$ tensorboard --logdir /tmp/tensorboard

Make sure the given path is the same one used in the TensorBoard callback used in model.fit().

Converting the model to a TensorFlow.js model

At the end of the training, we save the model to disk. However, this format won’t work in TensorFlow.js. To use the model in our web app, we must first convert it to TensorFlow.js format; it’s easier than it sounds. To convert it, we need the TensorFlow.js Converter tool. After installing it, execute the following command to produce the TensorFlow.js version of the model:

$ tensorflowjs_converter --input_format=keras_saved_model PATH/TO/TS_MODEL OUTPUT/PATH

The TensorFlow.js activation web app

The activation map web app we will see here is an extension of the previous one. Now, besides detecting Team Rocket, it will classify the image between “Jessie” or “James” and present the layer’s activation maps.

You can find the app at https://juandes.github.io/team-rocket-activations-app/index.html.

Figure 20 is a screenshot of the app. There you can see the image input control, the object detection canvas, the prediction outcome, and one section per layer, where the user uses an input slider to select the filter they want to visualize. How did I build the app? Time for some code, starting with the HTML:

For this version of the app, I’m using the visualization library Plotly to visualize the activation maps. Also, there is a CSS file you can find in the repo. In the body, there are two <p>s that show the prediction label and output, and six <div>s (one per layer) with an input of type range used to select the filter we wish to plot; the max value is the number of filters the layer has. Each of these <div>s has another <div> where the script programmatically adds the Plotly graph.

That’s the HTML. As for the JS script, before jumping into the code, let me quickly explain how it works. The script uses three models: the object detector, the image classifier, and a new model. This new model outputs a tensor made of all the intermediate activation maps. We will create this model by explicitly setting its input to the input of the image classifier and as output, a list of the classifier’s output tensors (the convolutional and pooling layers). To trigger the predictions, the user first has to upload an image. After that, we will present the detected Rocket, the classification output, and the activation maps on the app. Now, let’s see the code part by part:

Above we have the models’ variables: the image classifier, the model we will use to get the activations, and the object detector. Following them is the variable that holds the image we want to classify, the canvas context, and a list of maps that describe some information about the layers.

The first two functions I want to present are setupODCanvas() (from before) and initSliders(), used for initializing the sliders' input. initSliders()'s second parameter is a callback that’s called when the user moves the slider:

Then comes drawActivation(), the one that draws the activation maps in the Plotly plot. Its parameters are all the "predicted" activations, the index of the layer whose filter we want to draw, the filter index, the id of the plot's <div>, and the size of the filter. Inside the function, we use tf.tidy(fn), a TensorFlow.js function that cleans up all the intermediate tensors allocated by fn. In tf.tidy(), we get the activation tensor and filter indicated by layerIndex and filterNumber. Then, we transform the activation map into a 2D array and plot it as a heatmap:

After it comes setupSliders(). This one iterates over layersInformation and calls initSliders() with its values. The second parameter of initSliders() is the callback that triggers when the person uses the slider to select one of the layer’s filters. In that event, we produce the activation maps using activationModel. After predicting, we call drawActivation():

Next is setupModels(), responsible for initializing the three models we’re using. The function starts by loading the classifier and object detection models. After that, it iterates over the first six layers of the CNN (the three convolutional and three max pooling layers) and adds their outputs to a list. We then create a new model, named activationModel, using TensorFlow’s functional approach, in which we have to specify the model’s inputs and outputs:

The following function is the same drawBoundingBoxes() we saw before.

Now comes the first function that predicts. This one, named predictWithObjectDetector(), uses the input image to detect Team Rocket just like we did before:

The next function, named predictWithClassifier(), gets the input image, converts it to a tensor, and predicts its label and activations. After predicting, the function draws the activation map of the first filter from every layer and adds the prediction outcome to the HTML. This function runs when a user uploads an image. Unlike the prediction we did in the callback from initSliders(), where we drew the activations of the filter selected by the user, here we visualize the first filter of each layer:

Then comes the function that glues everything together: processInput(). This function uses several events to load the image selected by the user, draw it on screen, and call the previous predict functions:

Last, there’s an init function which serves as the app’s starting point.

That’s it for the code. To run the app, follow the approach previously discussed.

How Is the Model Identifying Team Rocket?

After that nice TensorFlow.js and JS lesson, it’s time to answer our second question: How is the model identifying Team Rocket? What does it see at prediction time? To answer that, I’ll present several activation maps, starting with the following image:

Figure 25. More Jessie and a bit of Meowth © 2020 Pokémon

The CNN says this is Jessie — 0.99 Jessie and 0.01 James, according to the softmax output. But why is it her? Well, see for yourself.

Figure 26. Activation maps

Figure 26 presents the activation maps of some filters from the six layers. The first two images come from the first convolutional and max pooling layers. The two that follow are from the second set of convolutional and max pooling layers, and the last two, from the third set (since each consecutive layer is smaller than the previous one, the resolution of the images decreases). On these heatmaps, the color’s intensity represents the regions the CNN uses to identify the object. And that’s why it is not surprising that Jessie’s hair, arguably her most iconic feature, stands out.

These images, though, do not tell the complete story of the network. They are just a small sample from the 100+ filters the model has. While most of the maps focus on the hair, others look at other parts:

Figure 27. Meowth is indeed a nice feature

For the next example, consider this image of James:

Figure 28. James and his awesome fake beard © 2020 Pokémon

The CNN correctly predicted “James” (Jessie 0.20, James 0.80). Like before, the network is also mostly focusing on the hair and, in some cases, on that majestic fake beard.

Figure 29. James’ activation map.

Nothing is perfect, and that includes my network. During my tests, I found some cases where the CNN prediction was wrong. One example is the same picture of Jessie that the object detector couldn’t detect — we could say that’s because the hair is hardly present.

Figure 30. Wrong again

Here are the activation maps. There’s no distinguishable feature of Jessie being identified.

Figure 31. No hair found.

With that, we conclude the experiment! To see more examples, I invite you to check out the interactive demo presented above or to clone the repo and run it yourself.

Conclusion and Recap

Life is full of questions. Who am I? Which came first, the chicken or the egg? Is a neural network better than Ash at detecting Team Rocket? In this experiment, I attempted to answer the last one. The results were satisfactory, and I’m willing to answer the question with a “yes.” For the project, I trained two networks, an object detector and an image classifier, to detect Team Rocket in an image and to learn about the features a CNN sees as relevant. To use the models, we created a web app with TensorFlow.js that loads them and presents the detected Rocket and the activation maps.

To my dear friend Ash Ketchum, these are the tips I have for you. As we learned from the object detector model, if you are ever in a situation where you need to know whether the group in front of you is Team Rocket, try to analyze one person at a time; don’t look at them as a group. Second, try to get a bit closer to them; it helps a lot. Lastly, and this is not a huge breakthrough, focus on the hair. If it is bright red and long, chances are that person is Jessie. Or, if short and light blue, it is probably James.

For a future iteration of the project, I’d like to correct the object detector to detect the two villains in the same frame. Similarly, I want to train a multi-label CNN capable of identifying both Jessie and James in one image.

And that’s it! You can find the complete source code at https://github.com/juandes/team-rocket-activations-app and a running version of the app at https://juandes.github.io/team-rocket-activations-app/index.html. You can find images suitable for testing the models in the data/test/ directory.

Thanks for reading.

If you have any questions, comments, or existential crisis, leave a comment here.

Team Rocket’s blasting off agaaaaaain!

To err on the side of caution: Pokémon and all respective names are trademarks and © of Nintendo 1996–2020.

Translated from: https://medium.com/better-programming/is-a-neural-network-better-than-ash-at-detecting-team-rocket-if-so-how-c00108c64174
