How to Build a Weapon Detection System Using Keras and OpenCV

I recently completed a project I am very proud of and figured I should share it in case anyone else is interested in adapting something similar to their specific needs. Before I get started with the tutorial, I want to give a HEFTY thanks to Adrian Rosebrock, PhD, creator of PyImageSearch. I am a self-taught programmer, so without his resources, much of this project would not be possible. He is the epitome of a mensch; I could not be more appreciative of the resources he puts on his website. If you want to learn advanced deep learning techniques but find textbooks and research papers dull, I highly recommend visiting his website linked above.

In most projects related to weapon classification, I was only able to find a dataset of 100–200 images at most. This posed an issue because, from my experience, it is hard to get a working model with so few images. To gather images, I rigged my Raspberry Pi to scrape IMFDB.com, a website where gun enthusiasts post pictures of model guns featured in frames or clips from movies. If you visit the website, this will be clearer. To access the images that I used, you can visit my Google Drive. In this zip file, you will find all the images that were used in this project and the corresponding .xml files for the bounding boxes. If you are in need of bounding boxes for a large dataset, I highly recommend ScaleOps.AI, a company that specializes in data labeling for machine learning algorithms. Currently, I have 120,000 images from the IMFDB website, but for this project, I only used ~5000 due to time and money constraints.

Now, let’s get to the logic. The architecture of this project follows the logic shown on this website. Although we implement that logic here, it differs in many areas so that it can be useful for our specific problem of detecting weapons. The project uses 6 basic steps:

  1. Build a dataset using OpenCV selective search segmentation (a minimal sketch of this step follows the list)
  2. Build a CNN for classifying the objects you wish to detect (in our case, 0 = No Weapon, 1 = Handgun, and 2 = Rifle)
  3. Train the model on the images built from the selective search segmentation
  4. When creating a bounding box for a new image, run the image through the selective search segmentation, then grab every piece of the picture
  5. Run each piece of the image through the algorithm, and whenever the algorithm predicts the object you are looking for, mark the location with a bounding box
  6. If multiple bounding boxes are marked, apply non-maxima suppression to keep only the box with the highest confidence/region of interest (this part I am still figuring out… you will see my issue below)
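
To make step 1 concrete, here is a minimal sketch of OpenCV's selective search segmentation. It needs the opencv-contrib-python package; the input file name and the 2,000-proposal cap are placeholders of mine, not values from the original code.

```python
# Minimal selective-search sketch (requires opencv-contrib-python).
import cv2

image = cv2.imread("example.jpg")  # hypothetical input image

ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(image)
ss.switchToSelectiveSearchFast()  # faster, lower recall; Quality() is slower
rects = ss.process()              # array of (x, y, w, h) region proposals

# Crop each proposal so it can later be fed to the classifier.
proposals = [image[y:y + h, x:x + w] for (x, y, w, h) in rects[:2000]]
```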

Below is a gif showing how the algorithm works. For a given image, each square will be fed into the neural network. If a square is predicted as positive (handgun or rifle), we mark that area on the original image.

[Image] Sliding Window Approach: Object Detection (Image by Author)

If you want to see the entire code for the project, visit my GitHub Repo where I explain the steps in greater depth.

The data I linked above contains a lot of folders that I need to explain in order to understand what's going on. After unzipping the folder, these are the files and folders that are important for the project: AR, FinalImages, Labels, Pistol, Stock_AR, Stock_Pistol, and PATHS.csv. Inside the folders, you will find the corresponding images pertaining to the folder name. So in the AR folder, you will find images of assault rifles. Inside the Labels folder, you will see the .xml labels for all the images inside the class folders. Lastly, PATHS.csv points to every single image that will be used in the algorithm. For the purpose of this tutorial, these are the only folders/files you need to worry about:

  1. FinalImages/NoWeapon
  2. FinalImages/Pistol
  3. FinalImages/Rifle

The images within these folders were made in the following way.

  • For every image with a bounding box, extract the bounding box and put its contents into the corresponding class folder. So for an image where a person is holding a pistol, the region inside the bounding box around the pistol becomes the positive example, while everything outside the bounding box becomes the negative (no weapon). A rough sketch of this extraction follows the list.
  • In the image below, imagine a bounding box around the image on the left. After extracting the pixels inside the bounding box (image on the right), we place that image into another folder (FinalImages/Pistol), while we place everything around the bounding box in the NoWeapons folder.
  • Although the image on the right looks like a resized version of the one on the left, it is really a segmented image. Picture a bounding box around the gun on the left. The image on the right is just the bounding box and nothing else (everything outside the box is removed). This cropped area is called the region of interest (ROI).
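
As a rough illustration of the extraction described above, here is a hedged sketch that assumes the .xml labels follow the common PASCAL VOC layout; the file paths and the helper name are hypothetical.

```python
# Hypothetical ROI extraction: crop the labeled box into the positive folder.
import xml.etree.ElementTree as ET
import cv2

def extract_roi(image_path, xml_path):
    image = cv2.imread(image_path)
    box = ET.parse(xml_path).getroot().find("object/bndbox")  # first labeled object
    xmin, ymin = int(box.find("xmin").text), int(box.find("ymin").text)
    xmax, ymax = int(box.find("xmax").text), int(box.find("ymax").text)
    return image[ymin:ymax, xmin:xmax]  # pixels inside the box (the weapon)

roi = extract_roi("Pistol/img001.jpg", "Labels/img001.xml")  # hypothetical names
cv2.imwrite("FinalImages/Pistol/img001.jpg", roi)
```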

[Image] ROI Extraction (Image by Author)

After gathering the dataset (which can be found inside Separated/FinalImages), we need to prepare these files for our algorithm: a list of RGB arrays and the corresponding labels (0 = No Weapon, 1 = Pistol, 2 = Rifle).

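The original preparation code is embedded in the post as a gist; a minimal sketch of the idea, assuming the folder layout described above, might look like this:

```python
# Build X (RGB arrays) and y (labels) from the class folders.
import os
import cv2
import numpy as np
from tqdm import tqdm

DIM = 150
CLASSES = ["NoWeapon", "Pistol", "Rifle"]  # labels 0, 1, 2
BASE = "Separated/FinalImages"             # adjust to where the unzipped folder lives

X, y = [], []
for label, name in enumerate(CLASSES):
    folder = os.path.join(BASE, name)
    for fname in tqdm(os.listdir(folder), desc=name):
        img = cv2.imread(os.path.join(folder, fname))
        if img is None:
            continue                       # skip unreadable files
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        X.append(cv2.resize(img, (DIM, DIM)))
        y.append(label)

X = np.array(X, dtype="float32") / 255.0   # normalize pixel values
y = np.array(y)
```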

If you run the code above with the Separated folder outside of the current directory, you will see a tqdm progress bar showing the images being loaded. After the process is finished, you should see this:

[Image] (Image by Author)

Now it's time for the neural network. In the code below, the function will return a model given a dimension size. As you may have noticed in the code above, the photos were resized to (150, 150, 3). If you wish to use different dimensions, just make sure you change the variable DIM above, as well as the dim argument in the function below.

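The exact layer stack is shown in the architecture figure below; as a hedged stand-in, a builder function keyed to a dimension size could look like this:

```python
# A plausible CNN builder; the real layer stack may differ (see the figure below).
from tensorflow.keras import layers, models

def build_model(dim=150):
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(dim, dim, 3)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(3, activation="softmax"),  # No Weapon / Pistol / Rifle
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model(dim=DIM)
```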

The model returned above will have the architecture shown below:

[Image] CNN Architecture (Image by Author)

Once we have our train and test sets, all we need to do is fit the model to them. Running the code below will start the training process.

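A minimal training sketch consistent with the description that follows; the batch size, patience, and 80/20 split are assumptions on my part.

```python
# Train with EarlyStopping and checkpoint the best weights to ModelWeights.h5.
from sklearn.model_selection import train_test_split
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

callbacks = [
    EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True),
    ModelCheckpoint("ModelWeights.h5", monitor="val_loss",
                    save_best_only=True, save_weights_only=True),
]

history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=1000, batch_size=32,
                    callbacks=callbacks)
```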

If you run the code without any errors, you should see a window like this:

[Image]

I want to note that I have the epochs set to 1000, but EarlyStopping will prevent the algorithm from overfitting, so it should not run for longer than 30–50 epochs. After training finishes, you should see a .h5 file in your directory called ModelWeights.h5. This file holds the weights the model produced, so loading them into a model restores it to the state before it started to overfit.

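Reloading those checkpointed weights is short (reusing the build_model helper sketched above):

```python
# Restore the model from before it started to overfit.
model = build_model(dim=DIM)
model.load_weights("ModelWeights.h5")
```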

[Image] Accuracy of the Model (Image by Author)
[Image] ROC for Each Class (Image by Author)

The accuracy was pretty good considering the balanced data set. Looking at the ROC curves, we can also assume pretty good classification, given that the area under the curve for each class is very close to 1.

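For reference, a per-class ROC/AUC computation in the spirit of the figure above could be sketched like this (one-vs-rest over the three classes; variable names carried over from the earlier sketches):

```python
# Per-class ROC AUC, treating each class one-vs-rest.
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

y_score = model.predict(X_test)            # (n_samples, 3) probabilities
y_onehot = label_binarize(y_test, classes=[0, 1, 2])

for i, name in enumerate(CLASSES):
    fpr, tpr, _ = roc_curve(y_onehot[:, i], y_score[:, i])
    print(f"{name}: AUC = {auc(fpr, tpr):.3f}")
```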

Now it's time for object detection! The following logic is used to create the bounding boxes:

  1. Input an image or frame from a video and retrieve a base prediction
  2. Apply selective search segmentation to create hundreds or thousands of bounding box proposals
  3. Run each bounding box through the trained algorithm and retrieve the locations where the prediction is the same as the base prediction (from step 1)
  4. After retrieving the locations where the algorithm predicted the same class as the base prediction, mark a bounding box on the location that was run through the algorithm
  5. If multiple bounding boxes are chosen, apply non-maxima suppression to suppress all but one box, leaving the box with the highest probability and best region of interest (ROI); a sketch of this step follows the list
  • Note: Non-maxima suppression is still a work in progress. In some instances, it can only detect features of the gun rather than the entire gun itself (see the model comparisons below).
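
Here is a sketch of the non-maxima suppression step, adapted from the well-known Malisiewicz/PyImageSearch formulation; treat it as one reasonable implementation rather than the exact code used in the project.

```python
# Greedy non-maxima suppression: keep the highest-confidence box, drop overlaps.
import numpy as np

def non_max_suppression(boxes, probs, overlap_thresh=0.3):
    """boxes: (N, 4) array of (x1, y1, x2, y2); probs: (N,) confidences."""
    if len(boxes) == 0:
        return []
    boxes = boxes.astype("float")
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(probs)               # highest confidence last
    picked = []
    while len(idxs) > 0:
        last = idxs[-1]
        picked.append(last)
        # Overlap of the remaining boxes with the picked one.
        xx1 = np.maximum(x1[last], x1[idxs[:-1]])
        yy1 = np.maximum(y1[last], y1[idxs[:-1]])
        xx2 = np.minimum(x2[last], x2[idxs[:-1]])
        yy2 = np.minimum(y2[last], y2[idxs[:-1]])
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)
        overlap = (w * h) / area[idxs[:-1]]
        idxs = np.delete(idxs, np.concatenate(
            ([len(idxs) - 1], np.where(overlap > overlap_thresh)[0])))
    return boxes[picked].astype("int")
```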

Before you run the code above, create a folder called Tests, download any images from the internet, and name each image after the class you want it to predict. Running the code above will search through every image inside the Tests folder and run each one through our object detection algorithm using the CNN we built above.

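As a hedged sketch, the detection loop over the Tests folder might tie together the selective-search proposals, the trained CNN, and the NMS helper above like this. Note that the original logic compares each region against a base prediction on the whole frame; this sketch simplifies that to keeping any weapon-class hit.

```python
# Run every image in Tests/ through proposal generation, classification, and NMS.
import os
import cv2
import numpy as np

for fname in os.listdir("Tests"):
    image = cv2.imread(os.path.join("Tests", fname))
    ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
    ss.setBaseImage(image)
    ss.switchToSelectiveSearchFast()
    rects = ss.process()[:2000]            # cap the proposals for speed

    boxes, probs = [], []
    for (x, y, w, h) in rects:
        roi = cv2.resize(image[y:y + h, x:x + w], (DIM, DIM))
        roi = cv2.cvtColor(roi, cv2.COLOR_BGR2RGB).astype("float32") / 255.0
        pred = model.predict(roi[np.newaxis], verbose=0)[0]
        label = int(np.argmax(pred))
        if label != 0:                     # 0 = No Weapon; keep weapon hits
            boxes.append((x, y, x + w, y + h))
            probs.append(float(pred[label]))

    for (x1, y1, x2, y2) in non_max_suppression(np.array(boxes), np.array(probs)):
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imwrite(os.path.join("Tests", "pred_" + fname), image)
```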

The images I tested on were the following:

[Image] Base Images

After running the code above, these are the predictions the algorithm gave as output.

[Image] (Click here for larger image)

As you can see above, non-maxima suppression is not perfect, but it does work in some sense. The issue I have here is that there are multiple bounding boxes with 100% confidence, so it is hard to pick which one is the best. Also, the algorithm is unable to predict no-weapon when there is no weapon in the frame (the sheep image).

Using the logic implemented above, here is a cool visual where I apply the code to a video.

Demo Weapon Detection (Video by Author)

Based on the examples above, we see that the algorithm is faaaar from perfect. This is okay because we still created a pretty cool model that only used 5000 images. Like I said earlier, I have a total of 120,000 images that I scraped from IMFDB.com, so this can only get better with more images we pass in during training.

LIME: Feature Extraction

One of the difficult parts of building and testing a neural network is that the way it works is basically a black box, meaning that you don't understand why the weights are what they are, or what within the image the algorithm is using to make its predictions. Using LIME, we can better understand how our algorithm is performing and what within the picture is important for predictions.

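The LIME step was embedded as a gist in the original post; a minimal sketch using the lime package's image explainer (reusing the model and test set from the earlier sketches) looks like this:

```python
# Explain one prediction with LIME and overlay the important superpixels.
from lime import lime_image
from skimage.segmentation import mark_boundaries
import matplotlib.pyplot as plt

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    X_test[0].astype("double"),            # one test image
    model.predict,                         # returns class probabilities
    top_labels=3, hide_color=0, num_samples=1000)

temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=False,
    num_features=10, hide_rest=False)
plt.imshow(mark_boundaries(temp, mask))    # green supports, red opposes
plt.show()
```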

Running the code above will create an image that looks like this:

[Image] LIME (Image by Author)

The areas that are green are those that the algorithm deems “important”, while the opposite is true for the areas that are red. What we are seeing above is good, considering we want the algorithm to detect features of the gun and not the hands or other portions of the image.

Now that we can say we created our very own sentient being… it is time to get real for a second. The model we made is nothing compared to the tools that are already out there. This leads me to transfer learning, where we see some really cool results. For the sake of this tutorial, I will not post the code here, but you can find it on my GitHub Repo.

Mobilenet

  • In the example below, MobileNet was better at predicting objects that were not weapons and placed bounding boxes around the correct areas.

[Image] (Click here for larger image)
[Image] Mobilenet LIME (Image by Author)

VGG16

  • In the example below, VGG16, like the architecture we built ourselves, was unable to distinguish non-weapons. It incorrectly classified 1 out of 3 handgun images, while correctly classifying the rest as handguns.
  • Although it incorrectly classified a handgun as no weapon (4th image to the right), the bounding box was not on the gun at all; it stayed on the hand holding the gun.

[Image] (Click here for larger image)
[Image] VGG16: LIME (Image by Author)

Conclusion

  • The goal of this project was to create an algorithm that can integrate itself into traditional surveillance systems and prevent a bad situation faster than a person would (considering the unfortunate circumstances in today’s society).
  • Although this was cool, the hardware in my computer is not there yet. Segmenting an image and processing each portion takes about 10–45 seconds, which is too slow for live video.
  • The video demonstration I showed above was a 30-second clip, and it took about 20 minutes to process.
  • However, although live video is not feasible with an RX 580, using one of the new Nvidia GPUs (3000 series) might give better results.
  • Also, this technique can be used for retroactive examination of an event, such as body cam footage or protests.

**NOTE**: If you want to follow along with the full project, visit my GitHub.

Translated from: https://towardsdatascience.com/how-to-build-a-weapon-detection-system-using-keras-and-opencv-67b19234e3dd
