Tracking the Millennium Falcon with TensorFlow

by Nick Bourdakos

At the time of writing this post, most of the big tech companies (such as IBM, Google, Microsoft, and Amazon) have easy-to-use visual recognition APIs. Some smaller companies also provide similar offerings, such as Clarifai. But none of them offer object detection.

Update: IBM and Microsoft now have customizable object detection APIs.

The following images were both tagged using the same Watson Visual Recognition default classifier. The first one, though, has been run through an object detection model first.

Object detection can be far superior to visual recognition on its own. But if you want object detection, you’re going to have to get your hands a little dirty.

Depending on your use case, you may not need a custom object detection model. TensorFlow’s object detection API provides a few models of varying speed and accuracy that are based on the COCO dataset.

For your convenience, I have put together a complete list of objects that are detectable with the COCO models:

If you wanted to detect logos or something not on this list, you’d have to build your own custom object detector. I wanted to be able to detect the Millennium Falcon and some Tie Fighters. This is obviously an extremely important use case, because you never know…

Annotate your images

Training your own model is a lot of work. At this point, you may be thinking, “Whoa, whoa, whoa! I don’t want to do a lot of work!” If so, you can check out my other article about using the provided model. It’s a much smoother ride.

You need to collect a lot of images and annotate them all. Annotations include specifying the object coordinates and a corresponding label. For an image with two Tie Fighters, an annotation might look something like this:

<annotation>
    <folder>images</folder>
    <filename>image1.jpg</filename>
    <size>
        <width>1000</width>
        <height>563</height>
    </size>
    <segmented>0</segmented>
    <object>
        <name>Tie Fighter</name>
        <bndbox>
            <xmin>112</xmin>
            <ymin>281</ymin>
            <xmax>122</xmax>
            <ymax>291</ymax>
        </bndbox>
    </object>
    <object>
        <name>Tie Fighter</name>
        <bndbox>
            <xmin>87</xmin>
            <ymin>260</ymin>
            <xmax>95</xmax>
            <ymax>268</ymax>
        </bndbox>
    </object>
</annotation>

For my Star Wars model, I collected 308 images including two or three objects in each. I’d recommend trying to find 200–300 examples of each object.

“Wow,” you might be thinking, “I have to go through hundreds of images and write a bunch of XML for each one?”

Of course not! There are plenty of annotation tools out there, such as labelImg and RectLabel. I used RectLabel, but it’s only for macOS. It’s still a lot of work, trust me. It took me about three or four hours of nonstop work to annotate my entire dataset.

Update: I ended up building my own tool to annotate images and video frames. It’s a free online tool called Cloud Annotations that you can check out here.

If you have the money, you can pay somebody else, like an intern, to do it. Or you can use something like Mechanical Turk. If you are a broke college student like me and/or find doing hours of monotonous work fun, you’re on your own.

We will need to do a little setup before we can run the script to prepare the data for TensorFlow.

Clone the repo

Start by cloning my repo here.

Update: This repo is a bit out of date; I recommend checking out this one for a much better time :)

Note: The following instructions are also out of date; I urge you to check out the new walkthrough.

The directory structure will need to look like this:

models
|-- annotations
|   |-- label_map.pbtxt
|   |-- trainval.txt
|   `-- xmls
|       |-- 1.xml
|       |-- 2.xml
|       |-- 3.xml
|       `-- ...
|-- images
|   |-- 1.jpg
|   |-- 2.jpg
|   |-- 3.jpg
|   `-- ...
|-- object_detection
|   `-- ...
`-- ...

I’ve included my training data, so you should be able to run this out of the box. But if you want to create a model with your own data, you’ll need to add your training images to images, add your XML annotations to annotations/xmls, and update trainval.txt and label_map.pbtxt.

trainval.txt is a list of file names that allows us to find and correlate the JPG and XML files. The following trainval.txt list would let us find abc.jpg, abc.xml, 123.jpg, 123.xml, xyz.jpg, and xyz.xml:

abc
123
xyz

Note: Make sure your JPG and XML file names match, minus the extension.

label_map.pbtxt is our list of objects that we are trying to detect. It should look something like this:

item {
  id: 1
  name: 'Millennium Falcon'
}

item {
  id: 2
  name: 'Tie Fighter'
}

Running the script

First, with Python and pip installed, install the script’s requirements:

pip install -r requirements.txt

Add models and models/slim to your PYTHONPATH:

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Important Note: This must be run every time you open the terminal, or added to your ~/.bashrc file.

Run the script:

python object_detection/create_tf_record.py

Once the script finishes running, you will end up with a train.record and a val.record file. This is what we will use to train the model.
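
If you want to sanity-check the records before training, a few lines of Python will do it. This is a quick sketch using the TensorFlow 1.x API (which this walkthrough is built on), not something that ships with the repo:

import tensorflow as tf

# Count the serialized examples in each TFRecord file (TensorFlow 1.x API).
for path in ['train.record', 'val.record']:
    count = sum(1 for _ in tf.python_io.tf_record_iterator(path))
    print('{}: {} examples'.format(path, count))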

Downloading a base model

Training an object detector from scratch can take days, even when using multiple GPUs. To speed up training, we’ll take an object detector trained on a different dataset and reuse some of its parameters to initialize our new model.

You can download a model from this model zoo. Each model varies in accuracy and speed. I used faster_rcnn_resnet101_coco.

Extract and move all the model.ckpt files to our repo’s root directory.

You should see a file named faster_rcnn_resnet101.config. It’s set to work with the faster_rcnn_resnet101_coco model. If you used another model, you can find a corresponding config file here.
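
Whichever model you use, there are a few fields worth double-checking in the config: the number of classes, the checkpoint training starts from, and the paths to your records and label map. Here’s a trimmed sketch of the relevant parts of the pipeline format (the "..." stands for the many fields I’m leaving out; paths are relative to the repo root):

model {
  faster_rcnn {
    num_classes: 2  # Millennium Falcon and Tie Fighter
    ...
  }
}

train_config {
  fine_tune_checkpoint: "model.ckpt"
  ...
}

train_input_reader {
  label_map_path: "annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "train.record"
  }
}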

Ready to train

Run the following script, and it should start to train!

python object_detection/train.py \
        --logtostderr \
        --train_dir=train \
        --pipeline_config_path=faster_rcnn_resnet101.config

Note: Replace pipeline_config_path with the location of your config file.

global step 1:
global step 2:
global step 3:
global step 4:
...

Yay! It’s working!

10 minutes later.

global step 41:
global step 42:
global step 43:
global step 44:
...

Computer starts smoking.

global step 71:
global step 72:
global step 73:
global step 74:
...

How long is this thing supposed to run?

The model that I used in the video ran for about 22,000 steps.

Wait, what?!

I use a specced-out MacBook Pro. If you’re running this on something similar, I’ll assume you’re getting about one step every 15 seconds or so. At that rate, it would take about three to four days of nonstop running to get a decent model (22,000 steps at 15 seconds each is roughly 92 hours).

Well, this is dumb. I don’t have time for this!

PowerAI to the rescue!

PowerAI

PowerAI lets us train our model on IBM Power Systems with P100 GPUs fast!

It only took about an hour to train for 10,000 steps. However, this was just with one GPU. The real power in PowerAI comes from the ability to do distributed deep learning across hundreds of GPUs with up to 95% efficiency.

With the help of PowerAI, IBM just set a new image recognition record of 33.8% accuracy in 7 hours. It surpassed the previous industry record set by Microsoft — 29.9% accuracy in 10 days.

WAYYY fast!

Since I’m not training on millions of images, I definitely didn’t need those kinds of resources. One GPU will do.

Creating a Nimbix account

Nimbix provides developers a trial account with ten hours of free processing time on the PowerAI platform. You can register here.

Note: This process is not automated, so it may take up to 24 hours to be reviewed and approved.

Once approved, you should receive an email with instructions on confirming and creating your account. It will ask you for a promotional code, but leave it blank.

You should now be able to log in here.

Deploy the PowerAI Notebooks application

Start by searching for PowerAI Notebooks.

Click on it, and then choose TensorFlow.

Choose the machine type of 32 thread POWER8, 128GB RAM, 1x P100 GPU w/NVLink (np8g1).

Once started, the following dashboard panel will be displayed. When the server Status turns to Processing, the server is ready to be accessed.

Get the password by clicking on (click to show).

Then, click Click here to connect to launch the Notebook.

Log in using the username nimbix and the previously supplied password.

Start training

Get a new terminal window by clicking on the New pull-down and selecting Terminal.

You should be greeted with a familiar face:

Note: Terminal may not work in Safari.

The steps for training are the same as they were when we ran this locally. If you’re using my training data, you can just clone my repo by running the following (if not, just clone your own repo):

git clone https://github.com/bourdakos1/Custom-Object-Detection.git

Then cd into the root directory:

cd Custom-Object-Detection

Run this snippet to download the same pre-trained faster_rcnn_resnet101_coco model we used earlier:

wget http://storage.googleapis.com/download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_11_06_2017.tar.gz
tar -xvf faster_rcnn_resnet101_coco_11_06_2017.tar.gz
mv faster_rcnn_resnet101_coco_11_06_2017/model.ckpt.* .

Then we need to update our PYTHONPATH again, because this is a new terminal:

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Then we can finally run the training command again:

python object_detection/train.py \
        --logtostderr \
        --train_dir=train \
        --pipeline_config_path=faster_rcnn_resnet101.config

Downloading your model

When is my model ready? It depends on your training data. The more data, the more steps you’ll need. My model was pretty solid at nearly 4,500 steps. Then, at about 20,000 steps, it peaked. I even went on and trained it for 200,000 steps, but it didn’t get any better.

I recommend downloading your model every 5,000 steps or so and evaluating it to make sure you’re on the right path.
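
One way to evaluate is with the eval script that ships with the object detection API. Assuming the same directory layout as the training command above, something along these lines should work (treat the flag values as a sketch, not gospel):

python object_detection/eval.py \
        --logtostderr \
        --checkpoint_dir=train \
        --eval_dir=eval \
        --pipeline_config_path=faster_rcnn_resnet101.config

You can then point TensorBoard at the eval directory to watch how the metrics move as training progresses.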

Click on the Jupyter logo in the top left corner. Then, navigate the file tree to Custom-Object-Detection/train.

Download all the model.ckpt files with the highest number.

  • model.ckpt-STEP_NUMBER.data-00000-of-00001
  • model.ckpt-STEP_NUMBER.index
  • model.ckpt-STEP_NUMBER.meta

Note: You can only download one at a time.

Note: Be sure to click the red power button on your machine when finished. Otherwise, the clock will keep on ticking indefinitely.

Export the inference graph

To use the model in our code, we need to convert the checkpoint files (model.ckpt-STEP_NUMBER.*) into a frozen inference graph.

Move the checkpoint files you just downloaded into the root folder of the repo you’ve been using.

Then run this command:

python object_detection/export_inference_graph.py \
        --input_type image_tensor \
        --pipeline_config_path faster_rcnn_resnet101.config \
        --trained_checkpoint_prefix model.ckpt-STEP_NUMBER \
        --output_directory output_inference_graph

Remember to run export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim first.

You should see a new output_inference_graph directory with a frozen_inference_graph.pb file. This is the file we need.

Test the model

Now, run the following command:

python object_detection/object_detection_runner.py

It will run your object detection model found at output_inference_graph/frozen_inference_graph.pb on all the images in the test_images directory and output the results in the output/test_images directory.
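
If you’d rather call the model from your own code instead of the runner script, here’s a minimal sketch of loading the frozen graph with TensorFlow 1.x. The tensor names are the standard outputs produced by export_inference_graph.py, and image.jpg is a stand-in for your own test image:

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the frozen graph into a fresh TensorFlow graph.
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('output_inference_graph/frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

# The model expects a batch of RGB images as uint8.
image = np.expand_dims(np.array(Image.open('image.jpg').convert('RGB')), axis=0)

with tf.Session(graph=graph) as sess:
    boxes, scores, classes = sess.run(
        ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0'],
        feed_dict={'image_tensor:0': image})

# Boxes are [ymin, xmin, ymax, xmax], normalized to [0, 1]; class IDs map
# back to the entries in label_map.pbtxt.
for box, score, cls in zip(boxes[0], scores[0], classes[0]):
    if score > 0.5:
        print('class {} at {} ({:.0%})'.format(int(cls), box, score))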

The results

Here’s what we get when we run our model over all the frames in this clip from Star Wars: The Force Awakens.

Thanks for reading! If you have any questions, feel free to reach out at bourdakos1@gmail.com, connect with me on LinkedIn, or follow me on Medium and Twitter.

If you found this article helpful, it would mean a lot if you gave it some applause and shared it to help others find it! And feel free to leave a comment below.

Originally published at https://www.freecodecamp.org/news/tracking-the-millenium-falcon-with-tensorflow-c8c86419225e/
