3 Frameworks for Machine Learning on the Raspberry Pi

The revolution of AI is reaching new heights through new mediums. We’re all enjoying new tools on the edge, but what are they? What products and frameworks will fuel the inventions of tomorrow?

If you’re unfamiliar with why Machine Learning is changing our lives, have a read here.

If you’re already excited about Machine Learning and you’re interested in utilizing it on devices like the Raspberry Pi, enjoy!

Simple object detection on the Raspberry Pi

I’ve implemented three different tools for detection on the Pi camera. While it’s a modern miracle that all three work, it’s important for creators to know how well each performs, because #perfmatters.

Our three contenders are as follows:

  1. Vanilla Raspberry Pi 3 B+ — No optimizations; just a TensorFlow framework on the device for simple recognition.

  2. Intel’s Neural Compute Stick 2 — Intel’s latest USB interface device for neural networks, boasting 8x the performance of the first stick! Around $80 USD.

  3. Xnor.ai — A proprietary framework that reconfigures your model to run efficiently on smaller hardware. Xnor’s binary logic shrinks 32-bit floats to 1-bit operations, allowing you to optimize deep learning models for simple devices.

Let’s evaluate all three with simple object detection on a camera!

Vanilla Raspberry Pi 3 B+

A Raspberry Pi is like a small, wimpy Linux machine for $40. It makes running high-level applications and code on IoT-class devices easy. Though it sounds like you could run laptop-grade machine learning on the device as-is, there’s one big gotcha. The RPi has an ARM processor, and that means we’ll need to recompile our framework, i.e. TensorFlow, to get everything running.

⚠️ While this is not hard, this is SLOW. Expect this to take a very… very… long time. This is pretty much the fate of anything compiled on the Raspberry Pi.
Setup

Here are all the steps I did, including setting up the Pi camera for object detection. I'm simply including this for posterity. Feel free to skip reading it.

Install the Pi, then the camera, then edit /boot/config.txt: add disable_camera_led=1 to the bottom of the file and reboot.

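That edit can be done in one line (appending to /boot/config.txt needs root, hence tee):

```shell
# Disable the camera's red recording LED, then reboot to apply
echo 'disable_camera_led=1' | sudo tee -a /boot/config.txt
sudo reboot
```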

Best to disable screensaver mode, as some follow-up commands may take hours

sudo apt-get install xscreensaver
xscreensaver

Then disable screen saver in the “Display Mode” tab.

Now get TensorFlow installed

sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get update
sudo apt-get install libatlas-base-dev
sudo apt-get install libjasper-dev libqtgui4 python3-pyqt5
pip3 install tensorflow
sudo apt-get install libjpeg-dev zlib1g-dev libxml2-dev libxslt1-dev
pip3 install pillow jupyter matplotlib cython
pip3 install lxml # this one takes a long time
pip3 install python-tk

OpenCV

sudo apt-get install libtiff5-dev libjasper-dev libpng12-dev
sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
sudo apt-get install libxvidcore-dev libx264-dev
sudo apt-get install qt4-dev-tools
pip3 install opencv-python

Install Protobuf

sudo apt-get install autoconf automake libtool curl

Then pull down protobuf and untar it: https://github.com/protocolbuffers/protobuf/releases

Then cd in and run the following command, which might make the computer unusable for the next 2+ hours. Use ctrl + alt + F1 to switch to a terminal-only session and free all the RAM the UI is using. Kill the X process with ctrl + c if needed. You can then run the long-running command. The default username is “pi” and the password is “raspberry”.

make && make check

You can then install simply with:

sudo make install
cd python
export LD_LIBRARY_PATH=../src/.libs
python3 setup.py build --cpp_implementation
python3 setup.py test --cpp_implementation
sudo python3 setup.py install --cpp_implementation
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION=3
sudo ldconfig

Once this is done, you can clean up some install crud with sudo apt-get autoremove, delete the tar.gz download, and finally reboot with sudo reboot now, which will return you to a windowed interface.

Set up TensorFlow

mkdir tensorflow1 && cd tensorflow1
git clone --recurse-submodules https://github.com/tensorflow/models.git

Then modify ~/.bashrc to contain a new env var named PYTHONPATH as such:

export PYTHONPATH=$PYTHONPATH:/home/pi/tensorflow1/models/research:/home/pi/tensorflow1/models/research/slim

Now go to the model zoo: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md We’ll take the ssdlite_mobilenet, which is the fastest! wget the file, then tar -xzvf the tar.gz result, and delete the archive once untarred. Do this in the object_detection folder in your local tensorflow1 folder. Now cd up to the research dir. Then run:

protoc object_detection/protos/*.proto --python_out=.

This converts the object detection .proto files to Python files in the protos folder.

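The zoo download described above boils down to a few commands. The model directory name below is one example release from the zoo listing (an assumption; substitute whichever model you picked):

```shell
# Run inside ~/tensorflow1/models/research/object_detection
MODEL=ssdlite_mobilenet_v2_coco_2018_05_09  # example zoo model name
wget http://download.tensorflow.org/models/object_detection/${MODEL}.tar.gz
tar -xzvf ${MODEL}.tar.gz
rm ${MODEL}.tar.gz  # delete the archive once untarred
```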

Done installing!

Special thanks to Edje Electronics for sharing their wisdom on setup, an indispensable resource for my own setup and code.

Once I got TensorFlow running, I was able to run object recognition (with the provided sample code) on MobileNet at 1 to 3 frames per second.

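The sample loop is essentially: grab a frame, run the frozen graph, keep only detections above a confidence threshold. A minimal sketch of that filtering step (the helper name and the 0.5 threshold are my own, not from the sample code):

```python
def filter_detections(boxes, scores, classes, min_score=0.5):
    """Keep only detections whose confidence clears min_score."""
    keep = [i for i, s in enumerate(scores) if s >= min_score]
    return ([boxes[i] for i in keep],
            [scores[i] for i in keep],
            [classes[i] for i in keep])

# Example with fake detector output: two confident boxes, one noise box.
boxes = [(0.1, 0.1, 0.4, 0.4), (0.5, 0.5, 0.9, 0.9), (0.0, 0.0, 0.1, 0.1)]
scores = [0.91, 0.62, 0.12]
classes = [1, 17, 3]  # class ids as the detector returns them
b, s, c = filter_detections(boxes, scores, classes)
print(len(b))  # 2
```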

Vanilla Pi Results

For basic detection, 1 to 3 frames per second isn’t bad. Removing the GUI or lowering the camera input quality speeds up detection. This means the Pi alone could serve well for simple detection tasks. What a great baseline! Let’s see if we can make it better with the tools available.

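When comparing numbers like these across setups, it helps to measure FPS the same way everywhere: time whole loop iterations, not just inference. A small sketch of what I mean (the helper and the stand-in workload are mine, not from any of the frameworks):

```python
import time

def measure_fps(process_frame, n_frames=100):
    """Average end-to-end FPS over n_frames calls of process_frame()."""
    start = time.perf_counter()
    for _ in range(n_frames):
        process_frame()  # capture + inference + drawing all count
    elapsed = time.perf_counter() - start
    return n_frames / elapsed

# Stand-in workload instead of a real capture/inference step:
fps = measure_fps(lambda: time.sleep(0.001), n_frames=50)
print(f"{fps:.1f} FPS")  # bounded above by ~1000 for a 1 ms frame
```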

Intel’s Neural Compute Stick 2

This concept excites me. For those of us without GPUs readily available, running neural networks on the edge instead of the cloud, and bringing that speed to the Raspberry Pi, is just exciting. I missed the original stick, the “Movidius”, but from this graph, it looks like I chose a great time to buy!

Setup

My Intel NCS2 arrived quickly and I enjoyed unboxing actual hardware for accelerating my training. That was probably the last moment I was excited.

Firstly, the USB stick takes up a lot of space. You’ll want to get a cable to keep it away from the base.

That’s a little annoying but fine. The really annoying part was trying to get my NCS 2 working.

There are lots of tutorials for the NCS by third parties, and following them got me to a point where I thought the USB stick might be broken!

Everything I found on the NCS didn’t work (telling me the stick wasn’t plugged in!), and everything I found on NCS2 was pretty confusing. For a while, NCS2 didn’t even work on ARM processors!

After a lot of false trails, I finally found and began compiling C++ examples (sorry, Python) that only understood USB cameras (sorry, PiCam). Compiling the examples was painful. Often the entire Raspberry Pi would become unusable, and I’d have to reboot.

The whole onboarding experience was more painful than recompiling Tensorflow on the raw Pi. Fortunately, I got everything working!

The result!?

NCS2 Stick Results

6 to 8 frames per second… ARE YOU SERIOUS!? After all that?

It must be a mistake; let me run the perfcheck project.

10 frames per second…

From videos of the original NCS running Python, I saw around 10 fps… so where’s the 8x boost? Where’s the justification for $80 of hardware attached to a $40 device? To say I was let down by Intel’s NCS2 is an understatement. The user experience and final results were frustrating, to put it lightly.

Xnor.ai

Xnor.ai is a self-contained software solution for deploying fast and accurate deep learning models to low-cost devices. As many discrete logic enthusiasts might have noticed, Xnor is the logical complement of the bitwise XOR operator. If that doesn’t mean anything to you, that’s fine. Just know that the people who created the YOLO algorithm are alluding to the use of this inexpensive logical operator to compress complex 32-bit float computations down to 1-bit operations, keeping the work cheap enough for a plain CPU.

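To make the idea concrete, here is a toy sketch (my own illustration, not Xnor’s actual kernels) of how a dot product over +1/-1 values reduces to an XNOR plus a bit count:

```python
def binarize(xs):
    """Map real values to +1/-1, packed as bits (bit set means +1)."""
    return sum((1 << i) for i, x in enumerate(xs) if x >= 0)

def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors via XNOR and popcount."""
    matches = bin(~(a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
    return 2 * matches - n  # each match contributes +1, each mismatch -1

a = [0.3, -1.2, 0.7, 0.1]   # binarizes to +1, -1, +1, +1
b = [0.5, -0.4, -2.0, 0.9]  # binarizes to +1, -1, -1, +1
print(xnor_dot(binarize(a), binarize(b), 4))  # 2
```

Where a float dot product needs n multiply-adds, the binary version is one XNOR and one popcount over a whole word of weights, which is why this trick is so attractive on small CPUs.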

In theory, avoiding such complex calculations required by GPUs should speed up execution on edge devices. Let’s see if it works!

Setup

Setup was insanely easy. I had an object detection demo up and running in 5 minutes. 5 MINUTES!

The trick with Xnor.ai is that, much like the NCS2 Stick, the model is modified and optimized for the underlying hardware fabric. Unlike Intel’s haphazard setup, everything is wrapped in friendly Python (or C) code.

model = xnornet.Model.load_built_in()

That’s nice and simple.

But it means nothing if the performance isn’t there. Let’s load their object detection model.

Again, no complexity: they provide one model without an overlay and one with. Since the others (except for perfcheck on the NCS2) were measured with overlays, let’s use that.

Xnor.ai Results

JAW… DROPPING… PERFORMANCE. I not only get a stat on how fast inference could work, but I also get an overall FPS with my overlay that blew everything else out of the water.

OVER 12 FPS, and an inference speed over 34 FPS!?

This amazing throughput is achieved with no extra hardware purchase!? I’d call Xnor the winner at this point, but it seems a little too obvious.

I was able to heat up my device and open a browser in the background to get it down to 8+ FPS, but even then, it’s a clear winner!

The only negative I can give you on Xnor.ai is that I have no idea how much it costs. The Evaluation model has a limit of 13,500 inferences per startup.

When I emailed them for pricing, I learned they are just breaking into non-commercial use, so they haven’t created a pricing system yet. Fortunately, the evaluation model should be fine for most hobbyists and prototypes.

In Summary

If you need to take a variety of models into account, you might be just fine setting up your Raspberry Pi from scratch. That makes it a great resource for testing new models and really customizing your experience.

When you’re ready to ship, there’s no doubt that both the NCS2 and the Xnor.ai frameworks speed things up. There’s also no doubt that Xnor.ai outperformed the NCS2 in both onboarding and performance. I’m not sure what Xnor.ai’s pricing model is, but that would be the final factor for what is clearly a superior framework.

Post-Publish Updates

This is an excellent blog post on setting up the NCS2:

Getting Started with the Intel Neural Compute Stick 2 and the Raspberry Pi: getting started with Intel’s Movidius hardware (medium.com)

Additionally, if you’re looking to play around with Xnor.ai, the link is www.xnor.ai/ai2go

Gant Laborde is Chief Technology Strategist at Infinite Red, a published author, adjunct professor, worldwide public speaker, and mad scientist in training. Clap/follow/tweet or visit him at a conference.

Expect more awesome edge blog posts coming soon!

Have a moment? Read more by Gant

Avoid Nightmares — NSFW JS: client-side indecent content checking for the soul (shift.infinite.red)

Translated from: https://www.freecodecamp.org/news/perf-machine-learning-on-rasp-pi-51101d03dba2/
